Documento1

From Grupos de trabajo Recolecta

Document provided by Alicia López Medina. Source: Alma Swan, Key Perspectives



Usage reporting and metrics systems

Repository/article level usage reporting

Google Analytics

http://www.google.com/analytics/

Google's package for logging usage (page views) of all pages in a website. Not specifically developed for repositories, but a number of repositories use it for reporting usage.

Piwik

http://piwik.org/

Website usage analytical package. Aims to be a rival to Google Analytics.

AWStats

http://awstats.sourceforge.net/

An open source web logfile analyser, measuring page views. Not specifically developed for repositories, but a number of repositories use it for reporting usage.

DSpace Statistics

http://www.eigenfactor.org/

IRS (Interoperable Repository Statistics)

http://irs.eprints.org

Developed in Southampton, usable on EPrints and DSpace repositories. JISC-funded project: output is an API for collecting download data.

OA-Statistik

http://www.dini.de/projekte/oa-statistik/

A DINI-funded project with four university project partners. The project's aim is to develop a pilot service for the project partners. It focuses on infrastructural issues rather than developing metrics. The central German OA organisation, OA Network, will collaborate by carrying out various tasks, such as deduplication, and will cooperate with OA-Statistik to provide a common user interface for searching repositories and delivering usage statistics.

LogEc

http://logec.repec.org/about.htm

LogEc collects usage statistics from five collections of economics papers and merges them to produce a monthly report. It measures full-text downloads as well as page/abstract views.

Journal usage reporting

COUNTER

Project COUNTER established standard reporting formats for publishers to use when providing journal usage figures to libraries.

SUSHI

SUSHI (Standardized Usage Statistics Harvesting Initiative) is a NISO protocol (Z39.93) for the automated retrieval of COUNTER reports. It gives libraries the means to merge COUNTER data from different sources and so get a better overall picture.

Usage Factor

http://www.uksg.org/usagefactors

Joint effort of the United Kingdom Serials Group (UKSG) and COUNTER to establish a Journal Usage Factor based on COUNTER data.


Current projects

MESUR (Metrics from Scholarly Usage of Resources)

http://www.mesur.org/MESUR.html

MESUR is a Mellon-funded project carried out at Los Alamos National Laboratory (LANL); it was due to end in October 2008. It has already reported on two outputs:

  • A semantic model of the scholarly communication process
  • A large-scale semantic store that relates a wide range of bibliographic, citation and usage data

A further output (not yet reported) is to define and validate a range of usage-based metrics (to include both frequency-based and network-based metrics) and to produce a set of guidelines and recommendations.

eigenfactor.org

http://www.eigenfactor.org/

Although it is not based on usage data, this service is a good example of alternative metrics: it applies an eigenvector centrality measure to Thomson-ISI citation data.
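As a rough illustration of what an eigenvector centrality measure does, the sketch below runs power iteration on a tiny citation matrix. The journal names and citation counts are invented for the example, and the function name is hypothetical; the method published by eigenfactor.org involves further steps that are not reproduced here.

```python
def eigenvector_centrality(cites, iterations=100):
    """Power iteration on a citation matrix (illustrative sketch only).

    cites[i][j] = citations from journal j to journal i (made-up data).
    Returns one score per journal; the scores sum to 1.
    """
    n = len(cites)
    # Column-normalise so every citing journal hands out one unit of weight.
    col_sums = [sum(cites[i][j] for i in range(n)) for j in range(n)]
    norm = [[cites[i][j] / col_sums[j] for j in range(n)] for i in range(n)]

    scores = [1.0 / n] * n
    for _ in range(iterations):
        # A journal scores highly when highly scored journals cite it.
        new = [sum(norm[i][j] * scores[j] for j in range(n)) for i in range(n)]
        total = sum(new)
        scores = [x / total for x in new]
    return scores


journals = ["A", "B", "C"]   # hypothetical journals
cites = [
    [0, 4, 3],   # citations received by A (from A, B, C)
    [2, 0, 1],   # citations received by B
    [1, 1, 0],   # citations received by C
]
scores = eigenvector_centrality(cites)
ranking = sorted(zip(journals, scores), key=lambda p: p[1], reverse=True)
```

Unlike a plain citation count, the score of each journal depends on the scores of the journals citing it, which is why the calculation has to be iterated until it settles.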

PIRUS (Publisher and IR Usage Statistics)

http://www.jisc.ac.uk/whatwedo/programmes/pals3/pirus.aspx

PIRUS is a JISC-funded project ending in December 2008. Its aim is to develop COUNTER-compliant usage reports at the individual article level that can be implemented by any entity (publisher, aggregator, IR, etc.) that hosts online journal articles. Such reports will enable the usage of research outputs to be recorded, reported and consolidated at a global level in a standard way. The final report is not yet published.

ROAT (Repository Output Assessment Tools)

ROAT is a project funded by NII (National Institute of Informatics, Japan). Its aim is to develop a tool that produces COUNTER-compliant usage reports at the article level, so that repository output (downloads) can be assessed in a standard way. The tool should be applicable to almost all repository platforms (those built on Apache).


Notes

Most systems so far count downloads only, which can give a false picture with respect to usage (e.g. unintentional downloads - double-clicks, forgotten previous downloads, etc).
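One common way to reduce the effect of unintentional double-clicks is to ignore repeat requests for the same item from the same client within a short time window. The sketch below shows the idea; the 10-second window, the function name and the event layout are assumptions made for illustration, not values taken from any of the systems listed above.

```python
from datetime import datetime, timedelta


def filter_double_clicks(events, window_seconds=10):
    """Keep a download only if the same client has not requested the same
    item within the preceding `window_seconds` (assumed threshold).

    events: iterable of (client, item_id, timestamp) tuples.
    """
    window = timedelta(seconds=window_seconds)
    last_seen = {}   # (client, item_id) -> time of the most recent request
    kept = []
    for client, item, ts in sorted(events, key=lambda e: e[2]):
        key = (client, item)
        previous = last_seen.get(key)
        if previous is None or ts - previous > window:
            kept.append((client, item, ts))
        last_seen[key] = ts
    return kept


t0 = datetime(2008, 11, 1, 12, 0, 0)
events = [
    ("10.0.0.1", "item-42", t0),
    ("10.0.0.1", "item-42", t0 + timedelta(seconds=2)),   # double-click
    ("10.0.0.2", "item-42", t0 + timedelta(seconds=3)),   # different client
    ("10.0.0.1", "item-42", t0 + timedelta(minutes=5)),   # genuine return visit
]
counted = filter_double_clicks(events)
```

A filter of this kind addresses double-clicks only; it cannot tell whether a later repeat download is a forgotten earlier one or a genuine new use.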

Web statistics packages are page-aware, not item-aware, so it is difficult to attribute specific bibliographic information (e.g. an author) to the data.
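To make the page-aware versus item-aware distinction concrete, the sketch below parses web server log lines in Common Log Format, maps each requested URL to a repository item, and then credits the item's authors. The URL pattern `/items/<id>/`, the `ITEM_AUTHORS` lookup and the author names are all hypothetical; a real repository would need its own mapping from URLs to item metadata.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "method path protocol" status size
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\S+)'
)
# Hypothetical URL layout: /items/<item id>/<file name>
ITEM_PATH = re.compile(r'^/items/(\d+)/')

# Hypothetical item metadata, keyed by item id.
ITEM_AUTHORS = {"101": ["Author A"], "102": ["Author A", "Author B"]}


def downloads_per_author(lines):
    """Count successful item downloads per author from raw log lines."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue
        method, path, status = m.group(3), m.group(4), m.group(5)
        if method != "GET" or status != "200":
            continue
        item = ITEM_PATH.match(path)
        if not item:
            continue   # an ordinary page view, not tied to any item
        for author in ITEM_AUTHORS.get(item.group(1), []):
            counts[author] += 1
    return counts


sample = [
    '192.0.2.1 - - [01/Nov/2008:12:00:00 +0000] "GET /items/101/paper.pdf HTTP/1.1" 200 51234',
    '192.0.2.2 - - [01/Nov/2008:12:01:00 +0000] "GET /items/102/thesis.pdf HTTP/1.1" 200 90211',
    '192.0.2.3 - - [01/Nov/2008:12:02:00 +0000] "GET /about.html HTTP/1.1" 200 1024',
    '192.0.2.4 - - [01/Nov/2008:12:03:00 +0000] "GET /items/101/paper.pdf HTTP/1.1" 404 0',
]
per_author = downloads_per_author(sample)
```

The log itself only ever records a URL; everything bibliographic comes from the separate item lookup, which is the step a generic web statistics package does not have.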

Web log analysis systems may miss downloads (e.g. the institutional cache may deliver to a new requester without the request being re-logged; RSS feeds may be aggregated and cached so that requests are served by the aggregator and not by the IR).

No system has yet been devised to measure actual user behaviour. For example:

  • behaviour after downloading (e.g. read, read/discard, store without reading)
  • readers may use email alerts themselves as a current awareness system without interacting with the IR at all

Issues to flag

  • Granularity of data
  • Aggregation of data (articles to journal level, chapters to book level, authors to department level, etc)
  • Policies on sharing data (between IRs, between aggregators, etc)
  • Data audit and quality assurance; agreed standards
  • Aggregation of data from IRs with data from other sources (arXiv, PMC, publisher data)
  • Deduplication
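Several of these issues (aggregation across sources, roll-up to journal level, deduplication) can be illustrated with a small sketch. It assumes that every record carries a shared identifier, here a DOI, and that records for the same item from different sources should be merged; the record layout, the DOIs and the download counts are invented for the example.

```python
from collections import defaultdict

# Hypothetical article-level records reported by different sources.
records = [
    {"source": "IR", "doi": "10.1234/abc.1", "journal": "Journal X", "downloads": 40},
    {"source": "publisher", "doi": "10.1234/ABC.1 ", "journal": "Journal X", "downloads": 120},
    {"source": "IR", "doi": "10.1234/abc.2", "journal": "Journal X", "downloads": 15},
    {"source": "arXiv", "doi": "10.5678/xyz.9", "journal": "Journal Y", "downloads": 60},
]


def normalise_doi(doi):
    """Trim whitespace and lower-case so variant spellings match."""
    return doi.strip().lower()


def aggregate(records):
    """Merge records for the same item, then roll items up to journals."""
    per_item = defaultdict(int)
    item_journal = {}
    for rec in records:
        key = normalise_doi(rec["doi"])      # deduplication key
        per_item[key] += rec["downloads"]    # combine sources for one item
        item_journal.setdefault(key, rec["journal"])

    per_journal = defaultdict(int)
    for key, total in per_item.items():
        per_journal[item_journal[key]] += total
    return dict(per_item), dict(per_journal)


per_item, per_journal = aggregate(records)
```

In practice many repository items have no shared identifier, so deduplication usually has to fall back on matching titles, authors and dates, which is far less reliable than the exact match used here.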


Further information

Scholze, F. (2007) Measuring research impact in an open access environment. http://elib.uni-stuttgart.de/opus/volltexte/2007/3234/

Merk, C. and Windisch, N. K. (2008) JISC usage statistics review: final report. http://www.jisc.ac.uk/whatwedo/programmes/digitalrepositories2007/usagestatisticsreview.aspx

DRIVER Guidelines 2.0. Guidelines for content providers: exposing textual resources with OAI-PMH (annex 8, p131). http://www.driver-support.eu/documents/DRIVER_Guidelines_v2_Final_2008-11-13.pdf