An introduction to metrics and their use
Since the middle of the twentieth century, metrics have become an increasingly important part of both journal publishing and researcher evaluation. This is the first in a short series of posts on metrics, giving an overview of what they measure, why we use them, and some of their issues. I am addressing the metrics in order of age, starting with citations; future posts will discuss usage, article-level metrics, and altmetrics.
Historically, citation metrics have been the standard, and indeed the only, tool available to systematically evaluate journals and articles. This began in the 1960s, when Dr. Eugene Garfield's Institute for Scientific Information (ISI) used computers to create the first systematic, multilingual, and multidisciplinary abstracting database. Citation metrics really came to the fore in the 1970s, however, when the Impact Factor was first published as part of the Journal Citation Reports® (JCR). The Impact Factor was initially designed as an aid to collection management for librarians, but over time it has become a proxy for journal quality.
Conceptually, the Impact Factor is a very simple metric: the average number of citations received by articles in a journal within a given timeframe. In more detail, the 2012 Impact Factor is calculated as:

2012 Impact Factor = (citations received in 2012 to content published in 2010 and 2011) ÷ (number of articles and reviews published in 2010 and 2011)
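As a minimal sketch, the calculation is just a ratio; the numbers below are invented for illustration:

```python
# Hypothetical journal data (made-up figures for illustration only).
citations_2012_to_2010_2011 = 1500    # citations received in 2012 to 2010-11 content
articles_and_reviews_2010_2011 = 400  # citable items published in 2010 and 2011

impact_factor_2012 = citations_2012_to_2010_2011 / articles_and_reviews_2010_2011
print(f"2012 Impact Factor: {impact_factor_2012:.3f}")  # 3.750
```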
This simplicity creates a range of issues around the Impact Factor, and I will touch on three of the most important.
Distribution of citations
The Impact Factor is an arithmetic mean and makes no adjustment for the distribution of citations across articles, so a single highly cited article can have a major positive effect. The most extreme example relates to the article “A short history of SHELX,” published in Acta Crystallographica Section A in 2008, which has to date been cited more than 33,000 times. The journal's 2008 Impact Factor of 2.051 did not include citations to this article. Those citations were included in the 2009 Impact Factor, which jumped to 49.926, the second highest in the entire JCR for that year; the 2010 Impact Factor was similarly 54.333. By 2011 the article had fallen outside the two-year citation window, and the Impact Factor dropped back to 2.076. Even in a large, high-prestige journal such as Nature, the top 1% of articles accounted for 6.5% of the citations in the 2012 Impact Factor calculation.
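The sensitivity of an arithmetic mean to a single outlier is easy to demonstrate with invented numbers: one heavily cited article can dominate hundreds of typical ones.

```python
import statistics

# Invented citation counts for 100 articles in a two-year window:
# 99 typical articles plus one extreme outlier.
citations = [2] * 99 + [33000]

mean = statistics.mean(citations)
median = statistics.median(citations)
print(f"mean   = {mean:.2f}")  # 331.98 -- dominated by the single outlier
print(f"median = {median}")    # 2 -- what a 'typical' article receives
```

The mean here says very little about how often an ordinary article in the journal is cited, which is exactly the problem with reading the Impact Factor as a per-article quality measure.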
Source vs. non-source items
The methodology used by Thomson Reuters to compile the JCR and Impact Factors dates from the 1970s: citations are matched only on journal title, year, and volume, and no other information is used. This means the JCR cannot distinguish between citations to articles, reviews, or editorials. So that the Impact Factor does not penalize journals for publishing infrequently cited content such as book reviews, editorials, and news items, these article types are not counted in the denominator of the Impact Factor; citations to them, however, are still counted in the numerator.
Two issues stem from this. Firstly, the classification of content is subjective: the same kind of content is not treated the same way in every journal, and although Thomson Reuters provides guidance on how they decide what counts, content such as extended abstracts and author commentaries falls into a gray area. Secondly, because citations to these non-source items have no offsetting entry in the denominator, they are effectively "free" and inflate the Impact Factor.
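A toy calculation (with invented figures) shows how citations to non-source items inflate the metric: they enter the numerator but add nothing to the denominator.

```python
# Invented figures for illustration only.
citations_to_articles_reviews = 1500  # counted in the numerator
citations_to_editorials_news = 300    # also counted in the numerator ("free" citations)
source_items = 400                    # only articles and reviews count here

if_without_free = citations_to_articles_reviews / source_items
if_with_free = (citations_to_articles_reviews + citations_to_editorials_news) / source_items
print(f"Without non-source citations: {if_without_free:.3f}")  # 3.750
print(f"With non-source citations:    {if_with_free:.3f}")     # 4.500
```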
Subject areas and research type
Different subject areas have very different citation patterns which are reflected in their Impact Factors. The aggregate Impact Factor of the cell biology category in the 2012 JCR was 5.734, for internal & general medicine it was 3.839, and for mathematics it was 0.716. In the social sciences several psychology categories have aggregate Impact Factors higher than 2, compared to history’s aggregate Impact Factor of 0.344.
This does not mean that cell biology research is better than medicine or history; the differences merely reflect the differing citation patterns, database coverage, and dominance of journals between the disciplines. Differences in Impact Factor also exist between basic, applied, practitioner, and educational journals. For more details on these issues please see Jo Cross’s “Impact Factors back to basics” Editors’ Bulletin article.
Partly because of some of these issues, a large number of other journal-based metrics have been created, based on either the Web of Science or the Scopus database. The details of the JCR-based metrics have been covered in “Citations and the Impact Factor,” and the Source Normalized Impact per Paper (SNIP) in “Are Impact Factors facing the SNIP?”
Table 1. Correlation coefficients of various metrics: 2012 JCR compared against 2012 SJR and 2012 SNIP
Table 1 shows the correlation coefficients of seven journal-level metrics based on either the Web of Science or Scopus. A score of 1 means that two metrics rank journals in exactly the same order; a score of zero means there is no correlation. The metrics fall into three groups: total cites and the Eigenfactor, which are not adjusted for journal size; the Impact Factor, 5-year Impact Factor, Article Influence Score, and SJR, which are adjusted for journal size; and finally the SNIP, which is only loosely correlated with the others.
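As an illustration of how such a comparison works (using made-up metric values, not the actual JCR or Scopus data), a Pearson correlation coefficient near 1 means two metrics order journals almost identically:

```python
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Made-up Impact Factors and 5-year Impact Factors for five journals.
impact_factor = [2.1, 5.7, 0.7, 3.8, 49.9]
five_year_if  = [2.3, 6.1, 0.8, 4.0, 45.2]

print(f"r = {pearson(impact_factor, five_year_if):.3f}")  # close to 1
```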
This means that if a journal has a high Impact Factor it is likely to perform well by the related metrics; likewise, if a journal has a high total citation count it is likely to do well by the Eigenfactor. These extra journal-level citation metrics do not provide much extra information; they simply give each journal another chance to claim to be top of its subject.