14.2.4 ISI suffers from document type classification problems

In the ISI Web of Knowledge each item is categorized into a particular "document type" category. Overall, there are nearly 40 different document types, but the most frequently used are: "article", "review", and "proceedings paper". This section deals with ISI's frequent misclassification of journal articles containing original research into the review or proceedings paper category.

The ISI Web of Knowledge does not provide a definition of any document type in its help file, but in various documents (e.g. the Journal Citation Reports Quick Reference Card), Thomson contrasts "review articles" with "original research articles". There is no commonly agreed definition of review articles and different disciplines might value them differently. However, in general parlance review articles are defined as articles that do not contain original data and simply collect, review and synthesize earlier research.

Thomson does not define proceedings papers either, but one can only assume them to be papers published in conference proceedings. Conference proceedings are a very common and respected outlet in some disciplines, such as computer science. However, in Business & Economics they are normally seen as mere stepping stones to future publication in a peer reviewed journal. The more prestigious conferences (such as the Academy of Management and the Academy of International Business) either do not publish proceedings or publish only short abstracted papers.

In general, in most of the Social Sciences neither review articles nor proceedings papers would be considered worthy of the quality stamp reserved for an original piece of research published in a peer reviewed journal.

Two examples of this categorization problem

I will give two specific examples in the field of Management to illustrate the extent of this categorization problem. First, Michael Lounsbury, co-editor of Organization Studies, has published seventeen articles in ISI listed journals. However, no fewer than ten of these seventeen articles are categorized as reviews (6) or proceedings papers (4), leaving him with a much less impressive seven pieces of original research (see left-hand picture below).

Two of Lounsbury's six "review papers" were published in The Academy of Management Journal, a journal that is well-known for only accepting papers that make a very strong original theoretical and empirical contribution. The other four papers were published in Strategic Management Journal, Organization, Organization Studies and Social Forces, all journals that would definitely not publish any articles that simply synthesized previous research. So why were these articles categorized as review papers?

Two of Lounsbury's four "proceedings papers" were published in the American Behavioral Scientist, whilst the other two appeared in Accounting, Organizations and Society and Journal of Management Studies. Clearly none of these journals would be categorized as collections of conference proceedings. So why were these articles categorized as proceedings papers?

[Pictures: two "Refine results" screenshots showing document type counts; left: Lounsbury, right: Coyle-Shapiro]

My second example concerns Jacqueline Coyle-Shapiro, Senior editor of Journal of Organizational Behavior. She has published 12 articles in ISI listed journals. However, half of them are categorized as proceedings papers (see right-hand picture above). These papers were published in the following journals: Journal of Vocational Behavior (twice), Journal of Applied Psychology, Journal of Organizational Behavior, and Journal of Management Studies (twice). As anyone in the field knows, none of these journals are collections of conference papers. So why were these articles categorized as proceedings papers?

Why does ISI categorize regular journal articles as proceedings papers?

The answer to this question presented itself in an FAQ (Why has the number of articles in the Web of Science gone down and the number of proceedings papers gone up) provided by Thomson Reuters. According to Thomson Reuters a Proceedings Paper is:

a document in a journal or book that notes the work was presented - in whole or in part - at a conference. This is a statement of the association of a work with a conference. Prior to October 2008, these items displayed as "Article" in the Web of Science product.

Indeed, when verifying the “proceedings papers” by Lounsbury and Coyle-Shapiro, I found that the acknowledgements in their articles carried innocent notes such as “A portion of this paper was presented at the annual meeting of the Academy of Management, San Diego, 1998” or “An earlier version of this paper was presented at the Annual Meeting of the Academy of Management, Chicago, 1999” or "This paper builds on and extends remarks and arguments made as part of a 2006 Keynote Address at the Interdisciplinary Perspectives on Accounting Conference held in Cardiff, UK". Most of these papers were published before 2008. Hence ISI seems to have changed these classifications retroactively.

So simply presenting an early version of your ideas in a 10-15 minute (or shorter) slot at a conference or workshop (some of the acknowledgments even referred to small workshops), perhaps attended by fewer than a dozen people, appears to mean that your paper is downgraded by ISI to a "conference proceedings paper" even though the conference in question doesn't even publish proceedings?

Does that mean that from 2008 onwards all of the papers published in these top journals are categorized as conference papers? No, this appears to happen only to those papers whose authors were honest enough to acknowledge that early versions of the paper had been presented at a conference, or to papers whose authors were kind enough to thank participants of a particular workshop for their input. A nice reward for being professional and collegial!
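To make the effect of such a rule concrete, here is a minimal sketch of a naive acknowledgement check. Thomson does not document how it detects the "association of a work with a conference", so the cue phrases and the function name below are my own assumptions for illustration, not ISI's actual procedure. The three sample notes are the ones quoted above.

```python
import re

# Hypothetical cue phrases: Thomson does not publish the wording it looks for.
CONFERENCE_CUES = re.compile(
    r"\b(presented at|conference|workshop|annual meeting|keynote address)\b",
    flags=re.IGNORECASE,
)


def mentions_conference(acknowledgement: str) -> bool:
    """Return True if the note contains any of the assumed cue phrases."""
    return CONFERENCE_CUES.search(acknowledgement) is not None


# Acknowledgement notes quoted earlier in this section.
notes = [
    "A portion of this paper was presented at the annual meeting of the "
    "Academy of Management, San Diego, 1998",
    "An earlier version of this paper was presented at the Annual Meeting "
    "of the Academy of Management, Chicago, 1999",
    "This paper builds on and extends remarks and arguments made as part of "
    "a 2006 Keynote Address at the Interdisciplinary Perspectives on "
    "Accounting Conference held in Cardiff, UK",
]

for note in notes:
    label = "proceedings paper" if mentions_conference(note) else "article"
    print(label, "<-", note[:45] + "...")
```

Under these assumed cues all three notes are flagged, whereas a paper that omits the courtesy acknowledgement would pass as a regular article.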

This categorization process also appears to show a rather limited understanding of the review process in top journals in the Social Sciences and Humanities. Yes, early versions of a paper might have been presented at conferences. However, the paper that is subsequently submitted to a journal will normally be vastly different from the paper that was earlier presented at a conference. Conferences and workshops are often used as a means to test and polish ideas. Even if authors submit fairly polished papers to conferences, these papers will still generally need to go through two to four rounds of revisions before they are accepted for the journal.

A longer and more extensive process of revision is likely for the many papers that are not accepted by the first journal approached. As acceptance rates of top journals in this field are well below 10%, the reality is that papers are often submitted to several journals before they get their first revise & resubmit. Maturation of the authors' ideas, reorientation toward different journals, and the review process itself mean that virtually every published paper has been substantially revised. Hence, the end-product published by a journal often bears very little resemblance to the paper that was originally presented at a conference, years before publication.

Why does ISI categorize original research articles as review papers?

With the conference proceedings problem "resolved", this leaves us with the puzzling review category. Why are papers that clearly present original research, published in the top journals in our field, categorized as derivative work that synthesizes work of other academics? According to Thomson: simply because they have more than 100 references! No, I am not joking. Thomson says:

In the JCR system any article containing more than 100 references is coded as a review. Articles in "review" sections of research or clinical journals are also coded as reviews, as are articles whose titles contain the word "review" or "overview."

When verifying this criterion for the articles published by Michael Lounsbury, I found Thomson to have applied their criteria absolutely as described. Lounsbury's 2001 Administrative Science Quarterly article has 95 references and is categorized in the "article" document type, thus acknowledging it is original research.

His 2004 article in Social Forces with 101 references is categorized in the "review" document type, even though the paper has sections titled “Theory and Hypotheses” and “Data and Methods”. In addition, the abstract and even the title clearly refer to empirical work. If this scholar wanted Thomson to recognize his work as original, maybe he should have been a bit less conscientious in identifying the contributions of other authors in his literature review?
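The rule as quoted from Thomson can be written down in a few lines. The following is a minimal sketch, assuming the three criteria are applied exactly as worded. The function name, the `in_review_section` flag and the whole-word title match are my own assumptions, and the titles are placeholders rather than the real article titles.

```python
import re

REFERENCE_THRESHOLD = 100  # "more than 100 references" in the quoted rule


def jcr_document_type(title: str, reference_count: int,
                      in_review_section: bool = False) -> str:
    """Return "review" or "article" under the rule as quoted from Thomson."""
    if reference_count > REFERENCE_THRESHOLD:
        return "review"
    if in_review_section:
        return "review"
    # Assumption: "contains the word" is read as a whole-word match.
    if re.search(r"\b(review|overview)\b", title, flags=re.IGNORECASE):
        return "review"
    return "article"


# Reference counts taken from the two Lounsbury examples in the text.
print(jcr_document_type("Placeholder title, 2001 ASQ paper", 95))
print(jcr_document_type("Placeholder title, 2004 Social Forces paper", 101))
```

Nothing in this rule looks at whether the paper contains hypotheses, data or methods; a difference of six references is enough to move a paper from one category to the other.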

Thomson does not list any particular rationale for why papers with more than 100 references should be considered to be review articles that do not contain original research. It is true that a “real” review article providing, for instance, a literature review of 30 years of publications in a particular field will tend to have many references.

However, the reverse certainly does not hold true: there are many papers with more than 100 references that are not review articles. One cannot presume that there is a direct relationship between the number of references contained in a paper and its level of originality. Thomson also does not provide any rationale for the seemingly arbitrary cut-off point. Perhaps Thomson simply saw 100 as a conveniently round figure?

If for some inexplicable reason one wanted to classify articles as review papers based on the number of references, one should at the very least relate them to the length of the paper. It is one thing to punish a 3-page paper with >100 references, it is quite something else to do the same for a 40-page article.
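As a rough illustration of what relating references to paper length would look like, the sketch below computes references per page for the two cases just mentioned. The function name and the choice of 101 references for both papers are assumptions for the sake of the example; I am not proposing any particular cut-off value.

```python
def references_per_page(reference_count: int, page_count: int) -> float:
    """Reference density: number of references divided by paper length."""
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    return reference_count / page_count


# Same raw reference count, very different densities.
short_paper = references_per_page(101, 3)
long_paper = references_per_page(101, 40)

print(f"3-page paper:  {short_paper:.1f} references per page")
print(f"40-page paper: {long_paper:.1f} references per page")
```

A raw count treats these two papers identically; a density measure separates them by more than a factor of ten.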

What caused the increase in proceedings papers and reviews?

This then brings us to our final question. Why has the number of papers categorized as proceedings papers and review articles increased over time? For proceedings papers, the answer to this question is very simple. In 2008 ISI integrated their separate conference proceedings database into the Web of Science. At that point in time many journal articles were retrospectively categorized as conference papers.

The reason for the increase in review papers is also fairly straightforward: the number of papers with more than 100 references has increased. If we look at one of the very top journals in Management, the Academy of Management Review, we find that whilst in 1990 only 1 out of the 31 published articles was categorized as a review, in 2008 no less than half of the 42 published articles were categorized as reviews (because they had more than 100 references).

With an average of 109 references per article (84 for papers categorized as articles and 134 for papers categorized as reviews) in 2008, it appears to be only a matter of time before ISI will consider none of the research published in AMR to be "original research work". That would be rather a shame as some of the most original and groundbreaking work in the field of Management is published in AMR.
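The three averages quoted here are consistent with each other, as a quick check shows. The sketch assumes that "half of the 42" means exactly 21 reviews and 21 articles; the variable names are mine.

```python
total_papers = 42
reviews = total_papers // 2          # assumption: "half" means exactly 21
articles = total_papers - reviews    # the remaining 21

mean_refs_articles = 84
mean_refs_reviews = 134

# Weighted average of the two group means.
overall_mean = (
    articles * mean_refs_articles + reviews * mean_refs_reviews
) / total_papers

print(overall_mean)  # 109.0
```

With equal group sizes the overall mean is simply the midpoint of 84 and 134, which is already above the 100-reference threshold.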

Of course there could be many reasons for the increasing number of references in articles, such as the increasing availability of relevant literature online, the increase in multi-disciplinary research, the ever-increasing rigour of the reviewing process (which is likely to make reviewers suggest additional bodies of literature that should be covered), and the increasing tendency for both reviewers and journal editors to ask for additional references to their own work or to that of the journal they review or edit for.

Whatever the reason, it is clear that classifying articles as review articles simply because they reached an arbitrary number of references is inappropriate. More disturbingly, it shows a very limited understanding of the research process in the Social Sciences and Humanities.