Health warning: Might contain multiple personalities

Shows how a lack of name disambiguation leads to serious distortion of the Essential Science Indicators


After discussing my first-ever academic article in my first research focus blog post, I would now like to present one of my most recent publications.

Are nearly all of the world's 1,000 most cited academics Chinese?

Like my first publication, it grew out of disbelief. This time I noticed that nearly all of the top 1,000 most cited academics in Thomson Reuters’ Essential Science Indicators seemed to be Asian, most of them Chinese. Although I knew that Chinese academics had been catching up rapidly in the “publication game”, I couldn’t quite believe that they had left scholars from all other countries behind. So I started to investigate.

I found that three demographic characteristics that should be unrelated to research productivity – name origin, uniqueness of one’s family name, and the number of initials used in publishing – in fact have a very strong influence on it. In contrast to what could be expected from Web of Science publication data, researchers with Asian names – and in particular Chinese and Korean names – appear to be far more productive than researchers with Western names.

The table below shows the extreme discrepancy between the overall distribution of papers across different countries/name origins and the representation of these name origins in the top 1,000 academics in Thomson Reuters’ Essential Science Indicators.

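Table 1: Representation of academics with different name origins

| Name origin | % of WoS papers at country level | % of top 1,000 ESI academics |
|-------------|----------------------------------|------------------------------|
| Anglo       | 35.3%                            | 1.1%                         |
| Chinese     | 11.6%                            | 62.8%                        |
| European    | 33.5%                            | 0.8%                         |
| Indian      | 2.7%                             | 5.2%                         |
| Japanese    | 5.0%                             | 15.4%                        |
| Korean      | 2.6%                             | 14.7%                        |
| Others      | 9.5%                             | 0.0%                         |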

Academics in Anglo or European countries account for a large share, nearly 70% in fact, of the papers published in the WoS. However, Anglo and European names make up less than 2% of the top 1,000 most productive academics. In contrast, academics in Asian countries, who contribute only 22% of the papers published in the WoS, make up more than 98% of the top 1,000 most productive academics. Furthermore, for any country, academics with common names and fewer initials also appear to be more productive than their more uniquely named counterparts.
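For readers who want to check the aggregate figures, here is a minimal Python sketch that adds up the Table 1 percentages per group. The grouping into "Anglo/European" and "Asian" and the names `TABLE_1`, `GROUPS`, `group_shares` and `summarize` are my own choices for this illustration; only the percentages come from the table.

```python
# Shares copied from Table 1:
# name origin -> (% of WoS papers at country level, % of top 1,000 ESI academics)
TABLE_1 = {
    "Anglo": (35.3, 1.1),
    "Chinese": (11.6, 62.8),
    "European": (33.5, 0.8),
    "Indian": (2.7, 5.2),
    "Japanese": (5.0, 15.4),
    "Korean": (2.6, 14.7),
    "Others": (9.5, 0.0),
}

# Assumption for this illustration: which name origins belong to which group.
GROUPS = {
    "Anglo/European": ["Anglo", "European"],
    "Asian": ["Chinese", "Indian", "Japanese", "Korean"],
}


def group_shares(table, members):
    """Sum both percentage columns over the given name origins."""
    papers = sum(table[m][0] for m in members)
    top = sum(table[m][1] for m in members)
    # Round to one decimal to match the precision of the table.
    return round(papers, 1), round(top, 1)


def summarize():
    """Return {group name: (% of papers, % of top 1,000)} for every group."""
    return {name: group_shares(TABLE_1, members) for name, members in GROUPS.items()}


if __name__ == "__main__":
    for name, (papers, top) in summarize().items():
        print(f"{name}: {papers}% of WoS papers, {top}% of top 1,000")
```

Running the script prints the two group totals, which can then be compared with the rounded figures quoted in the paragraph above.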

Y Wang: the world’s most productive academic

This appearance of high productivity is indeed nothing more than an appearance, caused purely by the fact that these “academic superstars” are in fact composites of many individual academics with the same name. Looking at the most productive academic in the ESI ranking – Y Wang – clearly shows how the namesake problem distorts the underlying reality.

Y Wang makes every other academic look like a slacker, having published around 30,000 papers in just 10 years, for an average of nearly 9 papers a day. Our most productive academic is also a true homo universalis; he/she published more than 100 papers in each of 73 distinct research areas, ranging from business economics to computer science, biochemistry to astronomy, oncology to materials science, and toxicology to mathematics. Moreover, Y Wang is affiliated with more than 500 different universities in nearly 100 different countries.

However, when we look at the actual list of publications, we discover the secret behind this incredible productivity. Y Wang suffers from a serious case of multiple personalities. It is almost impossible to establish how many academics called Y Wang were amalgamated to create this one academic superstar. However, based on our investigation of one of the more common Y Wangs – Yang Wang – alone, we estimate that Y Wang is likely to be a composite of thousands of academics.
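To make the namesake problem concrete, here is a small Python sketch of how a composite author can arise. It assumes an index that identifies authors only by family name plus first initial; that is a simplification for illustration, not a description of how Thomson Reuters actually processes its data. Apart from Yang Wang, all names and all records below are invented, and `index_key` and `merge_by_key` are hypothetical helpers.

```python
from collections import defaultdict


def index_key(full_name):
    """Reduce 'Given Family' to 'I Family' (first initial plus family name).

    Assumes the family name is the last whitespace-separated token.
    """
    *given, family = full_name.split()
    initial = given[0][0].upper() if given else ""
    return f"{initial} {family}".strip()


def merge_by_key(records):
    """Group (author, research area) records under the reduced key."""
    merged = defaultdict(lambda: {"authors": set(), "areas": set(), "papers": 0})
    for author, area in records:
        entry = merged[index_key(author)]
        entry["authors"].add(author)
        entry["areas"].add(area)
        entry["papers"] += 1
    return dict(merged)


# Hypothetical paper records, invented purely for illustration.
RECORDS = [
    ("Yang Wang", "oncology"),
    ("Yan Wang", "mathematics"),
    ("Ying Wang", "materials science"),
    ("Yu Wang", "astronomy"),
    ("Johanna Vanderkamp", "business economics"),
]

if __name__ == "__main__":
    for key, entry in merge_by_key(RECORDS).items():
        print(key, entry["papers"], sorted(entry["authors"]), sorted(entry["areas"]))
```

In this toy data set, four different people collapse into a single "Y Wang" who appears to publish in four unrelated research areas, while the rarer family name stays a single person. Adding a second initial to the key would separate some of these authors but not all of them, which is in line with the finding that academics with fewer initials appear more productive.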

Conclusion

We thus conclude that it is high time that Thomson Reuters started taking name disambiguation in general, and non-Anglophone names in particular, more seriously. What is urgently needed is not a “patch-up” by adding additional databases to cover “non-standard” publications. Science is a global enterprise and thus requires globally integrated coverage. We already argued before that Thomson Reuters seems to be misunderstanding the Social Sciences (Harzing, 2013b). In this paper we showed that Thomson Reuters also seems to have serious difficulty with non-Western names. Thomson Reuters’ Anglophone, Science-based view of the world might well have been tolerable in the past, but it has long ceased to be acceptable in the twenty-first century.

News coverage

Not surprisingly, the story elicited some interest from the general media. Here is the most prominent coverage in the Times and Science.

Drop me a line

A free pre-publication version of this paper is hyperlinked. If you’d like to have an official reprint, just drop me an email.
