The myth of the academic superstar - or why name disambiguation is crucial

Shows how a lack of name disambiguation leads to a serious distortion of the Essential Science Indicators ranking of the top-1% most cited academics

© Copyright 2025-2026 Anne-Wil Harzing. All rights reserved. First version, 19 January 2026

Introduction

What would you say if I told you that there are academics who publish 20-40 articles a year in journals listed in the Web of Science, not just as a one-off, but every year for a decade?

I guess your answer would depend on which discipline you are in. If you work in the Social Sciences or the Humanities, you will probably respond that such a sustained level of high productivity is unlikely without engaging in some form of fraudulent behaviour. If you work in the (Life) Sciences, you may conclude that the academics in question are workaholic directors of big research laboratories, or participants in one of the many huge research consortia churning out numerous papers with hundreds or even thousands of authors each.

But what would you say if I told you that some academics publish 20-40 articles every month, year after year? Regardless of your discipline, I suspect your answer would be that this is simply impossible without engaging in seriously fraudulent behaviour.

So, what if I told you some academics publish this much not in a year, not in a month, not in a week, but in a single day, every single day for a decade (or more)! And that their outlets were not shoddy predatory journals (see Strange journal invitations popping up in my inbox every day), but official Web of Science listed journals. Hopefully, you would have enough common sense to realise that even with a generous dose of GenAI support and engagement in serious academic misconduct this simply isn’t possible.

So how then is it possible that in the Web of Science Essential Science Indicators ranking of the world’s top-1% most cited academics:

  • no less than fifteen academics publish 20-40 articles a day, 365 days a year.
  • the top-100 academics all publish at least 7 articles a day, 365 days a year.
  • more than 2,000 academics publish at least an article a day, 365 days a year.
  • more than 11,000 academics publish at least 100 articles a year, every single year.

Very simple. Most of these academics are not individuals; they are amalgamations of several, and in some cases dozens or even hundreds, of academics with the same name. In the remainder of this white paper I will therefore refer to them as “academic entities”. The only question that remains is: why has nobody noticed this? I dive into this and related questions below; for easy reference, a table of contents follows.
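To make the mechanism concrete, here is a minimal Python sketch of how grouping publication records by surname plus first initial – the granularity at which such rankings aggregate authors – conflates distinct researchers into one “entity”. All names and fields below are invented for illustration.

```python
from collections import defaultdict

# Hypothetical publication records: (author full name, field).
# Names and fields are illustrative, not taken from the ESI data.
records = [
    ("Yang Wang", "materials science"),
    ("Yan Wang", "oncology"),
    ("Yi Wang", "mathematics"),
    ("Yu Wang", "computer science"),
    ("Anne-Wil Harzing", "business economics"),
]

def initial_key(full_name: str) -> str:
    """Collapse a full name to 'Surname Initial' -- the level of
    granularity at which initials-based rankings aggregate authors."""
    parts = full_name.split()
    return f"{parts[-1]} {parts[0][0]}"

entities = defaultdict(list)
for name, field in records:
    entities[initial_key(name)].append((name, field))

# Four distinct researchers collapse into the single entity "Wang Y".
print({k: len(v) for k, v in entities.items()})
# prints {'Wang Y': 4, 'Harzing A': 1}
```

Any database that keys authors this way will produce the same conflation, regardless of how carefully the underlying records are curated.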

Table of contents

Introduction

My 2015 research: Health warning, might contain multiple personalities

Enter… Google Scholar Profiles

Health warning revisited - Essential Science Indicators 2016-2025

Disciplinary focus: Economics & Business

Can I do anything to prevent disambiguation problems?

Conclusion

Need further help? 


My 2015 research: Health warning, might contain multiple personalities…

So... first off, why did nobody notice this? Well, actually… I did. I even published an article about this in Scientometrics more than a decade ago, in which I concluded that the vast majority of highly productive/cited academics in the Essential Science Indicators were in fact “entities” created by faulty amalgamation – caused by a lack of proper name disambiguation – of the records of dozens or even hundreds of individual academics with the same name and initial(s), called homonyms or namesakes. It even got quite a bit of press coverage (see coverage in The Times and Science below).

Below I will present the highlights of this 2015 article, focusing on the distinction between Asian and Western name origins. However, the article itself contains a wealth of other analyses that may be of interest to you.

Methods

I coded author names as Anglo, Chinese (incl. variants used in Singapore, Hong Kong, Taiwan, Malaysia etc.), Japanese, Korean, Indian sub-continent (India, Pakistan, Bangladesh), and Western European (Germanic, Dutch, Italian, French, Hispanic, Swedish, Danish). Names originating in other countries did not occur five or more times, the threshold for coding. It is possible that some of the academic entities named “Lee”, which I coded as Korean, are fully or partly Chinese. Lee is an alternative spelling of the Chinese name Li and is used in some countries outside mainland China where Chinese is spoken. This means that the number of Chinese academic entities – though already very high – might be even higher than reported.

Name origin coding was based purely on the academic’s name, using my in-depth knowledge of naming practices in different countries (see e.g. my 2001 article Who is in Charge?) and a variety of web-based tools. Obviously, the origin of an academic’s name does not necessarily coincide with their nationality; someone with a Chinese name for instance might well have British, Australian, North American or any other nationality. However, my interest in this white paper is not in nationality as such. I simply try to assess the extent to which different name origins suffer from the lack of name disambiguation.

Asian names

In my Scientometrics article, I demonstrated that almost the entire top-1,000 of the most productive academics in the world – according to the WoS Essential Science Indicators – consisted of East or South Asian academics, more specifically: Chinese [62.8%], Japanese [15.4%], Korean [14.7%], or Indian [5.8%]. Academics with Anglo or European names made up less than 2% of the top-1,000 most productive academics.

I also showed that the top-3 most productive academics (Y Wang, Y Zhang, and Y Li) had published an average of nearly 9 papers a day. A detailed investigation of Y Wang showed that:

“he/she published more than 100 papers in each of 73 distinct research areas, ranging from business economics to computer science, biochemistry to astronomy, oncology to materials science, and toxicology to mathematics. Y Wang is affiliated with more than 500 different universities in nearly 100 different countries.”

Moreover, I found that even when investigating a combination of full given name and last name – such as Yang Wang – the amalgamation of many individuals still occurs. Yang Wang:

“holds many appointments [even] within the same university. At Fudan University, he/she is affiliated with, to name just a few, the Department of Mechanical and Engineering Science, the Institute of Genetics, the Department of Neurosurgery, the Department of Anatomy, the Department of Urology, the Department of Pharmacology, the Department of Radiology and Oncology, the Department of Macromolecular Science and the School of Public Health.”

Western names

The problem of faulty amalgamation also occurred for common Anglo names such as Brown, Jones, and Smith, as well as for common Western European names such as Andersen/Anderson, Garcia, Jansen/Janson, Martin, Martinez, Rossi, and Schmidt/Schmid/Smit. However, this led to “academic entities” that displayed a productivity that was “only” 1.5-2 times as high as for academics with more unique names. In contrast, academic entities with Asian names had productivity levels that were 5-10 times higher than academics with more unique names, indicating that this group contained more academics per entity.

Within the group of East Asian academics, those with the most common Chinese (Zhang, Wang, Li) and Korean (Kim, Lee, Park) names were more than three times as productive as their country(wo)men with more unique names. As my recent update (see Health Warning revisited below) suggests, they were also up to thirty times as productive as academics with more unique names. I thus concluded that it is:

“very likely that most, if not all, of the academics called Wang (or Zhang or Li or Kim or Lee or Park) included in the ESI highly-cited list are in fact amalgamations of multiple academics”.

Caveat: hidden in a non-intuitive place in the helpfile

To be fair, Clarivate does acknowledge the namesake problem in the ESI helpfile under the heading “Name conflation”:

Authors having the same last name and initials may represent multiple individuals. This is especially likely in the case of common surnames. The ability to break the name by field may to some degree disambiguate person X in field Y from person X in field Z; however, keep in mind that a listed name can still represent more than one author within the same field.

In order to find this crucial caveat, however, an ESI user would first of all need to take the initiative to consult the helpfile, something most users of computer programs and websites rarely – if ever – do.

[I should know, as I have provided technical support to many thousands of the approximately 2 million users of the free Publish or Perish citation analysis software for the last 20 years. Most of their questions can be answered with a simple link to the helpfile/manual or FAQ, both linked prominently in the software. Moreover, in the survey I run for the software, only about 8% of the respondents say they use the helpfile/FAQ regularly. Even this is likely to be an overestimate of active helpfile use, as survey respondents are already a select group of pro-active users.]

The likelihood of any user consulting the ESI helpfile is further reduced by the location of the link to it. Rather than displaying a prominent link in the results area, the link is shown in a very small font size at the top right-hand corner of the page, next to the language choice, not an area that draws immediate attention.

Second, even if the ESI user did consult the helpfile, they would need to browse through it and read it systematically. The caveat is listed twice, once under the section “Citation Thresholds” and once under the section “Indicators”, not sections a naive reader would be likely to consult to find out more about name disambiguation. Within the latter section it is found in the subsection “Modify Results”, an even unlikelier place to look.

Third, even if the ESI user would somehow manage to discover the caveat, its wording is so “mild” that readers would be forgiven for interpreting it simply as a minor footnote, something that isn’t terribly relevant in practice.

The helpfile suggests that name conflation may be addressed by breaking down the data by field. However, when I drilled down to a disciplinary level in my 2015 article, I demonstrated that even when the data are segmented by discipline – in this case Chemistry, Physics, Medicine, and Business & Economics – the problem with a lack of name disambiguation persisted.

My plea to take name disambiguation more seriously

I ended the article with the plea for Thomson Reuters to take name disambiguation of non-Western names more seriously and indicated that, with the increasing number of publications by academics with Chinese names, the namesake problem was only likely to get worse.

The year after publication of my 2015 article, the Web of Science was taken over by Clarivate. Clarivate is much more active than Thomson Reuters was in improving its product offerings and generally takes feedback from the community seriously. Hence, I assumed that they had tackled this issue too and thought no more about it for more than a decade.

Enter… Google Scholar Profiles

But then, just before the 2025 Christmas break, my Times Higher Education (THE) email alert displayed an article which scolded Google Scholar for leaving the cleaning of author profiles to individual academics and blamed the service for distorting “h-index leaderboards”. [Note: it is unclear what the THE author meant by “skewing h-index leaderboards”, as Google Scholar doesn’t actually produce rankings of individual academics.]

I have a longstanding interest in both rankings and Google Scholar (see: To rank or not to rank). Hence, I am all too aware that Google Scholar is not a curated academic database (see: Is Google Scholar flawless? Of course not! and Sacrifice a little accuracy for a lot more comprehensive coverage). I also know that errors occur, and that some of these errors might distort rankings, so drawing attention to this occasionally can be useful.

However, it struck me as rather odd to imply that:

  • reliance on (lack of) individual cleaning is the (only) cause for name conflation.
  • Google Scholar is unique in conflating academics in their author profiles.
  • Google Scholar is unique in leaving the cleaning of citation profiles up to individual academics.

None of these are true. Both Scopus and the Web of Science – the two other main providers of citation data – create automatic author profiles. These profiles suffer from name conflation too, in the case of the Essential Science Indicators dramatically so. They too leave it (partly) up to individual academics to correct these errors. It thus seems rather unfair to make such a fanfare about a few examples of name conflation in Google Scholar, a free service.

Moreover, unlike the automatically generated author profiles in Scopus and the Web of Science, Google Scholar profiles are created by individuals. To me it therefore makes perfect sense to leave the cleaning of these profiles up to individuals too. If you create a Google Scholar profile, it is up to you to keep it clean!

Fortunately, keeping your Google Scholar Profile free of publications by your namesakes is trivially easy. It can even be done without having to rely on error reports to data providers. Just take 20 seconds to log in to Google Scholar and put additions to your profile on manual. The screenshots below illustrate how to do this.

Putting your profile updates on manual means that you will get an email whenever Google Scholar thinks they have found a new article for you. Google Scholar will only add it to your profile if you approve it. You can find a more detailed discussion of the process here:

So, whilst I wasn’t terribly impressed by the THE article, it did prompt me to have another look at the Essential Science Indicators – more than ten years after I first did so – to see whether Clarivate had resolved the earlier problems with name disambiguation. As you may have already guessed from my introduction, they did not. For the full story, we will dive a bit deeper below.

Health warning revisited -
Essential Science Indicators 2016-2025

My new analysis was based on the December 2025 version of the Essential Science Indicators. This featured nearly 110,000 researchers in the top-1% most cited academics, compared to just over 80,000 ten years earlier, an increase of around 33%. However, the number of papers published has grown dramatically in the last decade. Whereas in 2015 the top-1% most cited academics published nearly 19 million papers, in 2025 this had increased to 54.5 million, a nearly three-fold increase.

This growth was even starker in the top-100, where the combined number of published papers had increased from 1.16 million to 4.58 million, a nearly four-fold increase. Yes, you are reading this correctly: a mere 100 “academics” [or rather “academic entities”] collectively published more than 4.5 million papers in the past decade.

As a result, the concentration of papers in the top-100 has increased. In 2015, the top-100 researchers made up 0.12% of the researchers in the Essential Science Indicators but accounted for 6.1% of the papers published. In 2025, a mere 0.09% of the researchers in the Essential Science Indicators were responsible for 8.7% of the papers published.

Methods: Ranking by papers vs citations

With regard to name coding, I followed the same principles as for my 2015 article. Here I just add that although the Essential Science Indicators ranking is based on the number of citations, my analyses of researcher productivity were conducted by sorting the list on the number of papers published. However, this changed the composition of the top-100 and top-1,000 only marginally.

The top-100 most productive academics included only six academics who were not in the top-100 on citations; four of them ranked between #101 and #105 on citations, the remaining two ranked #113 and #118. Likewise, the top-1,000 most productive academics included just over 100 academics who were not in the top-1,000 for citations; all but three of these, however, were in the top-1,500 most cited academics.
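The re-sorting described above amounts to ordering the same records by a different column; a toy Python sketch (with invented figures, not actual ESI records) of why the two orderings overlap heavily but not perfectly:

```python
# Invented (name, papers, citations) records; not actual ESI data.
authors = [
    ("A", 500, 90_000),
    ("B", 450, 95_000),
    ("C", 400, 40_000),
    ("D", 100, 99_000),
]

# The ESI ranks by citations; the productivity analysis re-sorts by papers.
by_citations = sorted(authors, key=lambda a: a[2], reverse=True)
by_papers = sorted(authors, key=lambda a: a[1], reverse=True)

# The two orderings share most members near the top, but a highly cited
# author with few papers (D) drops out of the productivity top-2.
top2_overlap = {a[0] for a in by_citations[:2]} & {a[0] for a in by_papers[:2]}
print(top2_overlap)  # prints {'B'}
```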

Are Chinese academics still topping the ranks?

In my 2015 article, I predicted that – without addressing the lack of name disambiguation – the problem of amalgamated entities was only likely to get worse over the years as the number of publishing academics with Chinese names continued to increase. It certainly did! Whereas in 2006-2015, our top-scorer Y Wang published a “mere” 9 articles a day, in the December 2025 ranking, he/she publishes a staggering 40 articles a day.

In fact, assuming a severely workaholic academic who works 8 hours every single day for 10 years – no time off for weekends, birthdays, weddings, childbirths, let alone holidays – even those hovering in the lower regions of the top-100 still produce nearly an article every hour of the day, with the top dogs publishing five articles an hour. And remember that’s on top of their teaching, administrative, supervision, and other academic duties!
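As a back-of-the-envelope check of these figures, using the hypothetical no-days-off, 8-hours-a-day schedule described above:

```python
# Hypothetical workaholic schedule from the text: 8 working hours a day,
# 365 days a year, no time off.
HOURS_PER_DAY = 8

top_per_hour = 40 / HOURS_PER_DAY   # the most "productive" entities: 40 papers/day
low_per_hour = 7 / HOURS_PER_DAY    # the bottom of the top-100: 7 papers/day

print(top_per_hour, round(low_per_hour, 2))  # prints 5.0 0.88
```

Five papers an hour at the top, and just under one an hour at the bottom of the top-100, exactly as claimed above.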

In 2015, I indicated that the top-3 academics “Y Wang, Y Zhang and Y Li make every other academic look like a slacker.” In 2025, our updated Y Wang, Y Zhang and Y Li academic entities make their old incarnations look like slackers! Even the entities that merely share these last names, of which there are no less than fifty-one in the top-100 most productive academics, now exceed the 2015 top-3 performers with an average of 12 papers a day.

Nationality composition of the top-100 and top-1,000 in 2025

Back in 2015, the top-100 most productive academic entities showed predominantly Chinese name origins, and 40% of the top-100 was called Wang, Zhang or Li. As Table 1 shows, this trend intensified in 2025. In 2025, no less than 92 academic entities in the top-100 have names of a Chinese origin, with a mere six Korean entities (four Kims and two Lees) and two Indian entities (two Kumars) completing the top-100. The Wang, Zhang, and Li academic entities now make up well over half of the top-100.

Table 1: Name origin of the top-100 most productive academic entities

The situation isn’t much different for the top-1,000. As Table 2 shows, the concentration of academic entities with Chinese name origins in the top-1,000 has increased too, growing from 62.8% to a staggering 87.8%. And in 2025, no less than a third of the top-1,000 most productive academic entities were labelled Wang, Zhang, or Li.

Table 2: Name origin of the top-1000 most productive academic entities*


* The remaining 0.2% in 2025 is ambiguous, as the name Ali could be of Arabic origin or belong to a Muslim academic from Pakistan or India.

The growing number of academic entities with Chinese name origins has suppressed the proportion of academic entities with Indian and Korean names in the top-1,000. For the latter, the number of entities dropped from 147 to 60. Moreover, whereas in 2015 seven different Korean names were represented, all Korean academic entities are now called Kim, Lee or Park. This is not entirely surprising given that half of all Koreans have one of these three family names.

The biggest drop [from 15.4% to 2.1%] came for academic entities with Japanese name origins of which there are now only 21 in the top-1,000, all ranking outside the top-500 with fourteen ranking below the top-800.

However, perhaps the most striking change is that whereas in 2015 there were still some academic entities with Anglo or European name origins in the top-1,000, in 2025 these name origins have disappeared completely. The first European (M Schmidt) and Anglo (J Smith) names are found at #1,668 and #1,738 respectively; most likely amalgamations of many M Schmidts and many J Smiths.

Impact of amalgamation: quantity over quality

So, what is the impact of this amalgamation – beyond making the first ten thousand entries of the Essential Science Indicators ranking of academics a bit of a farce? As the comparison of academics with Chinese name origins and non-Chinese origins in Table 3 shows, those with Chinese name origins on average produce far more articles.


Table 3: Comparing the average number of papers/citations per paper for various groups in the full sample of 110,000 academics

Whereas the non-Chinese academics in ESI’s top-1% most-cited academics on average produce 24 articles a year, a very high but still achievable number in some disciplines, Chinese academics – or rather academic entities – produce more than seven times as many. However, these articles are, on average, poorly cited. This is because many – if not most – Chinese entries are amalgamations of one or a few high-performing academics with scores of low-performing academics. In fact, the most highly ranked might even be amalgamations of hundreds or even thousands of low-performing academics.

Note that the namesake problem isn’t limited to names of Chinese origin. A comparison of the three most common Chinese, Korean, Indian, and Anglo names shows that each of these groups produces more articles than the full sample, but this tendency is highest for Chinese names and lowest for Anglo names. This indicates that Anglo names suffer fewer cases of inappropriate amalgamation than Asian names. However, for each of these groups the average citations per paper are much lower than for the full sample, indicating that inappropriate amalgamations are very likely to be present in all of them.

Disciplinary focus: Economics & Business

As I did in my 2015 article, I decided to verify whether focusing on a single discipline would eliminate or at least dramatically reduce the lack of author disambiguation. I decided to focus on Economics & Business for two reasons. First, in my 2015 article it was the discipline with the most varied set of name origins in the top-1,000; traditionally East Asians have not featured prominently in the field. Second, this is my own discipline and that of many of my readers.

There were just over 3,000 academics listed in the top-1% most cited academics in the Web of Science for Economics & Business. On average they published 5 papers per year, which received 90 citations each (see Table 4). These numbers appear far more reasonable than those for the full sample. Hence, at first glance name disambiguation appears to be less problematic after disciplinary disaggregation.

Table 4: Papers per year and citation per paper for full sample, top-1,000, and top-100

However, these figures were quite different for the top-100/1,000 most productive academic entities. As Table 4 shows, academics in the Economics & Business top-1,000 on average published nearly a dozen papers a year, every year, while for the top-100 the average was no less than 34 papers a year, every single year.

What is more striking, however, is that these articles are not very highly cited. Whereas, for the overall sample, papers received an average of 90 citations, for the top-1,000 this was only 21.2 and for the top-100 this declined to 16.1. If we take out the top-1,000 from the sample, the number of papers published declines to 2.3 per year and the number of citations increases to 124 per paper, much more reflective of what we would expect from highly productive and highly cited academics in this field.

As before, the reason for the large number of papers combined with low citation levels per paper lies in the fact that most academics in the higher regions of the ranking are not individuals, but academic entities that are composed of multiple individuals conflated into one academic super-star, a problem that is compounded for Asian names.

Table 5 shows the name origin of the most productive academics in Economics & Business. The majority of academics in the top-100 are Chinese, more than three quarters in fact. The remainder are either Korean (all named Kim, Lee, or Park) or Indian. Name conflation seems to be worst for Koreans, reflecting the dominance of these three family names in Korea.

When looking at the top-1,000, the dominance of Chinese name origins is still present with nearly two thirds of the entities in this category. The proportion of Korean entities has declined as most Koreans in the list were already amalgamated in the Kim, Lee or Park containers in the top-100.

Table 5: Name origin of top-100/1,000 most productive entities in Economics & Business

However, the proportion of Indian academics in the top-1,000 has increased when compared to the top-100. Some of these – especially the various Singhs, Guptas, Kumars, Patels, and Khans – will no doubt be amalgamations of multiple academics. Even so, when compared to Chinese and Korean name origins, the lower paper count and higher citation count for Indian names suggest that some are individual academics. The field of Economics & Business has traditionally featured many very accomplished Indian academics, typically working at US universities.

We find much smaller proportions of Western European [8.5%], Anglo [5.3%] and Arabic [2.5%] name origins in the top-1,000, with 4.5% for the “other” category, which includes smatterings of Eastern European, South East Asian, and African name origins. Compared with the Chinese and Korean names, these four categories all combine lower paper counts with higher citation counts, suggesting name conflation is less problematic in these groups. That said, it will no doubt still be present for entities such as J Smith and H Nguyen.

In sum, even though the name disambiguation problem is less prominent when focusing on individual disciplines such as Economics & Business, the top-100 is still likely to contain many entities that are composed of multiple academics. A substantial proportion of the top-1,000 is also likely to be conflated. As Economics & Business had the most varied set of name origins in my 2015 article, the problem is likely to be worse in other disciplines.

Scopus top 2% most cited

Triangulation across data sources is crucial for reliable bibliometric research. So, I decided to triangulate the ESI disciplinary data with the Scopus 2% most cited academics in Economics & Business. For more information on this ranking see: Highly cited academics in Business & Management over the years.

This ranking applies a far more sophisticated basket of indicators, including, for example, citations to single-authored or first/last-authored work, and offers the option to exclude self-citations. It focuses on the academic’s entire career, not just the last 10 years, and includes citations from the last 30-odd years. However, it also provides the total number of papers published, the citations to them, and the start and end years of publishing for each entry. It thus allows us to make an approximate comparison with the ESI data for both yearly publication volume and citations per paper.

Here I use the latest version of the Scopus ranking, published in August 2025. As academics are listed with full given names in this ranking, name disambiguation is likely to be much less of a problem than in the ESI ranking, which uses initials only. We would thus expect far less extreme differences in terms of the number of papers published and citations per paper between the full sample on the one hand and the top-100/1,000 on the other.
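The difference between the two keying schemes – initials only in the ESI versus full given names in the Scopus ranking – can be sketched in Python as follows (names invented for illustration):

```python
# Invented author names on three papers: two distinct researchers,
# one of whom ("Yang Wang") wrote two of the papers.
papers_by = ["Yang Wang", "Yan Wang", "Yang Wang"]

def esi_style_key(name: str) -> str:
    """Surname plus first initial, as in the ESI ranking."""
    first, last = name.split()
    return f"{last} {first[0]}"

esi_entities = {esi_style_key(n) for n in papers_by}   # everything collapses
scopus_entities = set(papers_by)                       # full names kept apart

print(len(esi_entities), len(scopus_entities))  # prints 1 2
```

Full given names do not eliminate conflation entirely (two people can share a full name), but they produce far fewer collisions than initials-only keys.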

The Scopus ranking lists 2,682 academics in Economics & Business. As Table 6 shows, on average they publish around 3 papers per year, and these papers are cited around 103 times each. Although this means fewer papers and more citations per paper than for the ESI ranking, they are not worlds apart. Metrics for the bottom two thirds of the two rankings – shown on the last line of the table – are even more similar, which is remarkable given the differences in methodology.

Table 6: Comparing the ESI and Scopus ranking on papers/year and citations/paper

However, the top-100 and top-1,000 are quite different between the two rankings. The ESI ranking displays a systematically larger number of papers per year in the top of the rankings and a systematically lower number of citations per paper. Both are indicative of significant amalgamation of academics in the higher regions of the ranking. This is accompanied by a dramatic difference in name origins between the two rankings as shown in Table 7.

Table 7: Name origin of the top-100/1,000 most productive academics in Economics & Business [ESI vs. Scopus]

First, whereas in the ESI ranking 100% of the academic entities in the top-100 are Asian (Chinese, Korean, or Indian), in the Scopus ranking the vast majority of the academics in the top-100 are Anglo or Western European. Only 17% are Chinese, Indian, or Arabic. There are no Koreans at all in the top-100, even though the ESI ranking featured 16% Koreans in the top-100. This will come as no surprise to academics working in the field of Economics & Business as the field hasn’t attracted a lot of Korean academics to date.

Similar to the ESI ranking, the proportion of Chinese, Korean, Indian, and Arabic name origins is lower in the Scopus top-1,000 than it is in the Scopus top-100. This means an even larger majority of academics in the top-1,000 are Anglo or Western European. In total, less than 12% of the academics in the top-1,000 in the Scopus ranking are of Chinese, Indian, Arabic, or Korean name origin, compared to more than 80% in the ESI ranking.

Scopus profiles are by no means perfect and name conflation might still occur on the margins, but they clearly present a more realistic picture of the top-1%/2% most cited academics in the field of Economics & Business than the Essential Science Indicators do.

Can I do anything to prevent disambiguation problems?

Yes, you can! As an individual academic you can help by at least monitoring your Google Scholar Profile, which everyone can create without a paid subscription. Doing so can be considered an ethical obligation for researchers. There might be valid reasons for not creating various online profiles, such as LinkedIn, ResearchGate, ORCID, and Google Scholar Profiles. However, once you have created such a profile, I would argue that it is your responsibility to ensure it is accurate and free of errors.

With regard to Google Scholar profiles this means ensuring that the publications listed on your profile are both complete and accurate. Just like you would not list non-existent degrees or job experiences on your LinkedIn profile, it is an ethical obligation to ensure that your Google Scholar Profile only lists publications and citations that are yours. If you need help with this, refer to the section on Google Scholar Profiles and the section Need further help.

Not everyone has access to Scopus and/or the Web of Science; many universities cannot afford Elsevier's and Clarivate's subscription fees. However, if you do, you may also want to verify that your author profiles in these data sources are correct and up-to-date. For the Web of Science, you will need to claim your profile before you can make changes. Like your Google Scholar Profile, these profiles also provide metrics that may be of interest to you.


Conclusion

In this white paper, I showed that name disambiguation is crucial for reliable bibliometric analyses and especially for rankings of individual academics. This is true for every data source. The Essential Science Indicators ranking of individual academics, however, seems to suffer from a particularly serious lack of name disambiguation for Chinese and Korean name origins. So again, let me call on Clarivate to fix the name conflation problem in their Essential Science Indicators ranking. I am sure it is possible if you put your mind and resources to it!

As Qiu (2008) indicates, a lack of name disambiguation might have serious career-limiting consequences for Asian academics. As it is difficult to uniquely identify Asian authors, they are less likely to be asked to be reviewers or editorial board members, or to participate in collaborative research projects. This makes it hard for Asian academics to compete on an equal footing with academics with more unique names.

In the context of the Essential Science Indicators ranking, the namesake problem also disadvantages academics with unique names, who cannot “compete” with super-star academics created by the amalgamation of many namesakes. They are thus ranked much lower in the list of highly cited authors than they should be, have dropped off the list as the number of Asian name entities increased, or were never ranked in the list at all, despite being very highly cited.

So, before we start criticising other data sources – such as Google Scholar – for their lack of accuracy, be aware that a lack of name disambiguation is a general problem shared to some extent by all data sources. Hence, wherever possible, triangulate data across data sources, as I have done in this white paper with Scopus data for Economics & Business. I have also applied triangulation more extensively in several of my own published articles, including my most highly cited article.

Need further help?

The following resources provide you with more detail about Google Scholar as a data source and demonstrate that correct author attribution is a problem in any data source.

If you need help in keeping your Google Scholar Profile clean, would like to enhance your Google Scholar Profile, or start using it to keep up-to-date with research in your field, these resources might be useful.

References

  • Harzing, A.W. (2001). Who’s in charge? An empirical study of executive staffing practices in foreign subsidiaries. Human Resource Management, 40(2): 139–158.
  • Harzing, A.W. (2015). Health warning: Might contain multiple personalities. The problem of homonyms in Thomson Reuters Essential Science Indicators. Scientometrics, 105(3): 2259–2270.
  • Harzing, A.W.; Alakangas, S. (2016). Google Scholar, Scopus and the Web of Science: A longitudinal and cross-disciplinary comparison. Scientometrics, 106(2): 787–804.
  • Qiu, J. (2008). Scientific publishing: Identity crisis. Nature News, 451(7180): 766–767.
