Looking for John Smith? About author disambiguation
Judging by the email support requests I receive and the responses to the Publish or Perish survey, the most common challenge that users experience is getting a "clean" publication record for their author of interest. Now this isn't so much of a problem if you are called Anne-Wil Harzing, Michael Zyphur or Marcel Wissenburg, but it is a problem if you are called Michelle Brown, Prakash Singh or John Smith. You are likely to have many namesakes in academia.
Author disambiguation in ISI
Please note, however, this is not a problem that just occurs with Google Scholar and Publish or Perish. Even commercial databases with very high subscription fees like Thomson Reuters Web of Science (ISI) have problems with this, see e.g.:
- Harzing, A.W. (2015) Health warning: Might contain multiple personalities. The problem of homonyms in Thomson Reuters Essential Science Indicators, Scientometrics, vol. 105, no. 3, pp. 2259-2270. Available online... [Press coverage in The Times and the Times Higher Education].
Smart searching avoids many problems
You can avoid many of these namesake problems by smart searching. There are five simple steps (linked to separate pages of the PoP tutorial) that will cover the majority of problematic searches:
- First of all ensure you put quotes around your search, e.g. "J Smith", not J Smith. If you don't, Google will match the initial anywhere in the author record, so you might get publications by A Smith and J Jones.
- Second, if your author has normally published with multiple initials, e.g. "JK Smith", then use multiple initials.
- Third, if your author has only ever published with one initial, you can exclude namesakes with multiple initials in one fell swoop by excluding "J* Smith", "J** Smith", "J*** Smith".
- Fourth, if your author works in a field where journals typically list full given names, you can simply search for "John Smith".
- Fifth, if after these steps you are left with only a few publications that are not relevant, you can simply use selective exclusion to remove them.
Obviously, you can combine several of these steps for the best result.
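To make the first three steps concrete, here is a minimal sketch in Python that builds the quoted author name and the wildcard exclusions for a given author. It only composes the strings you would then type into the search form; it does not call Publish or Perish or Google Scholar, and the function name `author_search_terms` and its parameters are my own hypothetical choices for illustration.

```python
def author_search_terms(surname, first_initial, other_initials="", max_extra=3):
    """Build a quoted author name plus namesake exclusions (illustrative only).

    Returns a tuple (quoted_name, exclusions):
    - quoted_name is the name in quotes, e.g. "J Smith" or "JK Smith"
    - exclusions lists the wildcard patterns for namesakes with extra
      initials; it is only filled when the author uses a single initial
    """
    # Steps 1 and 2: always quote the name, using all known initials.
    quoted_name = f'"{first_initial}{other_initials} {surname}"'

    exclusions = []
    if not other_initials:
        # Step 3: the author only ever publishes with one initial, so
        # namesakes with one, two or three extra initials can be excluded.
        for n in range(1, max_extra + 1):
            stars = "*" * n
            exclusions.append(f'"{first_initial}{stars} {surname}"')
    return quoted_name, exclusions


name, excluded = author_search_terms("Smith", "J")
print(name)
print(" ".join(excluded))
```

For an author who normally publishes as "JK Smith", calling `author_search_terms("Smith", "J", "K")` returns the quoted two-initial name and an empty exclusion list, since the wildcard exclusions would otherwise remove the author's own publications.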
What if this still doesn't give you the result you want?
The above will give you a good result for many authors, but for some you will still get many irrelevant hits. Hence, you need stronger armory. Below I have listed five more strategies that can be used, each linking to a detailed example in the PoP tutorial.
- Use year restrictions: Useful if you know your author has, for instance, only published since 2002.
- Use multiple names: Useful if your author has published under multiple names (e.g. maiden/married name, original/anglicized name).
- Exclude co-authors: Useful if your author has only published with a limited number of co-authors; you can then exclude the co-authors of their namesakes.
- Use research field: Useful if your author has published in a designated research field whose keywords are likely to appear in their articles.
- Use affiliation: Useful if your author has only worked at a limited number of institutions.
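The logic behind the year restriction and co-author exclusion strategies can be sketched as a simple filter. This is an illustration only, assuming you have a list of results available as simple records; the `Record` fields and the `keep_record` function are hypothetical and do not correspond to a Publish or Perish export format or API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    # Hypothetical minimal record; real result lists carry more fields.
    title: str
    authors: List[str]
    year: int


def keep_record(record, first_year=None, last_year=None, excluded_coauthors=()):
    """Return True if the record passes the year and co-author checks."""
    # Year restriction: drop anything outside the author's active years.
    if first_year is not None and record.year < first_year:
        return False
    if last_year is not None and record.year > last_year:
        return False
    # Co-author exclusion: drop records that list a namesake's co-author.
    excluded = {name.lower() for name in excluded_coauthors}
    return not any(author.lower() in excluded for author in record.authors)


# Made-up sample data for a "J Smith" who has only published since 2002.
records = [
    Record("Paper A", ["J Smith", "A Jones"], 1995),
    Record("Paper B", ["J Smith", "B Lee"], 2005),
    Record("Paper C", ["J Smith", "C Namesake-Coauthor"], 2010),
]
clean = [
    r for r in records
    if keep_record(r, first_year=2002, excluded_coauthors=["C Namesake-Coauthor"])
]
print([r.title for r in clean])
```

Combining the two checks in one function mirrors the advice to combine strategies: each check removes a different group of namesake publications.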
New: Google Scholar Profile searches in PoP version 5 and later
Since Publish or Perish version 5 you can search GS Profiles as well as the "raw" Google Scholar data. This means that author disambiguation has been conducted by the author themselves. Thus it is much easier to get a "clean" publication and citation record for authors with common names.
Please note that Google Scholar and GS Profiles are two distinct data sources. Google Scholar contains the "raw" data, whereas a GS Profile is created and curated by the author themselves. If an author does not actively maintain their profile and has opted for automatic updates, the profile might contain publications that were not written by the author in question. This is especially common for authors with East Asian names.
Copyright © 2017 Anne-Wil Harzing. All rights reserved. Page last modified on Wed 15 Nov 2017 11:37
Anne-Wil Harzing is Professor of International Management at Middlesex University, London. In addition to her academic duties, she also maintains the Journal Quality List and is the driving force behind the popular Publish or Perish software program.