Data sources

Publish or Perish is a software program that retrieves and analyzes academic citations from external data sources; it does not include a database of its own.

The currently available data sources are:

Data source               Notes
Crossref                  Freely available
Google Scholar            Freely available
Google Scholar Profile    Freely available
Microsoft Academic        Requires a free subscription from Microsoft
PubMed                    Freely available
Scopus                    Requires a free API key from Elsevier
Web of Science            Requires a subscription from Clarivate (typically provided by your organisation)
External data import      Allows importing of externally obtained data from Web of Science, RefMan, EndNote, and many others

Please note: We cannot guarantee the continued availability of any of these data sources. The Publish or Perish software has been continuously maintained and updated since 2006, and during that time we have seen many changes to data sources and their content.

Although we have been able to keep Publish or Perish compatible with whatever data sources and information were available at any time, unforeseen circumstances may force us to (reluctantly) abandon one or more data sources.

Your continued support and advocacy of both the Publish or Perish software and the external data sources that we use will help to ensure that you and we can keep using Publish or Perish in the future. Thanks in advance for your support of Publish or Perish.

Which data source to use?

This table provides a quick overview of the advantages and disadvantages of the various data sources. Please note that any restrictions and constraints in these data sources are inherent to the sources themselves. Publish or Perish doesn't restrict or otherwise change results; it is simply an interface to these data sources.

If you require specific information about how to use these data sources for author searches, journal searches, general/keyword searches, or affiliation searches, refer to the manual page for the relevant search type.

Data source Advantages Disadvantages
Crossref
  Advantages:
  • No need for a subscription key
  • Usually provides cleaner results and fewer irrelevant results than Google Scholar and Microsoft Academic
  • Good data source for “any of the words” keyword searches and journal searches by ISSN
  • Search speed fast to medium
  • Includes abstracts in results
  Disadvantages:
  • Typically reports fewer citations than Google Scholar and Microsoft Academic, because it includes fewer journals in some fields and has very limited coverage of books, book chapters and conference papers
  • Only provides a maximum of 200 results
  • NOT and AND in keyword searches are ignored; Crossref reverts to OR searches
  • Author search is problematic as it is very difficult to disambiguate authors
  • Year of publication is sometimes (but not always) the year of online-first publication, not the year of print publication
Google Scholar (GS)
  Advantages:
  • No need for a subscription key
  • "Forgiving" search syntax
  • Provides a maximum of 1000 results
  • Usually provides the largest number of publications and citations
  • Search speed medium for a single search with a limited number of results
  • Includes abstracts in results
  Disadvantages:
  • Usually provides a larger number of irrelevant results than other data sources
  • Author disambiguation is more difficult than in GSP, MA and WoS
  • Many search results contain truncated data
  • Year of publication is sometimes (but not always) the year of online-first publication, not the year of print publication
  • Request rate limiter may slow down searches considerably
Google Scholar Profile (GSP)
  Advantages:
  • No need for a subscription key
  • Most "forgiving" search syntax
  • Provides a maximum of 1000 results
  • Quick search, 1-3 seconds for most authors
  • Manual curation by the academic usually means cleaner results than GS
  Disadvantages:
  • Only available if the academic in question has set up a profile
  • Can contain "dirty data" if the academic has not curated their profile
  • Can be manipulated by unscrupulous academics who add papers they did not write
  • Does NOT include abstracts in results
Microsoft Academic (MA)
  Advantages:
  • Provides a maximum of 5000 results
  • Usually provides a smaller number of irrelevant results than GS
  • Search speed very fast for searches with fewer than 500 results, and still fast for repeated searches and searches with many results
  • Usually provides cleaner results than GSP, as not all users curate their GSP
  • Seems to provide the best automated author disambiguation
  • Includes abstracts in results
  Disadvantages:
  • More restrictive search syntax
  • The use of NOT in keyword searches results in an error message
  • Year of publication is usually the year of online-first publication, not the year of print publication
  • Requires a (free) Microsoft subscription key
PubMed
  Advantages:
  • No need for a subscription key
  • Provides a maximum of 1000 results for journal and keyword searches; higher maximum with a free NCBI API key
  • Includes abstracts in results
  Disadvantages:
  • Specialized data source for biomedical subjects only
  • Only provides a maximum of 199 results for author and affiliation searches
  • Does NOT include citations; hence citation metrics cannot be calculated
Scopus
  Advantages:
  • Provides cleaner results and fewer irrelevant results than Google Scholar and Microsoft Academic
  • Typically reports more citations than Web of Science
  • Very good data source for keyword searches and journal searches by ISSN
  Disadvantages:
  • Requires a free API key from Elsevier for basic results
  • Only provides a maximum of 200 results; full results require a (non-free) subscription
  • Search speed slow, but still acceptable
  • Typically reports fewer citations than Google Scholar and Microsoft Academic, because it includes fewer journals in some fields and has limited coverage of books, book chapters and conference papers
  • Does NOT include abstracts in results
Web of Science (WoS)
  Advantages:
  • Provides cleaner results and fewer irrelevant results than Google Scholar and Microsoft Academic
  • Search speed very fast for nearly any kind of search
  • Allows wildcards (e.g. global*) for easier searches
  • Very good data source for keyword searches and journal searches by ISSN
  • Includes abstracts in results
  Disadvantages:
  • Requires a (non-free) subscription
  • Only provides a maximum of 200 results
  • Typically reports fewer citations than all other sources, because it includes fewer journals in many fields (esp. Social Sciences and Humanities) and has very limited coverage of books, book chapters and conference papers
  • Typically the last data source to include recent publications, as it doesn’t include “in press” papers
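
The comparison above can also be read as a small lookup table. The following is a minimal Python sketch for narrowing down candidate data sources; it is not part of Publish or Perish. The names DataSource, SOURCES and candidates are hypothetical and introduced only for illustration, and the values are transcribed from the table above. For PubMed it records the 1000-result limit for journal and keyword searches, not the 199-result limit for author and affiliation searches.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class DataSource:
    """One row of the comparison table (hypothetical structure)."""
    name: str
    access: str          # "free", "free key" or "subscription"
    max_results: int     # maximum number of results per search
    has_abstracts: bool  # abstracts included in results
    has_citations: bool  # citation counts available


# Values transcribed from the table above; nothing here is queried live.
SOURCES = [
    DataSource("Crossref", "free", 200, True, True),
    DataSource("Google Scholar", "free", 1000, True, True),
    DataSource("Google Scholar Profile", "free", 1000, False, True),
    DataSource("Microsoft Academic", "free key", 5000, True, True),
    # PubMed: 1000 applies to journal/keyword searches only.
    DataSource("PubMed", "free", 1000, True, False),
    DataSource("Scopus", "free key", 200, False, True),
    DataSource("Web of Science", "subscription", 200, True, True),
]


def candidates(
    need_citations: bool = False,
    need_abstracts: bool = False,
    min_results: int = 0,
    access: Iterable[str] = ("free", "free key", "subscription"),
) -> List[DataSource]:
    """Return the sources that satisfy all of the given requirements."""
    allowed = set(access)
    return [
        s for s in SOURCES
        if s.access in allowed
        and s.max_results >= min_results
        and (s.has_citations or not need_citations)
        and (s.has_abstracts or not need_abstracts)
    ]


if __name__ == "__main__":
    # Sources usable without any key that offer citations and abstracts.
    for source in candidates(need_citations=True, need_abstracts=True,
                             access=("free",)):
        print(source.name)
```

A filter like this only captures the hard constraints (access, result limits, abstracts, citations). Softer criteria from the table, such as result cleanliness or author disambiguation, still need to be weighed by hand.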

What data sources can PoP search in?