Publish or Perish version 8
Introducing Publish or Perish release 8
We are happy to announce the release of Publish or Perish version 8 today, 1 November 2021, 15 years after the first Publish or Perish software was made publicly available.
The new release consolidates the many new features that we have been adding over the past two years, but the biggest visible difference is the new layout: see the screen shot at the top of this post. With the new layout we have tried to achieve the following:
- A cleaner appearance with a more logical left-to-right flow of information details;
- More details visible for the currently selected paper;
- More space for future developments.
We felt that with the general increase of screen sizes over the past years (the previous layout was introduced with Publish or Perish version 5 in 2016), we could use a bit more horizontal screen space than before without compromising the total amount of information available at a glance.
Functionally Publish or Perish 8 is mostly identical to the latest version 7 release (7.33), but internally we have upgraded the underlying software libraries and made several improvements to data parsing. We have also added early support for Semantic Scholar and made preparations for further new features.
You can download the latest version of Publish or Perish from the following pages:
In a few weeks' time, existing users of Publish or Perish 7 and earlier will receive update notifications for the new version 8; until then, you must manually download and install Publish or Perish 8 if you want to start using it.
Below is an overview of the many new and improved features that have been added over the past two years and that are now consolidated in the new Publish or Perish 8.
- New data source: PubMed
- New data source: Semantic Scholar
- New metric: hA index
- Ability to search for free full-text version of results
- Download and export of abstracts
- Improved search reports: basic and extended
- Five new Google Scholar search features
- Facilitation of repeated searching
- Publish or Perish command line tools for advanced users
- Diagnostic log to improve transparency and replicability
- Training resources
Since version 6, Publish or Perish includes six data sources: Crossref, Google Scholar, Google Scholar Profiles, Microsoft Academic, Scopus and Web of Science. All six data sources cover the whole spectrum of disciplines, Life Sciences, Natural Sciences, Engineering, Social Sciences, and Arts & Humanities, although at varying levels.
We have now added the first subject-specific data source, PubMed, in order to facilitate research related to COVID-19. The PubMed search pane in Publish or Perish allows you to perform a PubMed advanced search and analyse its results; it contains a structured version of the parameters accepted by PubMed.
You can use Publish or Perish to do author, affiliation, journal, ISSN, title words, and keyword searches in PubMed, restricting the results by years if so desired. PubMed offers up to 1,000 results. PubMed doesn’t provide citations, so no metrics can be calculated beyond authors/paper.
However, as can be seen in the screenshot above, it can be useful to find out what a particular university (or author) has published on COVID-19 or any other medical topic. You can make the search as broad or as narrow as you like as PubMed has a fairly flexible search syntax.
Note: Publish or Perish uses the PubMed advanced search to allow users to search for authors, journals, ISSNs, affiliations, title words, and keywords, as well as use year limitations. This mirrors the options available in the other data sources. The results of this search will differ from those of the general PubMed search box on the website, which searches all (40) available PubMed fields and thus provides very broad results that need to be filtered manually.
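To make the difference between a fielded search and an all-fields search concrete, here is a minimal sketch of how structured parameters could be combined into a PubMed query string. It relies on PubMed's own field tags ([au], [ad], [ta], [ti], [dp]); the function name and its parameters are hypothetical, and this is not necessarily how Publish or Perish builds its queries internally.

```python
def build_pubmed_query(author=None, affiliation=None, journal=None,
                       title_words=None, years=None):
    """Combine structured search fields into one PubMed query string.

    Hypothetical helper for illustration only. `years` is an optional
    (first_year, last_year) tuple.
    """
    parts = []
    # Each value is quoted and restricted to a single PubMed field.
    for value, tag in ((author, "au"), (affiliation, "ad"),
                       (journal, "ta"), (title_words, "ti")):
        if value:
            parts.append(f'"{value}"[{tag}]')
    if years:
        first_year, last_year = years
        # [dp] is the publication date field; a colon denotes a range.
        parts.append(f"{first_year}:{last_year}[dp]")
    return " AND ".join(parts)


query = build_pubmed_query(
    affiliation="University of Melbourne",
    title_words="COVID-19",
    years=(2020, 2021),
)
print(query)
# "University of Melbourne"[ad] AND "COVID-19"[ti] AND 2020:2021[dp]
```

Typing the same words into the general search box without any field tags would match them in any field, which is why that route returns much broader results.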
If you are confused about which data source to use, refer to this overview in the Publish or Perish Manual. It provides an overview of the eight data sources you can search through Publish or Perish. For detail on Author searches, Journal searches, General/keyword searches or Affiliation searches please refer to the relevant pages.
Shortly before the public release of Publish or Perish 8 we added another data source: Semantic Scholar. This is still at an experimental stage because the current version of Semantic Scholar only offers a basic version of keyword searching in its API.
The Semantic Scholar team intend to expand the API capabilities over time; when they do, Publish or Perish will be updated to use the new capabilities. In the meantime, feel free to explore this new data source; Publish or Perish contains a free API key for Semantic Scholar, courtesy of the S2 team.
Hardly a month goes by without a new metric being suggested in the bibliometrics literature. We therefore don’t introduce new metrics lightly in Publish or Perish. In Publish or Perish 7 we introduced the ACC [annual citation count] metric listing the number of papers with at least 1, 2, 5, 10 and 20 citations per year.
In Publish or Perish 8, we introduced a new metric correcting for the age of the paper: the hA index. Created by Yves Fassin and published in the ISSI newsletter, this is an h-index with paper citations corrected by the year of publication. Details of its calculation can be found here, but in short it is the largest number of papers in the dataset that have obtained at least hA citations per year on average.
In contrast to the Publish or Perish hIa index, introduced by Harzing, Alakangas and Adams, which divides the (individual) h-index by the total number of years in the data-set, the hA index divides the citation count of each paper by the age of that paper. The hA index can easily be verified manually by sorting the results in Publish or Perish on the citations per year column (see illustration).
Illustration of the hA index
My hA index in Google Scholar is 26, which means I have 26 publications that each have at least 26 citations per year on average. For my hA index to increase to 27, my 2015 publication “Why and how does shared language affect subsidiary knowledge inflows” in Journal of International Business Studies would need to increase its citations to an average of 27 citations per year.
For it to increase to 28, not only would my 2001 publication in Human Resource Management, “Who's in charge?”, need to increase its citations to an average of 28 per year, but the five preceding publications would also need to increase their citations to this same average.
The number of citations per year will decline naturally with each passing year if citations do not accrue at the same rate as before. Hence, the growth of a scholar's hA index typically declines over time and reaches a plateau for most mature scholars. Even maintaining a high hA index is not easy and is a mark of sustained scholarly impact.
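The manual verification described above (sort by citations per year, then count down the list) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Publish or Perish implementation: the function name is hypothetical, and a paper's age is assumed to be the current year minus the publication year, with a minimum of one year. Publish or Perish may count paper age differently.

```python
from datetime import date


def ha_index(papers, current_year=None):
    """Largest n such that n papers have at least n citations per year.

    `papers` is an iterable of (citations, publication_year) pairs.
    Assumption: age = current_year - publication_year, at least 1.
    """
    if current_year is None:
        current_year = date.today().year
    # Citations per year for each paper, highest first.
    rates = sorted(
        (cites / max(current_year - year, 1) for cites, year in papers),
        reverse=True,
    )
    ha = 0
    for rank, rate in enumerate(rates, start=1):
        if rate < rank:
            break
        ha = rank
    return ha


papers = [(100, 2011), (30, 2016), (9, 2018), (4, 2020)]
# Citations per year in 2021: 10, 6, 3 and 4, so three papers reach 3 or more.
print(ha_index(papers, current_year=2021))  # 3
```

Because the divisor grows every year, rerunning the same data with a later `current_year` can only keep the result the same or lower it, which is the plateau effect described above.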
Note: The metrics pane was simplified in earlier Publish or Perish versions as many users were confused by the many unfamiliar metrics. However, all metrics that were ever included in Publish or Perish are still available in the exported results. Hence, you can continue any longitudinal research projects that use these metrics.
Search for a free full text version of the results
A key new feature in Publish or Perish 8 is the ability to check for free full-text availability of any of the results. There are two ways to do this, both of which are available through the popup menu in the results list (right-click on a result for the menu) as well as through the new Paper details pane for the selected paper, as shown in the screen shot below.
- Open in full-text if available in the data source that you searched in, greyed out if no full text can be found in the data source in question.
- Find full-text with Unpaywall, searches the Unpaywall database for a full-text option.
Some notes and limitations:
- This feature is not available in Google Scholar Profile searches, but is available in all other data sources, including plain Google Scholar.
- Although in most cases the two options provide the same version of the paper, this is not always the case. You might wish to try both so as to find the “best” version.
- The free full-text version will often be a pre-publication version. If you prefer the officially formatted journal version and your library has access to the journal in question, you might wish to login to your university library and use “open article in browser” instead.
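For readers who want to run the same kind of lookup outside Publish or Perish, the sketch below queries the public Unpaywall REST API for a DOI and returns the best open-access link it reports. The helper names are hypothetical, the e-mail address is a placeholder you must replace with your own (Unpaywall asks for one on every request), and nothing here is meant to describe how Publish or Perish performs its own lookup.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

UNPAYWALL_API = "https://api.unpaywall.org/v2/"


def unpaywall_url(doi, email):
    """Build the lookup URL for one DOI."""
    return f"{UNPAYWALL_API}{quote(doi, safe='/')}?email={quote(email)}"


def best_free_link(record):
    """Return the PDF or landing-page URL of the best open-access copy.

    `record` is the decoded JSON response; returns None when Unpaywall
    lists no open-access location.
    """
    location = record.get("best_oa_location") or {}
    return location.get("url_for_pdf") or location.get("url")


def find_full_text(doi, email):
    """Query Unpaywall over the network.

    urlopen raises urllib.error.HTTPError for DOIs Unpaywall does not know.
    """
    with urlopen(unpaywall_url(doi, email), timeout=30) as response:
        return best_free_link(json.load(response))
```

A call such as `find_full_text(doi, "you@example.org")` returns a URL or `None`. As noted above, the copy found this way is often a pre-publication version.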
In addition to metrics and full reference details, abstracts are now downloaded by Publish or Perish. Abstracts for each paper are now visible in the new paper details panel, making it very easy to quickly scan articles for relevance.
They are also included with all the other meta-data when you export your results from PoP to a variety of formats, including BibTeX, Endnote, ISI, JSON, RefMan/Ris, and CSV (for importing into databases and spreadsheets), as well as a full search report. An overview of all the exporting options can be found in the Publish or Perish manual.
This new feature is ideal for reviewing a set of results in more detail, conducting content analyses in a research project, importing abstracts with other bibliographic details in a reference management program, or even creating word clouds from a set of article abstracts.
Note that not all data sources provide (complete sets of) abstracts. Here is a summary:
- Crossref: provides abstracts for some results only
- Google Scholar: provides the first few lines of the abstract only.
- Google Scholar Profile: no abstracts
- Microsoft Academic: provides abstracts for most results
- PubMed: provides abstracts for all results
- Scopus: no abstracts
- Semantic Scholar: no abstracts
- Web of Science: provides abstracts for all results
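As an illustration of the content-analysis and word-cloud uses mentioned above, the following sketch counts word frequencies across the abstracts in a CSV export. The column name "Abstract" is an assumption, so check the header row of your own export. Rows without an abstract (which will be common for some data sources, as the summary shows) are simply skipped.

```python
import csv
import io
import re
from collections import Counter


def abstract_word_counts(csv_file, column="Abstract", min_length=4):
    """Count words of at least `min_length` letters across all abstracts.

    `csv_file` is an open text file (or any iterable of CSV lines).
    The column name is an assumption about the export layout.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        abstract = row.get(column) or ""  # missing or empty abstract
        words = re.findall(r"[a-z]+", abstract.lower())
        counts.update(word for word in words if len(word) >= min_length)
    return counts


# Tiny in-memory stand-in for an exported file.
sample = io.StringIO(
    "Title,Abstract\n"
    "Paper A,Language barriers shape knowledge flows\n"
    "Paper B,\n"
    "Paper C,Shared language aids knowledge transfer\n"
)
counts = abstract_word_counts(sample)
print(counts["language"], counts["knowledge"])  # 2 2
```

With a real export you would pass `open("results.csv", newline="", encoding="utf-8")` (the file name is a placeholder) and feed the resulting counts to the word-cloud tool of your choice.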
Version 8 improves the search report we introduced in version 6. First, you can now create a report in the order in which results are sorted. This means you can create search reports with commonly used ordering systems such as (reverse) chronological/alphabetical. If you are more interested in citation impact, you can order your report by total number of citations or citations per year, listing the most cited papers first. However, you can also use any other order you may see fit such as ordering by journal, publisher or first author.
Second, we now include a choice between two types of reports:
- Basic: search terms, data retrieval, metrics, and basic results, i.e., full reference for all publications and citations per publication (incl. per year).
- Extended: as above, but results are provided in structured format by title with full bibliographic details, citations, the DOI (formatted as URL), and each paper's abstract, if available.
Here are the first two pages of a Basic Search Report, using Anne-Wil's publication record in Microsoft Academic as an example. This option is very handy if you need a concise overview of this information for performance appraisals, funding applications or simply want to copy a list of your publications and their citations to a CV or website.
Using the same data set as above, the first two pages of an Extended Search Report look like this (note the extensive abstracts and extended formatting of each paper's bibliographic details):
Although Publish or Perish includes eight different data sources, many of our users mainly use Google Scholar. In version 8, we therefore made a special effort to ensure we make the most of this data source. As a result, Publish or Perish 8 includes five additional Google Scholar features.
Search for related works
The Publish or Perish interface now includes an “open related works in browser” option. This is visible when you right-click on any result (see screenshot below). It allows you to view works that Google Scholar judges to be related to the result in question. This feature is very useful if you have identified a core article for a specific set of search parameters and would like to find similar articles.
DOIs were already available for most non-Google data sources, but Publish or Perish now also extracts the DOI for Google Scholar results (if present). DOIs are visible in the new Paper details pane and are included in all export options, including the exporting of references (see screenshot below).
Restrict year ranges to allow retrieval of all citing works
You can retrieve citing works of any publication in Publish or Perish by right-clicking on it and selecting the relevant option (see screenshot below). You can do this for a single publication, for all results of a query (this might take a long time as it puts a heavy load on Google Scholar), or for a sub-set of the results (see next section).
Retrieve citing works for a sub-set of the results
If you would like to retrieve citing works for a sub-set of the results, simply select the relevant results and right-click. This may be useful if you would like to establish the citing works for a particular research topic. For instance, I have written nine articles that relate to international mail surveys, focusing on their challenges, response rates, response styles and language effects.
To retrieve all publications citing these articles I first need to select only these nine articles. There are two ways to do this. The easiest way is to first uncheck all results (right-click anywhere in the results panel and select uncheck all) and then simply check only the publications you are interested in. The screenshot below shows the set of results.
Another way to do this would be to select each of the nine articles without first unchecking all results. This might be a quicker way if you are only interested in a few articles. The screenshot below shows what this would look like.
Once you have selected the relevant articles, right-click and click Retrieve Citing Works in Publish or Perish. PoP then retrieves all publications citing one or more of the articles in this set. Publish or Perish automatically de-duplicates citing publications, so if a particular publication cites more than one publication in your set, it will only be shown once.
The screenshot above shows the most highly cited citing works of the relevant set of publications. Once you have retrieved all citing works you can export them in any format for further analysis if so desired.
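The de-duplication step can be pictured as taking the union of the citing-works lists of all selected publications. The sketch below is only an illustration: the function name, the dictionary fields and the matching rule (a case-insensitive title comparison) are assumptions, and Publish or Perish may identify duplicates differently.

```python
def merge_citing_works(citing_lists, key=lambda work: work["title"].casefold()):
    """Merge several lists of citing works, keeping each work only once.

    Works are considered identical when `key(work)` matches. The result
    is sorted by citation count, most cited first.
    """
    merged = {}
    for works in citing_lists:
        for work in works:
            # Keep the first record seen for each key.
            merged.setdefault(key(work), work)
    return sorted(merged.values(),
                  key=lambda work: work.get("cites", 0), reverse=True)


citing_paper_1 = [{"title": "Paper X", "cites": 5}, {"title": "Paper Y", "cites": 9}]
citing_paper_2 = [{"title": "paper x", "cites": 5}, {"title": "Paper Z", "cites": 1}]

for work in merge_citing_works([citing_paper_1, citing_paper_2]):
    print(work["title"], work["cites"])
# Paper Y 9
# Paper X 5
# Paper Z 1
```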
What if there are more than 1,000 results?
Google Scholar never provides more than 1,000 results. This can create a problem if you want a complete set of citing works for highly cited publications. We therefore added year ranges to retrieve citing works in Publish or Perish. To use this feature, you will need to cancel the search after it has started and include the year ranges at that stage.
This allows you to “partition” your search when there are more than 1,000 citing works. This is very useful if you are doing an analysis of all works citing a seminal contribution in a particular field. Simply copy the search as many times as you need and adapt the year ranges. After that you can aggregate these results again to a single record in Publish or Perish. For details on how to do this see the tutorial page on aggregation.
Obviously, you can also use this if you are interested only in citations in a specific year or, for instance, citations in the last 3 or 5 years. The screenshot below shows the first seven of some 1,100 results of a citing works search for the seven articles I published in the Journal of International Business Studies, limited to citations from 2016 onwards to retrieve only recent citations.
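When planning such a partitioned search, it can help to list the year ranges up front so that no year is skipped or covered twice. The helper below is a hypothetical planning aid, not part of Publish or Perish; you would still enter each range by hand in a copy of the search.

```python
def year_partitions(first_year, last_year, width):
    """Split an inclusive year span into consecutive, non-overlapping ranges."""
    if width < 1:
        raise ValueError("width must be at least 1")
    ranges = []
    start = first_year
    while start <= last_year:
        end = min(start + width - 1, last_year)
        ranges.append((start, end))
        start = end + 1
    return ranges


print(year_partitions(2001, 2021, 5))
# [(2001, 2005), (2006, 2010), (2011, 2015), (2016, 2020), (2021, 2021)]
```

If any single range still returns 1,000 results, that range is probably truncated and should be split further, for example by rerunning the helper on that range with a smaller `width`.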
Limit number of search results
We added a drop-down box allowing you to restrict the maximum number of results for individual Google Scholar searches (see screenshot). Seven options are pre-set, but you can type any number in the box. As this is a per-search limit, the choice does not carry over to other searches, except when you use the “duplicate the current search” option.
This new option addresses Google Scholar limitations and makes short searches more convenient. The underlying reason is that Google Scholar only provides 10 results at a time, so to get 1,000 results Publish or Perish needs to send 100 subrequests to Google Scholar. This means you will quickly approach the request rate limits set by Google Scholar, which in turn causes Publish or Perish to limit the request rate by pausing longer and longer between requests. This means searching might become quite slow.
If you only need the top-10/20/50 results, limiting the maximum number of results reduces the load on Google Scholar and speeds up your searches. This new feature is also useful if you are still fine-tuning a complex search and don’t need to see all results to establish whether your search is providing the expected output.
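The arithmetic behind this is simple enough to sketch. The function name is hypothetical; the page size of 10 is the figure given above.

```python
import math


def subrequests_needed(max_results, page_size=10):
    """Number of result pages that must be requested for `max_results` results."""
    return math.ceil(max_results / page_size)


for limit in (10, 20, 50, 1000):
    print(limit, subrequests_needed(limit))
# 10 1
# 20 2
# 50 5
# 1000 100
```

Limiting a search to the top 20 results therefore needs 2 subrequests instead of 100, which is why short searches are much less likely to run into the request rate limits.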
Include or exclude stray citations and/or patents
The Publish or Perish interface now includes an option to include or exclude [citation] results and patents in Google Scholar searches (see screenshot above).
What are [citation] records?
[citation] results are records where Google Scholar has found citing works, but has been unable to find the cited work online. For more details see this tutorial page. Often, these [citation] records are what are commonly called “stray citations”, i.e., citations where citing authors have made small mistakes in citing a work. These can be merged into the master record for the work in question, which will potentially increase your h-index. For more details on this see the blogpost: How to merge “stray citation” records?
In most cases though these stray citations only clutter your result. If you do an author search it makes the author’s publication record look very messy, making it difficult to establish an academic’s primary publications, especially if they are recent and not yet highly cited. If you are doing a journal, title, or keyword search these stray citations are often pure “noise” and do not contribute anything useful to your search. So, excluding them makes a lot of sense in many searches.
[citation] records and author searches
However, it is important to note that Google Scholar assigns the [citation] label to any publication where it cannot find the record online, even if the publication in question is very significant and highly cited. This might include many – though not all – non-journal publications [e.g. books, reports]. So, excluding [citation] records might lead to fewer publications. This might lead to an underestimation of an author’s impact, especially in the Social Sciences and Humanities. It might also lead you to miss seminal books if you do a keyword search for a literature review.
In my own publication record excluding [citation] records reduces the number of results from 407 to 183. Most of the excluded publications are pure dross and are not missed. However, my citations are reduced by nearly 15% and my h-index is reduced from 68 to 63 because three highly cited books, plus the Publish or Perish software and the Journal Quality List are excluded. The other metrics are likewise reduced.
So, if you need a complete record for yourself or an academic you are evaluating, including [citation] records might be essential, especially in the Social Sciences and Humanities. If all you are after is an assessment of someone’s journal publications or if you are doing a literature review in disciplines where only journal articles are important, then excluding them will dramatically simplify the output and speed up your search at the same time.
Search partitioning to retrieve all cited works
We have facilitated search partitioning for cited works for searches in all data sources. Many data sources only provide a limited number of results, most typically 200. This means you may not be able to get all results for a particular author, journal, university or set of keywords without partitioning your search by year.
Publish or Perish now makes search partitioning easy: when you click the New button and select Duplicate Current Search, the new search will be pre-set to the same parameters as the original search, so all you have to do is adjust the year range in the new search (see above screenshot).
Repeat searches across universities, authors or journals
Additionally, this feature is also useful for these two use cases:
- If you want to run a large range of searches with broadly the same search parameters and only want to change one parameter for each new search. For instance, you may want to:
  - do a search for a particular set of keywords for a list of universities. Using this feature you can duplicate the search and only need to replace the university name.
  - do a search for a particular set of journals for a list of authors. Using this feature you can duplicate the search and only need to replace the author name (see screenshot).
  - do a search for a particular set of keywords for a list of journals. Using this feature you can duplicate the search and only need to replace the journal name.
- If you want to update a particular search without losing the original result, i.e., do longitudinal searches.
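Conceptually, duplicating a search means copying every parameter and overriding exactly one. The sketch below models that idea with a Python dataclass. The `SearchQuery` class, its fields and the example values are all hypothetical and do not mirror any Publish or Perish file format or API.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SearchQuery:
    """Hypothetical stand-in for the parameters of one search."""
    data_source: str
    keywords: str = ""
    author: str = ""
    affiliation: str = ""
    journal: str = ""
    year_from: Optional[int] = None
    year_to: Optional[int] = None


def duplicate_for_each(base, field_name, values):
    """Return one copy of `base` per value, changing only `field_name`."""
    return [replace(base, **{field_name: value}) for value in values]


base = SearchQuery(data_source="Scopus", keywords="international mail surveys")
searches = duplicate_for_each(base, "affiliation",
                              ["University A", "University B"])
for search in searches:
    print(search.affiliation, "|", search.keywords)
# University A | international mail surveys
# University B | international mail surveys
```

The same pattern covers the other cases: pass "author" or "journal" as the field to vary, or "data_source" to repeat one search across data sources. The original query object is never modified, which parallels keeping the original result when you duplicate a search.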
Repeat searches across data sources
Finally, this new feature also makes it easier to repeat the same search across a variety of data sources. If, for instance, you wanted to repeat an author search done in Scopus in Google Scholar to see whether this results in a larger number of publications and citations, simply click on New and select Google Scholar (see screenshot below).
This will repeat the search with the same search parameters. However, please do note that different data sources might have a slightly different search syntax. In that case you will still need to change the duplicated search. For details on the search syntax in various data sources, see the Publish or Perish manual for author, journal, affiliation and keyword searches.
For advanced users only, we provide an edition of the Publish or Perish software that can be used from the command line, in both manual and automated scenarios, to perform publication searches.
The use of this software edition requires that you are familiar with command line use on the platform of your choice, in particular with the applicable rules for the quoting of meta characters and other text that has special meaning to the command shell.
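As a generic illustration of the quoting issue (and not of the Publish or Perish command line syntax itself, which is not shown here), the sketch below uses Python's standard `shlex` module to quote a search phrase so that a POSIX shell passes it on as one unchanged argument. The rules for the Windows command prompt and PowerShell differ, so treat this as an example for Unix-like shells only.

```python
import shlex

# A search phrase containing spaces and double quotes, both of which
# have special meaning to the shell.
phrase = '"international mail surveys" AND response'

# shlex.quote wraps the text so that a POSIX shell treats it as one argument.
quoted = shlex.quote(phrase)
print(quoted)

# Round trip: splitting the quoted text the way a POSIX shell would
# yields the original phrase again, as a single argument.
arguments = shlex.split(quoted)
print(arguments == [phrase])  # True
```

When a script launches a command itself, passing the arguments as a list to `subprocess.run` avoids the shell, and with it the quoting problem, altogether.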
We improved the diagnostic log, allowing the user to see exactly what search strings Publish or Perish sends to the various data sources. This allows advanced users to verify that the searches they are running are translated as they intend. It also allows them to replicate the search on the web platform in question if so desired.
You will find this log on the Publish or Perish main menu:
- Windows - use the Help > Open Diagnostic Log command from the main menu bar
- macOS - use the Publish or Perish > Open Diagnostic Log command from the main menu bar
In the past two years, we have also improved usability by creating blogposts and web-pages offering a wide range of support resources, as well as several YouTube videos:
- Publish or Perish training resources
- Using Publish or Perish for meta-analyses
- The changing usage of Publish or Perish over the years: where, why, when, what & who?
- New: Publish or Perish now also exports abstracts
- How to use Publish or Perish effectively?
- Video resources for Publish or Perish and related topics
This release also includes many hundreds of detail improvements and corrections. The most important ones are mentioned in the Release notes.
Support Publish or Perish
Development of the Publish or Perish software is a volunteering effort that has been ongoing since 2006, regularly adding new features and data sources and expanding use cases and geographical distribution.
To keep Publish or Perish free (gratis) for everyone, your contribution toward the costs of hosting, bandwidth, and software development is appreciated. If you find Publish or Perish useful, then this is your chance to say "thank you" to the developers.
You can support us by buying the Publish or Perish guide or tutorial and/or through a donation. Only one user out of every five thousand contributes (that is, only 0.02% of all users!), so any support is very welcome indeed.
Copyright © 2021 David Adams. All rights reserved. Page last modified on Wed 29 Dec 2021 20:15
Web master of Harzing.com and developer of the Publish or Perish software, among other things. He holds BSc and MSc degrees in Electrical Engineering, a PhD in Operations Research, and likes to watch academic life from a safe distance.