Publish or Perish command line tools

For command line use by advanced users only, we provide an edition of the Publish or Perish software that performs publication searches in both manual and automated scenarios.

The use of this software edition requires that you are familiar with command line use on the platform of your choice, in particular with the applicable rules for the quoting of meta characters and other text that has special meaning to the command shell. (If that doesn't mean anything to you, then please use the GUI version of Publish or Perish.)

Note: In contrast to the GUI edition, the command line tool does not have the ability to display Google Scholar CAPTCHAs. If you perform too many queries and Google Scholar requires you to solve a CAPTCHA, the command line search will simply terminate. If that happens, start the GUI edition of Publish or Perish and perform a Google Scholar search to trigger and solve the CAPTCHA, then exit the GUI edition and retry the command line search.

License agreement

Publish or Perish is provided courtesy of Harzing.com. It is free for personal non-profit use; please refer to the End User License Agreement for the full licensing terms and conditions.

How to cite the Publish or Perish software

If you are using the Publish or Perish software in one of your research articles or otherwise want to refer to it, please use the following format:

Harzing, A.W. (2007) Publish or Perish, available from https://harzing.com/resources/publish-or-perish

Download information

The Publish or Perish command line tool is distributed as a compressed archive (.zip or .gz, depending on the platform) from the Harzing.com web site.

For a list of changes in each release, see "Changes in this version" in the documentation.

Platform                   Download link                           Version
Windows (7 and later)      Publish or Perish command line tools    7.27.2949 (2 October 2020)
macOS (10.13 and later)    Publish or Perish command line tools    7.27.2949 (2 October 2020)
Linux (x86_64)*            Publish or Perish command line tools    7.27.2949 (2 October 2020)
                           (requires cURL to be installed on your system)

*The Linux version should run on most x86_64 Linux 5.x kernels, and possibly on earlier ones. The current version was built and tested on Fedora Linux 32 and also tested on Ubuntu Linux 19.10 and 20.04.

Installation

The command line tool does not require any installation beyond extraction from the distribution archive, but see the Configuration section below.

Windows

(Version numbers may vary from the output shown below.)

C:\>unzip pop7tools.zip
Archive:  pop7tools.zip
  inflating: pop7error.exe
  inflating: pop7metrics.exe
  inflating: pop7query.exe

C:\>pop7query --info
Publish or Perish command line utility 7.21.2818.7447 (2020.05.21.1437) Windows (x86)
(c) 1990-2020 Tarma Software Research Ltd

Built on Windows 10.0.18363 (x64) (Microsoft C/C++ 140050727)
Running on Windows 10.0.18363 (x86) 18362.1.amd64fre.19h1_release.190318-1202

App data key:     Publish or Perish
App data dir:     C:\Users\David\AppData\Roaming\Publish or Perish
Roaming settings: C:\Users\David\AppData\Roaming\Publish or Perish\settings.json
Local settings:   HKCU\Software\Tarma Software Research\Publish or Perish
PoP root:         C:\Users\David\AppData\Roaming\Publish or Perish
PoP results:      C:\Users\David\AppData\Roaming\Publish or Perish\Results6
PoP temp:         C:\Users\David\AppData\Roaming\Publish or Perish\Temp
PoP trash:        C:\Users\David\AppData\Roaming\Publish or Perish\Trash
Log path:         C:\Users\David\AppData\Local\Temp\pop7query.log

macOS

(Version numbers may vary from the output shown below.)

% tar xzf pop7tools_macos.tar.gz
% ./pop7query --info
Publish or Perish command line utility 7.21.2806.7444 (2020.05.18.1515) Darwin (x86_64)
(c) 1990-2020 Tarma Software Research Ltd

Built on Darwin 19.4.0 (x86_64) (clang C/C++ 4.2.1 Compatible Apple LLVM 11.0.3 (clang-1103.0.32.59))
Running on Darwin 19.4.0 (x86_64) Darwin Kernel Version 19.4.0: Wed Mar  4 22:28:40 PST 2020; root:xnu-6153.101.6~15/RELEASE_X86_64

App data key:     Publish or Perish
App data dir:     /Users/dave/Library/Application Support/Publish or Perish
Roaming settings: /Users/dave/Library/Application Support/Publish or Perish/settings.json
Local settings:   /Users/dave/Library/Preferences/Publish or Perish.json
PoP root:         /Users/dave/Library/Application Support/Publish or Perish
PoP results:      /Users/dave/Library/Application Support/Publish or Perish/Results6
PoP temp:         /Users/dave/Library/Application Support/Publish or Perish/Temp
PoP trash:        /Users/dave/Library/Application Support/Publish or Perish/Trash
Log path:         /Users/dave/Library/Logs/pop7query.log

Linux

(Version numbers may vary from the output shown below.)

$ tar xzf pop7tools-linux.tar.gz
$ ./pop7query --info
Publish or Perish command line utility 7.21.2806.7444 (2020.05.18.1612) Linux (x86_64)
(c) 1990-2020 Tarma Software Research Ltd

Built on Linux 5.6.7-300.fc32.x86_64 (x86_64) (GNU C/C++ 10.0.1 20200430 (Red Hat 10.0.1-0.13))
Running on Linux 5.6.7-300.fc32.x86_64 (x86_64) #1 SMP Thu Apr 23 14:13:50 UTC 2020

App data key:     .publish_or_perish
App data dir:     /home/dave/.publish_or_perish
Roaming settings: /home/dave/.publish_or_perish/settings.json
Local settings:   Dummy settings; not persistent
PoP root:         /home/dave/.publish_or_perish
PoP results:      /home/dave/.publish_or_perish/Results6
PoP temp:         /home/dave/.publish_or_perish/Temp
PoP trash:        /home/dave/.publish_or_perish/Trash
Log path:         /home/dave/.logs/pop7query.log

Configuration

Some data sources, for example Microsoft Academic and Scopus, require an API key for their use. The command line edition of Publish or Perish uses the same configuration settings as the GUI edition, so we recommend that you use the GUI version of Publish or Perish to set up the desired API keys before using the command line edition.

If there is no GUI edition of Publish or Perish for your platform, then you can copy the settings.json file from another platform; the configuration settings are platform-independent.

Use the pop7query --info command to see where the settings.json file is stored for the command line tool (and the GUI edition, if any).
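
To locate settings.json from a script, the labelled lines printed by pop7query --info can be parsed. The following is a minimal sketch, assuming the output keeps the "Label: value" layout shown in the Installation section; the function name parse_info and the sample text are illustrative and not part of the tool.

```python
# Labels as they appear in the sample --info output in the Installation section.
INFO_LABELS = (
    "App data key", "App data dir", "Roaming settings", "Local settings",
    "PoP root", "PoP results", "PoP temp", "PoP trash", "Log path",
)

def parse_info(info_text):
    """Return a dict of the labelled values found in --info output."""
    values = {}
    for line in info_text.splitlines():
        # Split on the first colon only, so Windows paths (C:\...) stay intact.
        label, sep, value = line.partition(":")
        if sep and label.strip() in INFO_LABELS:
            values[label.strip()] = value.strip()
    return values

# Sample text copied from the Linux output shown earlier (not a live call).
SAMPLE = """\
App data dir:     /home/dave/.publish_or_perish
Roaming settings: /home/dave/.publish_or_perish/settings.json
Local settings:   Dummy settings; not persistent
Log path:         /home/dave/.logs/pop7query.log
"""

info = parse_info(SAMPLE)
print(info["Roaming settings"])
```

In practice the text would come from the captured standard output of pop7query --info rather than from a literal string.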

Syntax

  • pop7query --info [--all|--crossref|--gsauthor|--gscholar|--gsciting|--gsprofile|--masv2|--pubmed|--scopus|--wos] [outfile]
  • pop7query options [--crossref|--gsauthor|--gscholar|--gsciting|--gsprofile|--masv2|--pubmed|--scopus|--wos] [outfile]

With --info, only configuration information about the program is printed; no search is performed. By default only version and path information are included; to see data source settings, use --all for all data sources or specify a data source to see information about that data source only.

Without --info, a publication search is performed. By default Google Scholar is used as the data source, but you can specify another data source through the listed data source options.

After the publication search the results are optionally sorted, then written to outfile if specified, else to stdout. Diagnostic and progress information is written to stderr, so you can redirect the two streams separately if desired.
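
When the tool is driven from a script, the two streams can be captured separately. The sketch below uses Python's standard subprocess module; the helper name run_separately is illustrative, and a stand-in child process replaces pop7query so that the example runs without the tool installed.

```python
import subprocess
import sys

def run_separately(argv):
    """Run argv and return (exit_code, stdout_text, stderr_text)."""
    done = subprocess.run(argv, capture_output=True, text=True)
    return done.returncode, done.stdout, done.stderr

# Stand-in command: writes "results" to stdout and "progress" to stderr,
# mimicking how results and progress information are kept apart.
demo = [sys.executable, "-c",
        "import sys; print('results'); print('progress', file=sys.stderr)"]

code, out, err = run_separately(demo)
print(code, out.strip(), err.strip())
```

For a real search, the demo list would be replaced by something like ["./pop7query", "--author", "a harzing", "--format", "json"], with the program path adjusted for your platform.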

The data output format is the first of:

  1. the format specified by the --format option (see below)
  2. the extension of outfile, if any:
    .csv   CSV (comma-separated values)
    .json  JSON (JavaScript object notation)
    .rtf   RTF (full search report as Rich Text)
    .tsv   TSV (tab-separated values)
  3. the default, which is currently CSV output.
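
The precedence above can be expressed as a small function, for example to predict the format before calling the tool. This is a sketch of the documented rule, not the tool's own code; resolve_format is a hypothetical helper, and matching the extension case-insensitively is an assumption.

```python
import os

# Extensions listed in step 2 above.
EXTENSION_FORMATS = {".csv": "csv", ".json": "json", ".rtf": "rtf", ".tsv": "tsv"}
DEFAULT_FORMAT = "csv"  # step 3: the current default

def resolve_format(format_option=None, outfile=None):
    """Return the output format implied by --format and the outfile name."""
    if format_option:                      # step 1: explicit --format wins
        return format_option
    if outfile:                            # step 2: recognised extension
        extension = os.path.splitext(outfile)[1].lower()  # assumption: case-insensitive
        if extension in EXTENSION_FORMATS:
            return EXTENSION_FORMATS[extension]
    return DEFAULT_FORMAT                  # step 3: fall back to the default

print(resolve_format("bibtex", "results.csv"))
print(resolve_format(None, "results.json"))
print(resolve_format(None, "results.txt"))
```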

For publication searches, the following query field options are available. Please note that:

  • Not all data sources support all query fields. Unsupported fields are reported, but otherwise ignored.
  • The per-field syntax is similar to the one used in the GUI edition and may include "quotes" and boolean operators such as AND and OR -- but see the next point.
  • If entered on a shell command line (as opposed to a programmatically constructed command line or an argument file used with the -f option, below) you may have to use additional quoting to prevent the shell from eating or misinterpreting "quotes" or other special characters. The details depend on the shell in question, but for most shells enclosing the entire field expression (but not the --option itself) in an additional set of 'single' quotes will be sufficient.
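
For POSIX-style shells, the extra quoting can be generated rather than typed by hand. The sketch below uses Python's standard shlex module (shlex.join requires Python 3.8 or later); it does not apply to the Windows command prompt, which has different quoting rules.

```python
import shlex

# A field expression that contains its own "quotes" and a boolean operator.
field = '"a harzing" AND "d adams"'
argv = ["pop7query", "--author", field, "--years", "2000-2018"]

# shlex.join wraps each argument that needs it in single quotes.
command_line = shlex.join(argv)
print(command_line)

# Splitting the line the way a POSIX shell would gives back the same arguments.
assert shlex.split(command_line) == argv
```

When the command is started programmatically, passing the argument list directly (without going through a shell) avoids the quoting problem altogether.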

Query fields:

--author authorspec
Specify the author(s) for which to search.
--affiliation affiliation
Specify the author affiliation(s) for which to search. This is not widely supported.
--citedid identifier
Specify the data source's document ID for the document for which you want to retrieve the citing references. This is currently only supported for --gsciting.
--field fieldofstudy
Specify the field(s) of study for which to search. This is not widely supported and the field names are often idiosyncratic.
--issn issn
Specify the ISSN for which to search.
--journal journalname
Specify the journal(s) for which to search.
--title words
Specify the title word(s) for which to search.
--keywords words
Specify the keyword(s) for which to search; these may occur in the abstract, full text, or even (also) in the title; the details depend on the data source.
--years from-to
Specify the first and last years of the publications in which you are interested.
--raw syntax
Use syntax verbatim for the query; if this option is used, all other query fields are ignored. The syntax must fit the data source in question.
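
For automated use, the query fields can be assembled into an argument list before the tool is started. The following is a sketch under stated assumptions: build_query_args is a hypothetical helper, the option names are taken from the lists above, and dropping the other fields when raw is given simply makes the documented behaviour explicit.

```python
DATA_SOURCES = ("crossref", "gsauthor", "gscholar", "gsciting", "gsprofile",
                "masv2", "pubmed", "scopus", "wos")
QUERY_FIELDS = ("author", "affiliation", "citedid", "field", "issn",
                "journal", "title", "keywords", "years", "raw")

def build_query_args(source=None, **fields):
    """Return pop7query arguments (without the program name) for a search."""
    unknown = sorted(set(fields) - set(QUERY_FIELDS))
    if unknown:
        raise ValueError("unknown query field(s): " + ", ".join(unknown))
    if source is not None and source not in DATA_SOURCES:
        raise ValueError("unknown data source: " + source)
    if "raw" in fields:
        # With --raw all other query fields are ignored, so leave them out.
        fields = {"raw": fields["raw"]}
    args = []
    for name in QUERY_FIELDS:
        if name in fields:
            args += ["--" + name, str(fields[name])]
    if source is not None:
        args.append("--" + source)  # data source follows the options, as in the syntax line
    return args

print(build_query_args(source="scopus", author="a harzing", years="2000-2018"))
```

The resulting list can be appended to the program path and handed to a process launcher, so that no shell quoting is needed.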

Executive options:

-f argfile
Read additional options from argfile (one option per line) and process them at that point of the command line. This allows more complicated options to be specified without additional quoting and also allows annotation of options with comments (comment lines start with a # character). Note that you must use an equals sign '=' between the option name and its value when specifying options in the file (this is also allowed, but not required, on the command line). Example file contents:
# This is a comment
--author="a harzing" AND "d adams"
--keywords=publish perish
--years=2000-2018
--dryrun
Perform all actions except submitting the query to the data source. This is useful to check that the configuration is correct.
--syntax
Create an appropriate query, then print the resulting query syntax as it would be sent to the data source. No query is actually submitted.
Note: Some data sources treat year ranges as a separate filtering option instead of a query field. The value of the --years option may therefore be missing from the query syntax; this does not mean that it was somehow "forgotten". It just means that it is a non-syntax parameter for the data source in question.
--direct
Bypass the Publish or Perish cache and submit the query directly to the data source, even if the query would otherwise be satisfied from the cache.
--max number
Retrieve no more than number results, even if more would be available.
Note: Because of the granularity of the request size (typically 10, 20, or 100, depending on the data source), the final number of results may be rounded up to the next multiple of the request size.
--maxage hours
Submit the query directly to the data source if the query's cached data are older than the given number of hours. This overrides the built-in cache period.
--wait secs
Pause for secs seconds between query requests. This is a simple form of rate limiting and is mainly useful to spread the requests over a longer period than Publish or Perish's built-in adaptive rate limiter would.
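
An argument file for the -f option can also be generated programmatically. The sketch below writes one --name=value option per line, as described for -f above; the helper name argfile_text, the file name search.args, and the UTF-8 encoding are assumptions made for illustration.

```python
import os
import tempfile

def argfile_text(options, comment=None):
    """Return argfile contents: an optional comment line, then one option per line."""
    lines = ["# " + comment] if comment else []
    # The equals sign between option name and value is required inside the file.
    lines += [f"--{name}={value}" for name, value in options]
    return "\n".join(lines) + "\n"

options = [
    ("author", '"a harzing" AND "d adams"'),  # no extra shell quoting needed here
    ("keywords", "publish perish"),
    ("years", "2000-2018"),
]

# Write to a temporary directory so the example leaves no files in the working directory.
path = os.path.join(tempfile.mkdtemp(), "search.args")
with open(path, "w", encoding="utf-8") as handle:
    handle.write(argfile_text(options, comment="Example search"))

with open(path, encoding="utf-8") as handle:
    print(handle.read(), end="")
```

The file would then be passed to the tool as pop7query -f followed by the file's path.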

Output options:

--sort [-]author|title|source|year|cites|cites_annual|cites_norm|rank
Sort the output in ascending or descending (if prefixed with '-') order according to the given field. The default results order is rank because that is simply the order in which the results are returned by the data source.
--format apa|bibtex|chicago|csiro|csv|harvard|isi|json|mla|ris|rtf|rtfshort|tsv|vancouver
Write the results in the given format. This overrides the default or implied (through the extension of outfile, if any) data format. For further processing the json format is recommended, because that contains the most complete information available.

Miscellaneous options:

--help
Print a syntax summary and exit the program with exit code 0.
--version
Print the program's version and exit the program with exit code 0.

Exit codes

The program returns one of the following exit codes:

0
Success: one or more matches for the query. If --info is specified, this exit code merely means "no errors".
1
An error occurred in the command line parameters or in the program execution generally.
2
Syntax error in the query parameters; this includes missing or empty query fields (but not unsupported fields -- those are simply ignored without generating an error code).
3
The query could not be executed, for example because the data source was unavailable.
4
The query returned no matches.
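
In automated scenarios a script will usually branch on the exit code. The following sketch maps the codes listed above to short descriptions; the helper names are hypothetical, and treating only exit code 3 as worth retrying is an assumption of this example, not a rule stated by the tool.

```python
EXIT_MEANINGS = {
    0: "success: one or more matches (with --info: no errors)",
    1: "error in the command line parameters or in program execution",
    2: "syntax error in the query parameters",
    3: "query could not be executed",
    4: "query returned no matches",
}

def describe_exit(code):
    """Return a short description for a pop7query exit code."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")

def worth_retrying(code):
    """Assumption: only an unavailable data source (3) may succeed on a later attempt."""
    return code == 3

for code in (0, 3, 4):
    print(code, describe_exit(code), worth_retrying(code))
```

Note that exit code 4 indicates an empty result rather than a failure, so a script may want to handle it separately from codes 1 to 3.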

Support Publish or Perish

The development of the Publish or Perish software is a volunteer effort that has been ongoing since 2006. Download and use of Publish or Perish is and will remain free (gratis), but your support toward the costs of hosting, bandwidth, and software development is appreciated. Your support helps further development of Publish or Perish for new data sources and additional features.