Publish or Perish command line tools
Overview of the command line version of Publish or Perish
For advanced users who are comfortable with the command line, we provide an edition of the Publish or Perish software that can be used in manual and automated scenarios to perform publication searches.
The use of this software edition requires that you are familiar with command line use on the platform of your choice, in particular with the applicable rules for the quoting of meta characters and other text that has special meaning to the command shell. (If that doesn't mean anything to you, then please use the GUI version of Publish or Perish.)
Note: In contrast to the GUI edition, the command line tool does not have the ability to display Google Scholar CAPTCHAs. If you perform too many queries and Google Scholar requires you to solve a CAPTCHA, the command line search will simply terminate. If that happens, start the GUI edition of Publish or Perish and perform a Google Scholar search to trigger and solve the CAPTCHA, then exit the GUI edition and retry the command line search.
License agreement
Publish or Perish is provided courtesy of Harzing.com. It is free for personal non-profit use; please refer to the End User License Agreement for the full licensing terms and conditions.
How to cite the Publish or Perish software
If you are using the Publish or Perish software in one of your research articles or otherwise want to refer to it, please use the following format:
Harzing, A.W. (2007) Publish or Perish, available from https://harzing.com/resources/publish-or-perish
Download information
The Publish or Perish command line tool is distributed as a compressed archive (.zip or .gz, depending on the platform) from the Harzing.com web site.
For a list of changes in each release, see "Changes in this version" in the documentation.
Platform | Download link | Version |
---|---|---|
Windows (7 and later) | Publish or Perish command line tools | 8.8.4384 (6 May 2023) |
macOS (10.13 and later) | Publish or Perish command line tools | 8.8.4384 (6 May 2023) |
Ubuntu Linux (23.04, x86_64) | Publish or Perish command line tools (requires cURL to be installed on your system) | 8.8.4394 (10 May 2023) |
Installation
The command line tool does not require any installation beyond extraction from the distribution archive, but see the Configuration section below.
Windows
(Version numbers may vary from the output shown below.)
C:\>unzip pop8tools.zip
Archive: pop8tools.zip
inflating: pop8error.exe
inflating: pop8metrics.exe
inflating: pop8query.exe
C:\>pop8query --info
Publish or Perish publication search utility 8.2.3876.8069 (2022.02.02.0902) WinPosix (x86)
(c) 1990-2022 Tarma Software Research Ltd
Built for WinPosix (x86) using Microsoft C/C++ (140050727)
Running on WinPosix 10.0.19043 (x86) 19041.1.amd64fre.vb_release.191206-1406
App data key: Publish or Perish
App data dir: C:\Users\David\AppData\Roaming\Publish or Perish
Roaming settings: C:\Users\David\AppData\Roaming\Publish or Perish\settings.json
Local settings: HKCU\Software\Tarma Software Research\Publish or Perish
Root path: C:\Users\David\AppData\Roaming\Publish or Perish
Results path: C:\Users\David\AppData\Roaming\Publish or Perish\Results6
Temp path: C:\Users\David\AppData\Roaming\Publish or Perish\Temp
Trash path: C:\Users\David\AppData\Roaming\Publish or Perish\Trash
Log path: E:\Temp\pop8query.log
macOS
Note: Due to stricter notarization requirements, the command line tools for macOS are now distributed as a standard macOS installer package called pop8mactools.pkg. The command line tools will be installed to the /usr/local/bin folder, which is normally part of your local ${PATH}.
(Version numbers may vary from the output shown below.)
% open pop8mactools.pkg
...follow the installer's instructions...
% pop8query --info
Publish or Perish publication search utility 8.2.3877.8069 (2022.02.02.1035) MacOS (arm64)
(c) 1990-2022 Tarma Software Research Ltd
Built for MacOS (arm64) using clang C/C++ (Apple LLVM 13.0.0 (clang-1300.0.29.30))
Running on Darwin 21.3.0 (arm64) Darwin Kernel Version 21.3.0: Wed Jan 5 21:37:58 PST 2022; root:xnu-8019.80.24~20/RELEASE_ARM64_T8101
App data key: Publish or Perish
App data dir: /Users/dave/Library/Application Support/Publish or Perish
Roaming settings: /Users/dave/Library/Application Support/Publish or Perish/settings.json
Local settings: /Users/dave/Library/Preferences/Publish or Perish.json
Root path: /Users/dave/Library/Application Support/Publish or Perish
Results path: /Users/dave/Library/Application Support/Publish or Perish/Results6
Temp path: /Users/dave/Library/Application Support/Publish or Perish/Temp
Trash path: /Users/dave/Library/Application Support/Publish or Perish/Trash
Log path: /Users/dave/Library/Logs/pop8query.log
Linux
(Version numbers may vary from the output shown below.)
$ tar xzf pop8tools_linux_subtype.tar.gz
$ ./pop8query --info
Publish or Perish publication search utility 8.2.3876.8069 (2022.02.02.0903) Linux (x86_64)
(c) 1990-2022 Tarma Software Research Ltd
Built for Linux (x86_64) using GNU C/C++ (11.2.0)
Running on Linux 5.13.0-28-generic (x86_64) #31-Ubuntu SMP Thu Jan 13 17:41:06 UTC 2022
App data key: .publish_or_perish
App data dir: /home/dave/.publish_or_perish
Roaming settings: /home/dave/.publish_or_perish/settings.json
Local settings: Dummy settings; not persistent
Root path: /home/dave/.publish_or_perish
Results path: /home/dave/.publish_or_perish/Results6
Temp path: /home/dave/.publish_or_perish/Temp
Trash path: /home/dave/.publish_or_perish/Trash
Log path: /home/dave/.logs/pop8query.log
Configuration
Some data sources, for example Microsoft Academic and Scopus, require an API key for their use. The command line edition of Publish or Perish uses the same configuration settings as the GUI edition, so we recommend that you use the GUI version of Publish or Perish to set up the desired API keys before using the command line edition.
If there is no GUI edition of Publish or Perish for your platform, then you can copy the settings.json file from another platform; the configuration settings are platform-independent.
Use the pop8query --info command to see where the settings.json file is stored for the command line tool (and the GUI edition, if any).
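As a minimal sketch, the settings.json location can be picked out of the --info output programmatically. The label names below are copied from the sample output in the Installation section. That --info prints to stdout when no outfile is given is an assumption, and the function names are hypothetical.

```python
import subprocess

# Labels copied from the sample `--info` output in the Installation section.
PATH_LABELS = (
    "App data dir",
    "Roaming settings",
    "Local settings",
    "Root path",
    "Results path",
    "Temp path",
    "Trash path",
    "Log path",
)


def parse_info(text):
    """Return a dict of the 'Label: value' path lines found in --info output."""
    paths = {}
    for line in text.splitlines():
        # Split on the first ": " only, so Windows drive letters (C:\...) survive.
        label, sep, value = line.partition(": ")
        if sep and label.strip() in PATH_LABELS:
            paths[label.strip()] = value.strip()
    return paths


def settings_path(executable="pop8query"):
    """Run `pop8query --info` and return the reported settings.json location.

    Assumption: the information is written to stdout when no outfile is given.
    """
    result = subprocess.run(
        [executable, "--info"], capture_output=True, text=True, check=True
    )
    return parse_info(result.stdout).get("Roaming settings")
```

Calling settings_path() requires pop8query to be on your PATH; parse_info() can be used on its own with output that you captured earlier.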
Syntax
- pop8query --help
- pop8query --version
- pop8query --info [--all|datasource] [outfile]
- pop8query [datasource] queryfields [other options] [outfile]
- --help
- Print a syntax summary, then exit with exit code 0; no search is performed.
- --version
- Print the program's version, then exit with exit code 0; no search is performed.
- --info
- Print configuration information, then exit with exit code 0; no search is performed. By default only version and path information are included; to see data source settings, use --all for all data sources or specify a data source (see next) to see information about that data source only.
Without --help, --version, or --info a publication search is performed. By default Google Scholar is used as the data source, but you can specify another data source through the following data source options:
- --crossref
- Crossref search
- --gsauthor
- Google Scholar Profile author search
- --gscholar
- Google Scholar search
- --gsciting
- Google Scholar citing references search
- --gsprofile
- Google Scholar Profile retrieval
- --masv2
- Microsoft Academic v2 search
- --openalex
- OpenAlex search
- --pubmed
- PubMed search
- --scopus
- Scopus search
- --semscholar
- Semantic Scholar search
- --wos
- Web of Science search
After the publication search the results are optionally sorted, then written to outfile if specified, else to stdout. Any diagnostic information and all progress information is written to stderr, so you can redirect them separately if desired.
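Because results and diagnostics go to different streams, an automated caller can collect them separately. The following is a sketch only: run_tool is a hypothetical helper name, and it uses nothing beyond Python's standard subprocess module.

```python
import subprocess


def run_tool(args, executable="pop8query"):
    """Run the tool and return (exit_code, stdout_text, stderr_text)."""
    completed = subprocess.run(
        [executable, *args],
        capture_output=True,  # collect stdout and stderr in separate buffers
        text=True,
    )
    return completed.returncode, completed.stdout, completed.stderr
```

For example, `code, data, log = run_tool(["--gscholar", "--author=a harzing", "--format=json"])` would leave the search results in data and the progress and diagnostic messages in log.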
The data output format is the first of:
- the format specified by the --format option (see below)
- the extension of outfile, if any:
  - .csv CSV (comma-separated values)
  - .json JSON (JavaScript Object Notation)
  - .jsonl JSON Lines (JSON with 1 whole record per line)
  - .rtf RTF (full search report as Rich Text)
  - .tsv TSV (tab-separated values)
- the default, which is currently CSV output.
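The precedence rule above can be mirrored in a few lines, which may help when a script needs to predict the format that a given invocation will produce. This is an illustration of the documented rule, not the tool's own code; resolve_format is a hypothetical name, and matching only lowercase extensions is an assumption.

```python
import os

# Extensions and formats as listed in the documentation above.
EXTENSION_FORMATS = {
    ".csv": "csv",
    ".json": "json",
    ".jsonl": "jsonl",
    ".rtf": "rtf",
    ".tsv": "tsv",
}


def resolve_format(format_option=None, outfile=None):
    """Return the output format: --format first, then extension, then CSV."""
    if format_option:
        return format_option
    if outfile:
        # Assumption: only exact lowercase extensions are recognized here;
        # how the real tool treats other casing is not documented.
        extension = os.path.splitext(outfile)[1]
        if extension in EXTENSION_FORMATS:
            return EXTENSION_FORMATS[extension]
    return "csv"  # the documented current default
```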
For publication searches, the following query field options are available. Please note that:
- Not all data sources support all query fields. Unsupported fields are reported, but otherwise ignored.
- The per-field syntax is similar to the one used in the GUI edition and may include "quotes" and boolean operators such as AND and OR -- but see the next point.
- If entered on a shell command line (as opposed to a programmatically constructed command line or an argument file used with the -f option, below) you may have to use additional quoting to prevent the shell from eating or misinterpreting "quotes" or other special characters. The details depend on the shell in question, but for most shells enclosing the entire field expression (but not the --option itself) in an additional set of 'single' quotes will be sufficient.
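When a command line is constructed programmatically, Python's standard shlex module can add the extra quoting for POSIX shells. The sketch below covers POSIX-style shells only; the Windows command prompt follows different quoting rules. The helper name posix_command_line is hypothetical.

```python
import shlex


def posix_command_line(argv):
    """Join an argument list into one line that is safe to paste into a POSIX shell."""
    return shlex.join(argv)  # quotes each argument only where needed


argv = [
    "pop8query",
    "--gscholar",
    "--author", '"a harzing" AND "d adams"',
    "--years", "2000-2018",
]
print(posix_command_line(argv))
# pop8query --gscholar --author '"a harzing" AND "d adams"' --years 2000-2018
```

If you start the tool directly from a script with an argument list (for example with subprocess.run), no shell is involved and this additional quoting is not needed.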
Query fields:
- --affiliation affiliation
- Specify the author affiliation(s) for which to search. This is not widely supported.
- --author authorspec
- Specify the author(s) for which to search.
- --citedid identifier
- Specify the data source's document ID for the document for which you want to retrieve the citing references. This is currently only supported for --gsciting.
- --field fieldofstudy
- Specify the field(s) of study for which to search. This is not widely supported and the field names are often idiosyncratic.
- --issn issn
- Specify the ISSN for which to search.
- --journal journalname
- Specify the journal(s) for which to search.
- --keywords words
- Specify the keyword(s) for which to search; these may occur in the abstract, full text, or even (also) in the title; the details depend on the data source.
- --raw syntax
- Use syntax verbatim for the query; if this option is used, all other query fields are ignored. The syntax must fit the data source in question.
- --title words
- Specify the title word(s) for which to search.
- --years from-to
- Specify the first and last years of the publications in which you are interested.
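For automated use, the query fields listed above can be assembled from a dictionary. The following sketch uses the --option=value form and rejects field names that are not in the list; build_query_args is a hypothetical helper, not part of the tool.

```python
# Query field names as listed in the documentation above.
QUERY_FIELDS = (
    "affiliation", "author", "citedid", "field", "issn",
    "journal", "keywords", "raw", "title", "years",
)


def build_query_args(fields, datasource=None):
    """Turn {"author": "...", "years": "..."} into a list of --option=value arguments."""
    unknown = sorted(set(fields) - set(QUERY_FIELDS))
    if unknown:
        raise ValueError("unknown query field(s): " + ", ".join(unknown))
    args = []
    if datasource:
        args.append("--" + datasource)  # for example "crossref" or "gscholar"
    if "raw" in fields:
        # --raw makes the tool ignore all other query fields, so pass it alone.
        return args + ["--raw=" + fields["raw"]]
    for name in QUERY_FIELDS:  # fixed order keeps the result reproducible
        if name in fields:
            args.append("--{}={}".format(name, fields[name]))
    return args
```

The resulting list can be appended to the program name and passed to a process launcher without any shell quoting.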
Executive options:
- --direct
- Bypass the Publish or Perish cache and submit the query directly to the data source, even if the query would otherwise be satisfied from the cache.
- --dryrun
- Perform all actions except submitting the query to the data source. This is useful to check that the configuration is correct.
- -f argfile
- Read additional options from argfile (one option per line) and process them at that point of the command line. This allows more complicated options to be specified without additional quoting and also allows annotation of options with comments (comment lines start with a # character). Note that you must use an equals sign '=' between the option name and its value when specifying options in the file (this is also allowed, but not required, on the command line). Example file contents:

    # This is a comment
    --author="a harzing" AND "d adams"
    --keywords=publish perish
    --years=2000-2018

- --max number
- Retrieve no more than number results, even if more would be available.
- Note: Because of the granularity of the request size (typically 10, 20, or 100, depending on the data source), the final number of results may be rounded up to the next multiple of the request size.
- --maxage hours
- Submit the query directly to the data source if the query's cached data are older than hours hours. This overrides the builtin cache period.
- --syntax
- Create an appropriate query, then print the resulting query syntax as it would be sent to the data source. No query is actually submitted.
- Note: Some data sources treat year ranges as a separate filtering option instead of a query field. The value of the --years option may therefore be missing from the query syntax; this does not mean that it was somehow "forgotten". It just means that it is a non-syntax parameter for the data source in question.
- --wait secs
- Pause for secs seconds between query requests. This is a simple form of rate limiting and is mainly useful to spread the requests over a longer period than Publish or Perish's builtin adaptive rate limiter would do.
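An argument file for the -f option can also be generated by a script, which avoids shell quoting altogether. The sketch below renders one option per line with the required equals sign; render_argfile and the file name query.args are hypothetical.

```python
def render_argfile(options, comment=None):
    """Render options as argfile text: one --name=value per line.

    A value of None produces a bare flag such as --direct.
    """
    lines = []
    if comment:
        lines.append("# " + comment)  # comment lines start with '#'
    for name, value in options.items():
        if value is None:
            lines.append("--" + name)
        else:
            lines.append("--{}={}".format(name, value))
    return "\n".join(lines) + "\n"


text = render_argfile(
    {
        "author": '"a harzing" AND "d adams"',
        "keywords": "publish perish",
        "years": "2000-2018",
    },
    comment="This is a comment",
)
print(text, end="")
```

After writing the text to a file, for example query.args, the search would be started with something like `pop8query -f query.args`.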
Output options:
- --append
- Append to 'outfile' instead of overwriting. This is useful to collate the output of several separate batch searches into a single file, but is less appropriate for some forms of output, such as rich text (RTF) search reports, that each contain their own document headers and trailers.
- --format apa|bibtex|chicago|csiro|csv|harvard|isi|json|jsonl|mla|ris|rtf|rtfshort|tsv|vancouver
- Write the results in the given format. This overrides the default or implied (through the extension of outfile, if any) data format. For further processing the json format is recommended, because that contains the most complete information available.
- --sort [-]author|title|source|year|cites|cites_annual|cites_norm|rank
- Sort the output in ascending or descending (if prefixed with '-') order according to the given field. The default order is rank, which is simply the order in which the data source returns the results.
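For further processing, the jsonl format is convenient because each line holds one complete record. The sketch below reads such a file without assuming any particular field names, since the record layout is not described here. UTF-8 encoding is an assumption, and read_jsonl is a hypothetical helper.

```python
import json


def read_jsonl(path):
    """Read a JSON Lines file and return its records as a list."""
    records = []
    # Assumption: the file is UTF-8 encoded.
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines, e.g. between appended batches
                records.append(json.loads(line))
    return records
```

Since every record stands on its own line, this format should also combine well with --append when collating several searches into one file.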
Exit codes
The program returns one of the following exit codes:
- 0
- Success: one or more matches for the query. If --help, --version, or --info is specified, this exit code merely means "no errors".
- 1
- An error occurred in the command line parameters or in the program execution generally.
- 2
- Syntax error in the query parameters; this includes missing or empty query fields (but not unsupported fields -- those are simply ignored without generating an error code).
- 3
- The query could not be executed, for example because the data source was unavailable.
- 4
- The query returned no matches.
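In scripts it can help to give these exit codes names. The following sketch mirrors the list above. The constant names are hypothetical, and the retry rule (retry only when the query could not be executed) is an assumption about what is sensible, not documented behavior.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as documented above; the names are illustrative."""
    SUCCESS = 0
    GENERAL_ERROR = 1
    QUERY_SYNTAX_ERROR = 2
    QUERY_NOT_EXECUTED = 3
    NO_MATCHES = 4


def describe(code):
    """Return the symbolic name for an exit code, or "UNKNOWN"."""
    try:
        return ExitCode(code).name
    except ValueError:
        return "UNKNOWN"


def should_retry(code):
    """Assumption: only 'query could not be executed' may be a transient failure."""
    return code == ExitCode.QUERY_NOT_EXECUTED
```

A batch script could, for instance, treat NO_MATCHES as a valid empty result while logging and stopping on GENERAL_ERROR or QUERY_SYNTAX_ERROR.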
Support Publish or Perish
The development of the Publish or Perish software is a volunteer effort that has been ongoing since 2006. Download and use of Publish or Perish are and will remain free (gratis), but your support toward the costs of hosting, bandwidth, and software development is appreciated. Your support helps further development of Publish or Perish for new data sources and additional features.
Copyright © 2023 David Adams. All rights reserved. Page last modified on Wed 10 May 2023 14:24
Web master of Harzing.com and developer of the Publish or Perish software, among other things. He holds BSc and MSc degrees in Electrical Engineering, a PhD in Operations Research, and likes to watch academic life from a safe distance.