Preferences - Queries
This dialog box appears when you choose the Tools > Preferences command from the main menu. It allows you to edit a number of settings that affect the way Publish or Perish deals with queries. This dialog box contains the following fields and options.
General
This box contains general options relating to the way Publish or Perish issues queries to the query server (for example, Google Scholar).
Option | Description |
---|---|
Keep cached results for <n> days |
Enter the number of days to keep cached query results. The longer this period, the fewer accesses are required to satisfy repeated queries. However, updates to the query results only become visible after the cache period has expired, so avoid making this period too long. |
Clear the cache | Click this button to clear the entire results cache. This forces subsequent queries to access Google Scholar directly, which might be useful after a (suspected) update on Google Scholar, or if you have reason to believe that the cached results are somehow invalid. |
User-Agent string | Set the HTTP User-Agent string that Publish or Perish uses to identify itself to Google Scholar and other query servers. The User-Agent header that a client sends to an HTTP server identifies the client program and its version. In practice, the string often also contains more or less detailed information about the operating system that the client is running on. A typical User-Agent string might therefore look like: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0) For ease of configuration, the User-Agent string field may contain zero or more of the following:
The replaceable field placeholders are expanded when Publish or Perish starts and include the following:
Notes
|
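The effect of the cache period can be illustrated with a minimal sketch. This is not how Publish or Perish is implemented; the class name `ResultCache`, its methods, and the injectable `clock` parameter are all hypothetical and serve only to show how a maximum age in days decides whether a repeated query is answered from the cache or has to go back to the query server.

```python
import time


class ResultCache:
    """Hypothetical results cache keyed by query text (illustration only)."""

    SECONDS_PER_DAY = 86400

    def __init__(self, max_age_days, clock=time.time):
        # max_age_days plays the role of "Keep cached results for <n> days".
        self.max_age_seconds = max_age_days * self.SECONDS_PER_DAY
        self.clock = clock      # injectable so the sketch can be tested
        self._entries = {}      # query -> (stored_at, results)

    def put(self, query, results):
        """Store results together with the time they were fetched."""
        self._entries[query] = (self.clock(), results)

    def get(self, query):
        """Return cached results, or None if absent or past the cache period."""
        entry = self._entries.get(query)
        if entry is None:
            return None
        stored_at, results = entry
        if self.clock() - stored_at > self.max_age_seconds:
            # Expired: drop the entry so the caller fetches fresh results.
            del self._entries[query]
            return None
        return results

    def clear(self):
        """Counterpart of the "Clear the cache" button: forget everything."""
        self._entries.clear()
```

A `None` result from `get` stands for the case where the request must be sent to the query server again, which is why a long cache period hides updates and a short one causes more accesses.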
Query aging
This box contains options that determine how Publish or Perish ages previously executed queries, as follows.
- When you execute a query, whether a new one or a repeat of an earlier one, the query is stored in the Recent queries folder of the Multi-query center.
- Queries in the Recent queries folder older than a preset number of days are automatically migrated to the Older queries folder.
- Queries in the Older queries folder older than a second preset number of days are automatically migrated to the Trash folder.
- Finally, queries in the Trash folder older than a third preset number of days are automatically deleted.
If at any point you re-execute an earlier query, it is moved back to the Recent queries folder and its age is reset to zero.
The aging of queries only applies to queries that reside in the Recent queries, Older queries, or Trash folders. Queries that reside in other folders of the Multi-query center are not affected by the aging policies.
Option | Description |
---|---|
Maximum age for Recent queries <n> days |
Enter the maximum age for "recent" queries. When a query is older than this number of days, it is moved automatically to the Older queries folder. |
Maximum age for Older queries <n> days |
Enter the maximum age for "older" queries. When a query is older than this number of days, it is moved automatically to the Trash folder. |
Delete Trash queries if older than <n> days |
Enter the number of days after which queries should be deleted from the Trash folder. Tip: to avoid automatic deletion of queries, set this to a high number of days, for example 9999. |
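The aging rules can be summarized in a short sketch. It assumes that a query has a single age, counted in days since it was last executed, and that this one age is compared against each folder's limit in turn; the function name `next_folder` and the example limits of 7, 30, and 90 days are hypothetical and are not the program's defaults.

```python
RECENT = "Recent queries"
OLDER = "Older queries"
TRASH = "Trash"


def next_folder(folder, age_days, max_recent, max_older, max_trash):
    """Return the folder a query belongs in after aging, or None if deleted.

    Assumption: age_days counts from the last execution of the query, and
    re-executing a query moves it back to RECENT with age_days reset to 0.
    """
    if folder == RECENT and age_days > max_recent:
        folder = OLDER
    if folder == OLDER and age_days > max_older:
        folder = TRASH
    if folder == TRASH and age_days > max_trash:
        return None  # deleted from the Trash folder
    # Queries in any other folder are not affected by aging.
    return folder
```

For example, with the hypothetical limits above, `next_folder(RECENT, 10, 7, 30, 90)` yields the Older queries folder, while a query in a user-created folder stays where it is regardless of its age.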
Request rate limiter
This box contains options that determine how Publish or Perish limits the rate at which requests are sent to Google Scholar.
Requests are related to queries but are not the same:
- Each query translates to one or more requests for results that are potentially sent to Google Scholar. At the time of writing (March 2013), each request returns up to 20 results, so a single query that returns, say, 150 results in all requires 8 individual requests (7 x 20 full + 1 x 10 partial).
- A request may be satisfied from Publish or Perish's own cache, in which case the request is not sent to Google Scholar (unless you use Lookup Direct).
- If you send too many requests to Google Scholar or if the requests follow each other too quickly, Google Scholar may block further requests.
The request rate limiter options help you to keep the number of requests sent to Google Scholar down to acceptable levels.
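The relationship between results and requests is a simple ceiling division, sketched below. The function name `requests_needed` is hypothetical; the page size of 20 results per request is the figure quoted above for March 2013 and may differ today. The sketch also assumes that even a query with no results costs one request.

```python
import math


def requests_needed(total_results, results_per_request=20):
    """Number of requests required to retrieve all results of one query."""
    # Assumption: every query needs at least one request, even with 0 results.
    return max(1, math.ceil(total_results / results_per_request))


print(requests_needed(150))  # 8 requests: 7 full pages of 20 plus 1 page of 10
```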
Option | Description |
---|---|
Maximum request rate <n> requests/minute |
Enter the maximum number of requests per minute that Publish or Perish should send to Google Scholar. This is a short-term limit and only takes into account the request rate over the past 60 seconds. If this limit is exceeded, then Publish or Perish will delay sending the next request until the request rate has fallen below the maximum that is set here. |
Use adaptive request rate | Check this box to slow down the request rate when Publish or Perish detects that the request rate approaches certain preset limits; clear this box to keep sending requests at the maximum rate. We recommend that you leave this option checked. |
Respond to CAPTCHAs |
Check this box to display a CAPTCHA dialog box when Google Scholar requests verification of your human status. If you solve the CAPTCHA correctly, then Google Scholar allows further queries. We recommend that you leave this option checked. Note: For the CAPTCHA handling to be functional, you must allow first-party cookies in your Internet Explorer settings, or at least session cookies. You can set the Internet Explorer cookies preferences by choosing the Tools > Internet Options command from the main menu in Publish or Perish, then clicking on the Privacy tab in the Internet Properties dialog box that appears. |
Show request rate warnings | Check this box to display warnings when Publish or Perish detects that the request rate approaches certain preset limits; clear this box to suppress those warnings. We recommend that you leave this option checked. |
Show Yellow warning if rate exceeds <n> requests/hour |
Enter the threshold for "Yellow" request rate warnings. This is a medium-term limit and takes into account the request rate over the past hour. The actual limit is an empirical value; we recommend setting this option to 120 or less. This option is used for two purposes:
|
Show Red warning if rate exceeds <n> requests/hour |
Enter the threshold for "Red" request rate warnings. This is a medium-term limit and takes into account the request rate over the past hour. The actual limit is an empirical value; we recommend setting this option to 150 or less. This option is used for two purposes:
|
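The interplay between the short-term limit (requests over the past 60 seconds) and the medium-term warning thresholds (requests over the past hour) can be sketched as follows. This is only an illustration under stated assumptions, not the program's actual algorithm: the class `RequestRateMonitor` and its method names are hypothetical, timestamps are plain seconds supplied by the caller, and the default thresholds of 120 and 150 are the upper recommendations given in the table above. The adaptive request rate option is not modeled.

```python
from collections import deque


class RequestRateMonitor:
    """Hypothetical sliding-window monitor for outgoing requests."""

    def __init__(self, max_per_minute, yellow_per_hour=120, red_per_hour=150):
        self.max_per_minute = max_per_minute
        self.yellow_per_hour = yellow_per_hour
        self.red_per_hour = red_per_hour
        self._sent = deque()  # timestamps (seconds) of requests in the past hour

    def _prune(self, now):
        # Forget requests that are an hour old or older.
        while self._sent and now - self._sent[0] >= 3600:
            self._sent.popleft()

    def record(self, now):
        """Register that a request was sent at time `now`."""
        self._prune(now)
        self._sent.append(now)

    def delay_before_next(self, now):
        """Seconds to wait until the 60-second rate is below the maximum."""
        self._prune(now)
        recent = [t for t in self._sent if now - t < 60]
        if len(recent) < self.max_per_minute:
            return 0.0
        # Wait until enough of the oldest requests have left the window.
        blocking = recent[len(recent) - self.max_per_minute]
        return blocking + 60 - now

    def warning_level(self, now):
        """Return "red", "yellow", or "none" based on the past hour."""
        self._prune(now)
        count = len(self._sent)
        if count > self.red_per_hour:
            return "red"
        if count > self.yellow_per_hour:
            return "yellow"
        return "none"
```

With `max_per_minute=2` and requests recorded at 0 and 10 seconds, `delay_before_next(20)` returns 40, because the request sent at 0 seconds only leaves the 60-second window at 60 seconds.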