Google Scholar: Slow searches

Publish or Perish tutorial

Many academics who have used Publish or Perish for a while have noticed that searches are now slower than they used to be. The slowdown is deliberate: Publish or Perish throttles its requests to avoid exceeding the maximum request rate that Google Scholar accepts.

Number of results per page: 20 (down from 100)

In February 2013 Google Scholar reduced the maximum number of results per page from 100 to 20 (and later to 10). Publish or Perish therefore has to retrieve up to 5 times as many result pages per query to show the full results (up to 10 times as many at 10 results per page).
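The arithmetic behind this is a simple ceiling division. The short Python sketch below is only an illustration: the function name `pages_needed` and the figure of 1000 results are assumptions chosen for the example, not values taken from Publish or Perish.

```python
def pages_needed(total_results: int, results_per_page: int) -> int:
    """Return how many result pages are needed to retrieve all results."""
    if results_per_page <= 0:
        raise ValueError("results_per_page must be positive")
    # Ceiling division using integers only: a partly filled last page
    # still costs one full page request.
    return -(-total_results // results_per_page)


# Illustrative query size (an assumption, not a Publish or Perish setting).
TOTAL_RESULTS = 1000

for per_page in (100, 20, 10):
    pages = pages_needed(TOTAL_RESULTS, per_page)
    print(f"{per_page} results per page: {pages} page requests")
```

Each page is one request to Google Scholar, so a smaller page size increases the request count in direct proportion.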

  • More page requests mean that Publish or Perish hits the maximum number of requests that Google Scholar allows per hour sooner.
  • If the number of page requests exceeds the maximum that Google Scholar allows, Google Scholar will temporarily block your IP address. This block can last 1-2 days.
  • To avoid hitting the maximum allowable request limit, Publish or Perish now uses an adaptive request rate limiter. This limits the number of requests that are sent to Google Scholar within a given period, both short-term (during the last 60 seconds) and medium-term (during the last hour).
  • To achieve the required reduction in requests, Publish or Perish delays subsequent requests by a variable amount of time (up to 1 minute). The higher the recent request rate, the longer the delay.
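The general idea of such a limiter can be sketched in a few lines of Python. This is not the Publish or Perish implementation, whose details are not published here. The class name `AdaptiveRateLimiter`, the budgets of 10 requests per minute and 200 per hour, and the linear delay curve are all assumptions made for illustration. Only the two windows (60 seconds and one hour) and the 1-minute maximum delay come from the description above.

```python
from collections import deque


class AdaptiveRateLimiter:
    """Illustrative sliding-window limiter; all budgets are hypothetical."""

    def __init__(self, per_minute=10, per_hour=200, max_delay=60.0):
        self.per_minute = per_minute  # assumed short-term budget
        self.per_hour = per_hour      # assumed medium-term budget
        self.max_delay = max_delay    # delays are capped at 1 minute
        self._sent = deque()          # timestamps of requests in the last hour

    def record(self, now):
        """Remember that a request was sent at time `now` (in seconds)."""
        self._sent.append(now)

    def delay_before_next(self, now):
        """Return how many seconds to wait before sending the next request."""
        # Forget requests that are older than one hour.
        while self._sent and now - self._sent[0] >= 3600:
            self._sent.popleft()
        # Fraction of each budget used: last 60 seconds and last hour.
        recent = sum(1 for t in self._sent if now - t < 60)
        short_load = recent / self.per_minute
        medium_load = len(self._sent) / self.per_hour
        load = max(short_load, medium_load)
        # Assumed curve: no delay up to half the budget, then a delay that
        # grows linearly and reaches max_delay when the budget is used up.
        if load <= 0.5:
            return 0.0
        return min(self.max_delay, (load - 0.5) / 0.5 * self.max_delay)


# Simulated run with a fake clock, so the example finishes instantly.
limiter = AdaptiveRateLimiter()
now = 0.0
for page in range(1, 13):
    delay = limiter.delay_before_next(now)
    now += delay          # pretend we waited for `delay` seconds
    limiter.record(now)
    print(f"page {page}: waited {delay:.1f}s, sent at t={now:.1f}s")
    now += 1.0            # pretend the page took one second to fetch
```

In this sketch the first few pages are fetched without waiting, and the delays grow as the recent request count approaches the assumed budget, which mirrors the behaviour described in the list above.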

Net result: queries take longer than before

The net result is that queries take longer than before. The alternative is being blocked by Google Scholar. We consider the relatively short delays during queries to be the lesser evil; hence the adaptive rate limiter.

No effect on occasional searches

If you perform queries with few results or only search occasionally, then the request rate limiter will have little or no effect on the query time. In this case, the required delays are short or non-existent, and Publish or Perish will retrieve result pages as fast as it did in the past.

Longer delays for queries with many results or issued in quick succession

However, if you perform queries that yield many results (several hundred or more) or issue a number of queries in quick succession, then the request rate limiter will insert progressively longer delays. This keeps the overall request rate within acceptable limits. If you want to avoid these delays, the best remedy is to spread your queries over the day.

Support Publish or Perish

The development of the Publish or Perish software is a volunteer effort that has been ongoing since 2006. Downloading and using Publish or Perish is and will remain free (gratis), but your support toward the costs of hosting, bandwidth, and software development is appreciated. Your support helps further development of Publish or Perish for new data sources and additional features.

Feedback

PS: If you are using Publish or Perish on a regular basis, please take 5 minutes to provide me with some feedback.