Web of Science: How to be robbed of 10 years of citations in one week!

Having studied Google Scholar and Microsoft Academic for many years, I would be the first to acknowledge these free databases have their fair share of accuracy problems. However, we shouldn't forget the Web of Science and Scopus are by no means perfect either, especially when it concerns non-traditional publications.

Over the years, my computer program Publish or Perish has accumulated quite a few citations. It is my most-cited work, not just in Google Scholar, but also in the Web of Science, narrowly beating my "When Knowledge Wins: Transcending The Sense and Nonsense of Academic Rankings" article. Obviously, in the Web of Science one can only find citations to Publish or Perish by using a "Cited Reference" search, as it is not a traditional journal publication and thus not included in the standard Web of Science database.

Over the years I have had to submit at least 50 data change reports for this publication, as ISI data entry typists enter a reference to the program as a separate publication even when the differences from the master record are tiny: the referring author using one instead of two initials for my name, for example, or referring to a specific release of the software (and there have been over 200 of them!).

However, after all that work I had a "neat" master record with 261 citations. Or so I thought... Three weeks ago I noticed to my considerable alarm that all citations to Publish or Perish were now attributed to Peter Jacso, who I can assure you had nothing to do with the program! As can be seen in the screenshot below, the program has suddenly acquired a volume, issue and page number as well, as if it is a journal article. 

And lo and behold, if we click on "show expanded titles", we find that ISI has attributed all citations to my computer program Publish or Perish to Jacso's article about Google Scholar. This is particularly hurtful as many of Jacso's opinions about Google Scholar voiced in this article are diametrically opposed to my own. Obviously, I immediately submitted another data change report requesting immediate reinstatement as the rightful "owner" of these citations and an apology for this egregious error.

Strangely enough the citations still seem to be linked to my name in some way as the above record shows up when you conduct the search shown below. It doesn't show up when you search for Jacso. But that's a small solace. I do hope I'll get the citations back at some stage, but I wouldn't be surprised if reinstating my ownership requires many more emails. Let's hope Clarivate, the Web of Science's new owner, proves me wrong.

Even so, don't assume that just because your university pays a hefty fee for your Web of Knowledge subscription, you can rely on it to be fully accurate! Although Google Scholar and Microsoft Academic are by no means fully accurate, they never attributed all my citations for a publication to someone else in their databases, let alone someone completely unrelated to the publication in question.

Note 1: When double-checking before posting this blog I noticed I have received my citations back. Well done Clarivate! I am still waiting for the apology though!

Note 2: Eight hours after posting this blog, I received a formal apology.

First I want to sincerely apologize for this error and any inconveniences this error may have caused you. We have identified the root cause of this problem and are working with our production teams to take measures to prevent errors such as this in the future. We have made the necessary corrections to this citation, which now appears corrected in Web of Science.
I understand your frustration and how upsetting it is to see errors like this appear on your citations. We hope to have a solution in place in the future that will also help with merging your citations under the original publication year. This may take some time, but we are working on measures to improve this process and address your requests to merge your citations the first time.
My sincere apologies again for any inconveniences this has cause [sic] you.

Update 21 July 2017

About four months after I wrote this, Clarivate started to attribute all citations to Publish or Perish to yet another publication, my white paper "Reflections on the h-index", published in 2008. Although this white paper has acquired nearly 20 citations (that are now "lost"), I didn't expect it to acquire nearly 300 additional ones in one week :-) Again though, the record does somehow still seem to be linked to Publish or Perish in some way, as this is the output when I search for harzing and publish*. So I submitted yet another data change report...

Update 27 July 2017

Another week, another round of data change reports to Clarivate. Two of them were innocent enough. Slight variations in e.g. page numbers, issue or author initials are enough for data entry operators to create separate records. I am used to that and submit data change reports for these virtually every week. However, the saga of the missing citations to the Publish or Perish software (2007) continues. This week, all its citations are attributed to the Publish or Perish Book, published in 2010. Well at least we are getting closer, but I wish they would stop messing with this record...
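To illustrate how small such variations are, here is a minimal Python sketch that groups reference strings differing only in initials, capitalisation, or punctuation. It is purely an illustration: the function names, the matching key, and the sample records are my own assumptions, and it says nothing about how the Web of Science actually unifies cited references.

```python
import re
from collections import defaultdict


def reference_key(author, title, year):
    """Build a crude matching key (hypothetical): surname, first initial,
    lowercased title without punctuation, and publication year."""
    surname, _, initials = author.partition(",")
    first_initial = re.sub(r"[^A-Za-z]", "", initials)[:1]
    clean_title = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
    return (surname.strip().lower(), first_initial.lower(), clean_title, year)


def group_variants(references):
    """Group reference records that share the same crude matching key."""
    groups = defaultdict(list)
    for ref in references:
        key = reference_key(ref["author"], ref["title"], ref["year"])
        groups[key].append(ref)
    return dict(groups)


# Made-up variants of the kind described above (illustrative data only).
references = [
    {"author": "Harzing, A.W.", "title": "Publish or Perish", "year": 2007},
    {"author": "Harzing, A", "title": "PUBLISH OR PERISH", "year": 2007},
    {"author": "Harzing, AW", "title": "Publish or perish.", "year": 2007},
]

groups = group_variants(references)
for key, variants in groups.items():
    print(key, len(variants))
```

With a key this forgiving, all three made-up variants land in one group. A real system would need to be far more careful, because a key that is too forgiving is exactly what merges unrelated publications.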

Update 2 August 2017

Hurray!! My change report seems to have done the trick. Citations are now attributed to the Publish or Perish software again. Phew... what a relief.

The expanded version even shows the full name and adds [computer software]. Neat!!! Can you now leave this record alone Clarivate? Pretty please? Just add the new citations to this record and I can stop adding new blunders :-)

Update 6 June 2018

Citations to the Publish or Perish software had been quietly accumulating for the last 10 months; the software had just broken the 400 cites barrier. So all was well in the wondrous world of citation analysis and I really thought we had put this problem to bed. Hence when yet another stray citation to Publish or Perish appeared this week, I was innocently doing my regular search to find the master record [see screenshot] and prepare a data change report.

I almost fell off my chair when I saw the result. Yes there were three stray cites, the 2017 one that had been added to the database this week and the 2016 and 1990 ones that had been added last week [and whose data change reports had not been processed yet].

But apart from that, all of the more than 400 cites to Publish or Perish were gone! Clarivate had ascribed them all to a 2015 article by Emilio Delgado Lopez-Cozar in the Spanish journal Relieve (Revista ELectrónica de Investigación y EValuación Educativa). Please note that the original article is in Spanish (see below), but the journal also publishes English versions of the articles.

Now I very much like Emilio and his team's work on Google Scholar. I have even written a very complimentary prologue to EC3's wonderful book on Google Scholar (see Sacrifice a little accuracy for a lot more comprehensive coverage) and we both presented on the Google Scholar day organised by Isidro Aguillo's Cybermetrics lab in Madrid (Publish or Perish: Realising Google Scholar's potential to democratise citation analysis).

So, I really didn't mind this mistake as much as what happened in February 2017 when Clarivate assigned all citations to the Publish or Perish software to Peter Jacso [see above], whose views on Google Scholar are diametrically opposed to my own. I would gladly give Emilio a nice present :-). However, Emilio would be the first to agree with me that accurate citation metrics are crucial.

So again Clarivate, pretty please, can you stop messing with this record and give me my citations back?

Surely this can't be true?

And if you don't believe me and think I must be making this up as it all sounds too weird to be true, here is some very brief evidence. First, well over half of the citations to Emilio's 2015 article apparently occurred before it was even published. Quite a feat! [The Publish or Perish software's first official launch was in 2007]

Second, Emilio's article has fifteen citations in Google Scholar for the original Spanish title, with another 2 citations for the English title. Not bad, but nowhere near the over 400 citations his article apparently acquired in the Web of Science. And although I know of many cases where Google Scholar citations are substantially higher than WoS citations, I have yet to come across the first case where WoS citations are substantially higher than Google Scholar citations, let alone 24 times as high.
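Both sanity checks are simple arithmetic and easy to reproduce for your own records. The Python sketch below is only an illustration: the function names are mine and the list of citing years is made up. Only the counts of about 400, 15 and 2 and the 2015 publication year come from the text above.

```python
def share_before_publication(citing_years, published_year):
    """Return the fraction of citing items dated before the cited work appeared."""
    if not citing_years:
        return 0.0
    early = sum(1 for year in citing_years if year < published_year)
    return early / len(citing_years)


def count_ratio(wos_count, scholar_count):
    """Return how many times higher the WoS count is than the Google Scholar count."""
    return wos_count / scholar_count


# Hypothetical citing years, for illustration only (not the real WoS data).
citing_years = [2008, 2010, 2012, 2013, 2014, 2016, 2017]
share = share_before_publication(citing_years, published_year=2015)

# Counts from the text: roughly 400 WoS cites versus 15 + 2 Google Scholar cites.
ratio = count_ratio(400, 15 + 2)

print(f"Share of citations predating publication: {share:.0%}")
print(f"WoS count is about {ratio:.1f} times the Google Scholar count")
```

A share above zero is already a warning sign, since nobody can cite an article before it exists, and a WoS count that dwarfs the Google Scholar count is just as suspicious.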

I also checked the last ten articles that purported to cite Emilio's paper. I did recognize their titles, as I had come across them in recent months as articles that were citing the Publish or Perish software. However, in every single one of these recent articles that [originally] cited Publish or Perish, the reference to Publish or Perish had been replaced by a reference to Emilio's article. That is, it has been replaced in the Clarivate database representation of the articles; the originally published articles obviously still refer to the PoP software. Here is just one example.

Clarivate references

Original article references

[Please note that Eagly's Social role theory article appeared on the next page as it was listed after the Role congruity theory article by the same author, as would be expected in a proper alphabetical listing of authors]

Clarivate references

Original article references

What the heck?

So what the heck is going on? What weird sort of gremlin in the Web of Science database or WoS matching algorithms could have caused this incredible mix-up? It is late, it has been a long day, and I have a full day of meetings tomorrow. So I give up for now, but if anyone finds out please let me know and I'll post it here!! For a Twitter exchange with some speculations, see here.

Update 14 June 2018

Clarivate did send me a tweet and email last week to let me know that they were working on fixing the problem above. When I logged in today to do my weekly check of new citations, I did notice that I had received my citations back (see screenshot). Unfortunately, the record *still* links to the DOI of Emilio's article. I also did not receive the requested explanation of why the mismatch happened in the first place, so I cannot tell you more about this. I will keep you posted if I hear any more from Clarivate.

Update 22 June 2018

Today I received a very carefully crafted email from Clarivate's Director of Content Management:

I wanted to reach out to you directly to provide an update regarding the citation unification issue to your Publish or Perish cited reference. As you are aware, the citations to this record re-clustered to another record in the Web of Science earlier this month. We determined that the root cause for this was a manual correction error.  We have now confirmed that all related citations to your record no longer contain any metadata from the Emilio Delgado Lopez-Cozar record, including the DOI. You should receive separate notification that your correction case has been resolved and is now closed.

Similar to your periodic requests to unify stray citations to your Publish or Perish cited reference, it appears a separate request requiring a manual override linked the Emilio Delgado Lopez-Cozar record to your Publish or Perish cited reference in error. Since the Emilio Delgado Lopez-Cozar record is a Web of Science source record, a subsequent update to that record triggered the updated version to go through the unification processes for linking cited references to source records.  With your Publish or Perish cited reference incorrectly linked to that indexed Web of Science record, his record became the default version displayed in the Cited Reference search, and the citations to Publish or Perish were pulled to that record. Re-processing updated records for unification and defaulting to the Web of Science source record as the primary cited reference is by design so that the most current metadata is used for improved unification and to support the ability to link cited references to source material we might add to coverage at a later time.

To avoid a reoccurrence of this issue, we are putting additional checks in place to track and verify future changes for this particular record, as well as reviewing the policies that cover the activity which introduced the error. I apologize for any inconvenience this may have caused as we appreciate your efforts to insure a complete and accurate citation count to the Publish or Perish software program in the Web of Science.

When checking the Web of Science I found that the record had indeed been corrected as promised. The dots between my initials have disappeared again, but let's not quibble with that. I'll settle for any representation of my complicated initials [see Bank error in your favour? How to gain 3,000 citations in a week] as long as I can keep my citations now :-)