Towards the long-overdue open graph of citations
April 7, 2017
It’s baffled me for years that there is no open graph of scholarly citations — a set of machine-readable statements that (for example) Taylor et al. 2009 cites Stevens and Parrish 1999, which cites Alexander 1985 and Hatcher 1901.
With such a graph, you would be able to answer questions like “What subsequent publications have cited my 2005 paper but not my 2007 paper?” and of course “Has paper X been rebutted in print, or do I need to do it?”
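To make that concrete, here is a minimal sketch of how the first of those questions could be answered once the citations exist as machine-readable statements. It assumes the graph is available as simple (citing, cited) pairs; the function names, the "Paper A"/"Paper B" entries and the "My 2005 paper"/"My 2007 paper" labels are all made up for illustration, and a real service would use identifiers such as DOIs rather than author-year strings.

```python
from collections import defaultdict


def build_cited_by(citations):
    """Invert (citing, cited) pairs into a cited -> {citing works} index."""
    cited_by = defaultdict(set)
    for citing, cited in citations:
        cited_by[cited].add(citing)
    return cited_by


def cites_one_but_not_other(cited_by, wanted, unwanted):
    """Return the works that cite `wanted` but do not cite `unwanted`."""
    return cited_by.get(wanted, set()) - cited_by.get(unwanted, set())


# The first three statements come from the example in the text above;
# the rest are hypothetical, invented purely to exercise the query.
citations = [
    ("Taylor et al. 2009", "Stevens and Parrish 1999"),
    ("Stevens and Parrish 1999", "Alexander 1985"),
    ("Stevens and Parrish 1999", "Hatcher 1901"),
    ("Paper A", "My 2005 paper"),
    ("Paper A", "My 2007 paper"),
    ("Paper B", "My 2005 paper"),
]

cited_by = build_cited_by(citations)
missed = cites_one_but_not_other(cited_by, "My 2005 paper", "My 2007 paper")
print(sorted(missed))
```

The query itself is just a set difference over the inverted index. The hard part is not the code but getting the citation statements into an open, machine-readable form in the first place.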
At a more basic level, it’s ridiculous that every one of us maintains our own citation database for our own work. It makes no sense that there isn’t a single, global, universally accessible citation database which all of us can draw from for our bibliographies.
Today we welcome the Initiative for Open Citations (I4OC), which is going to fix that. I’m delighted that someone is stepping up to the plate. An open citation graph has been a critical missing piece of scholarly infrastructure.
As far as I can see, I4OC is starting out by encouraging publishers to sign up for CrossRef’s existing Cited-by service. This is a great way to capture citation information going forward; but I hope they also have plans for back-filling the last few centuries’ citations. There are a lot of ways this could be done, but one would be crowdsourcing contributions. They have good people involved, so I’m optimistic that they’ll get on this.
By the way, this kind of thing — machine-readable data — is one area where preprints genuinely lose out compared to publisher-mediated versions of articles. Publishers on the whole don’t do nearly enough to earn their very high fees, but one very real contribution they do make is the process that is still, for historical reasons, known as “typesetting” — transforming a human-readable manuscript into a machine-readable one from which useful data can be extracted. I wonder whether preprint repositories of the future will have ways to match this function?