In a recent blog-post, Kevin Smith tells it like it is: legacy publishers are tightening their grip in an attempt to control scholarly communications. “The same five or six major publishers who dominate the market for scholarly journals are engaged in a race to capture the terms of and platforms for scholarly sharing”, says Smith. “This is a serious threat to academic freedom.”

People can legitimately have different ideas about precisely what it is that Elsevier intends to do with SSRN, now that it’s acquired it. But as we discuss the possible outcomes, we need to keep one principle in mind: it’s simply unrealistic to imagine that Elsevier, in controlling Mendeley and SSRN, will do anything other than what is best for Elsevier.

That’s not a criticism, or even a complaint. It’s a statement of what a for-profit corporation does. It’s in its nature. There’s no need for us to blame Elsevier for this, any more than we blame a fox when it eats a chicken. That’s what it does.

The appropriate response is simply to prevent any more of this kind of thing happening, by taking control of our own scholarly infrastructure.

The big problem with SSRN is the same as the big problem with Mendeley: being privately owned and for-profit, their owners were always going to be susceptible to a good enough offer. People starting private companies are looking to make money from them, and a corporation that comes along with a big offer is a difficult exit strategy to resist. When we entrusted preprints to SSRN, they were always vulnerable to being taken hostage, in a way that arXiv preprints are not.

Again: I am not blaming private companies’ owners for this. It’s in the nature of what a private company is. I recognise that and accept it. The thing is, I interpret it as damage and want to route around it.

So what is the solution?


It’s simple. We, the community, need to own our own infrastructure.

On one level, this is easy. We, the community, know how to do it. We have experience of good and bad infrastructure, and we know the difference. We have excellent, clearly articulated principles for open scholarly infrastructure. We have top quality software engineers, interaction designers, UI experts and more.

What we don’t have is funding. And that is crippling.

We can’t build and maintain community-owned infrastructure without funding; and (to a first approximation anyway) no-one is funding it. It’s truly disgraceful that even such a crucial piece of infrastructure as arXiv is constantly struggling for funding. arXiv serves about a million articles per week, and is the primary source of publications in many scientific subfields, yet every year it struggles to bring in the less than a million dollars it costs to run. It’s ridiculous that the Gates Foundation or someone hasn’t come along with a few tens of millions of dollars and set up a long-term endowment to make arXiv secure.

And when even something as proven as arXiv struggles for funding, what chance does anything else have?

The problem seems to be this: funders have a blind spot when it comes to funding infrastructure. That’s why we have no UK national repository; it’s why there is no longer an independent subject repository for social sciences; it’s why the two main preprint archives for bio-medicine (PeerJ Preprints and BioRxiv) are privately owned, and potentially vulnerable to the offer-you-can’t-refuse from Elsevier or one of the other legacy publishers in the oligopoly(*).


When you think about funders — RCUK, Wellcome, NIH, Gates, all of them — they are great at funding research; and terrible at funding the infrastructure that allows it to have actual benefit. Most funders even seem to have specific policies that they won’t fund infrastructure; those that don’t, simply lack a way to apply for infrastructure funding. It’s a horribly short-sighted approach, and we’re seeing its inevitable fruit in Elsevier’s accumulation of infrastructure.

We’ll look back at funding bodies in 10 or 20 years and say their single biggest mistake was failing to see the need to fund infrastructure.

Please, funders. Fix this. Make whatever changes you need to make, to ensure that the scholarly community owns and controls its own preprint archives, subject repositories, aggregators, text-mining tools, citation graphs, metrics tools and what have you. We’ve already seen what happens when we cede control of the scholarly record to corporations: spiralling prices, poor quality product, arbitrary barriers, and the retardation of all progress. Let’s not make the same mistake again with infrastructure.



(*) Actually, I don’t believe PeerJ’s owners would sell their preprint server to Elsevier for any amount of money, and the same may be true of BioRxiv for all I know; I’ve never spoken with the owners. But who can tell what might happen?


A quick note to say that I got an email today — the University of Bristol Staff Bulletin — announcing some extremely welcome news:


(Admittedly it was only the third item on the bulletin, coming in just after “Staff Parking – application deadline Friday 18 September”, but you can’t have everything.)

This is excellent, and the nitty-gritty details are encouraging, too. Although HEFCE recently wound back its own policy, as a transition-period concession, to requiring deposit only at the time of publication, Bristol has quite properly gone with the more rigorous requirement that accepted manuscripts be deposited at the time of acceptance. This is wise for the university — it’s future-proofed against HEFCE’s eventual move back towards the deposit-on-acceptance policy that it wanted — and it’s good for the wider world, too.



You know what’s wrong with scholarly publishing?

Wait, scrub that question. We’ll be here all day. Let me cut straight to the chase and tell you the specific problem with scholarly publishing that I’m thinking of.

There’s nowhere to go to find all open-access papers, to download their metadata, to access it via an open API, to find out what’s new, to act as a platform for the development of new tools. Yes, there’s PubMed Central, but that’s only for work funded by the NIH. Yes, there’s Google Scholar, but that has no API, and at any moment could go the way of Google Wave and Google Reader when Google loses interest.

Instead, we have something like 4000 repositories out there, balkanised by institution, by geographical region, and by subject area. They have different UIs, different underlying data models, different APIs (if any). They’re built on different software platforms. It’s a jungle out there!


As researchers, we don’t need 4000 repos. You know what we need? One Repo.

Hey! That would be a good name for a project!

I’ve mentioned before how awesome and pro-open my employers, Index Data, are. (For those who are not regular readers, I’m a palaeontologist only in my spare time. By day, I’m a software engineer.) Now we’re working on an index of green/gold OA publishing. Metadata of every article across every repository and publisher. We want it to be complete, in the sense that we will be going aggressively for the long tail as opposed to focusing on some region or speciality, or things that are easily harvestable by OAI-PMH or other standards. We want it to be of a high, consistent quality in terms of metadata. We want it to be up to date. And most importantly, we want it to be fully open for all and any kind of re-use, by any other actor. This will include downloadable data files, OAI-PMH access, search-retrieve web services, embeddable widgets and more. We also envisage a Linked Data representation with a CRUD interface that allows third parties to contribute supplemental information, entity reconciliation, tagging, etc.

Instead of 4000 fragments, one big, meaty chunk of data.
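
To make the aggregation idea a little more concrete, here is a minimal sketch in Python of folding records from differently shaped repositories into a single index. Everything in it is an assumption made for illustration: the field names (`title`, `dc:title`, `doi`, `dc:identifier`), the repository names, the example DOI and the `merge_harvests` function are all hypothetical, and this is not a description of how The One Repo is actually built.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Hypothetical prefixes that different repositories might put in front of a DOI.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


@dataclass
class Record:
    """One article in the merged index, with the repositories it was seen in."""
    title: str
    doi: Optional[str]
    sources: Set[str] = field(default_factory=set)


def normalise_doi(raw: Optional[str]) -> Optional[str]:
    """Lower-case a DOI and strip a leading URL or 'doi:' prefix."""
    if not raw:
        return None
    doi = raw.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi or None


def merge_harvests(harvests: Dict[str, List[dict]]) -> Dict[str, Record]:
    """Merge per-repository record lists into one index.

    Records are keyed by normalised DOI where one exists, and otherwise
    by a whitespace-collapsed, lower-cased title.
    """
    index: Dict[str, Record] = {}
    for repo, items in harvests.items():
        for item in items:
            # Assumed: each repository uses one of two spellings per field.
            title = (item.get("title") or item.get("dc:title") or "").strip()
            doi = normalise_doi(item.get("doi") or item.get("dc:identifier"))
            key = doi or "title:" + " ".join(title.lower().split())
            record = index.setdefault(key, Record(title=title, doi=doi))
            record.sources.add(repo)
    return index


# Two made-up repositories holding the same article under different schemas.
harvests = {
    "repo_a": [{"title": "An example paper", "doi": "10.1234/example"}],
    "repo_b": [
        {"dc:title": "An Example Paper",
         "dc:identifier": "https://doi.org/10.1234/EXAMPLE"},
        {"dc:title": "Another paper"},
    ],
}
index = merge_harvests(harvests)
```

The design choice worth noting is the key: matching on a normalised identifier first and falling back to the title only when there is none. A real aggregator would need far more careful reconciliation than this, but the shape of the problem is the same.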


Because we at Index Data have spent the last ten years helping aggregators and publishers and others get access to difficult-to-access information through all kinds of crazy mechanisms, we have a unique combination of the skills, the tools, and the desire to pursue this venture.

So The One Repo is born. At the moment, we have:

  • Harvesting set up for an initial set of 20 repositories.
  • A demonstrator of one possible UI.
  • A whitepaper describing the motivation and some of the technical aspects.
  • A blog about the project’s progress.
  • An advisory board of some of the brightest, most experienced and wisest people in the world of open access.

We’ve been flying under the radar for the last month and a bit. Now we’re ready for the world to know what we’re up to.

The One Repo is go!

Just a quick post today, to refute an incorrect idea about open access that has unfortunately been propagated from time to time. That is the idea that if (say) PLOS were acquired by a barrier-based publisher such as Taylor and Francis, then its papers could be hidden behind paywalls and effectively lost to the world. For example, in Glyn Moody’s article The Open Access Schism, Heather Morrison is quoted as follows:

A major concern about the current move towards CC-BY is that it might allow re-enclosure by companies […] This is a scenario suggested by assistant professor in the School of Information Studies at the University of Ottawa Heather Morrison. As she explains, “There is nothing in the CC BY license that would stop a business from taking all of the works, with attribution, and selling them under a more restrictive license—not only a more restrictive CC-type license (STM’s license is a good indication of what could happen here), but even behind a paywall, then buying out the OA publisher and taking down the OA content.”

This is flatly incorrect.

Reputable open-access publishers not only publish papers on their own sites but also place them in third-party archives, precisely to guard against doomsday scenarios. If (say) PeerJ were made an offer they couldn’t refuse by Elsevier, then the new owners could certainly shut down the PeerJ site; but there’s nothing they could do about the copies of PeerJ articles on PubMed Central, in CLOCKSS and elsewhere. And of course everyone who already has copies of the articles would always be free to distribute them in any way, including posting complete archives on their own websites.

Let’s not accept this kind of scaremongering.


[NOTE: see the updates at the bottom. In summary, there’s nothing to see here and I was mistaken in posting this in the first place.]

Elsevier’s War On Access was stepped up last year when they started contacting individual universities to prevent them from letting the world read their research. Today I got this message from a librarian at my university:


The irony that this was sent from the Library’s “Open Access Team” is not lost on me. Added bonus irony: this takedown notification pertains to an article about how openness combats mistrust and secrecy. Well. You’d almost think NPG wants mistrust and secrecy, wouldn’t you?

It’s sometimes been noted that by talking so much about Elsevier on this blog, we can appear to be giving other barrier-based publishers a free ride. If we give that impression, it’s not deliberate. By initiating this takedown, Nature Publishing Group has identified itself as yet another so-called academic publisher that is in fact an enemy of science.

So what next? Anyone who wants a PDF of this (completely trivial) letter can still get one very easily from my own web-site, so in that sense no damage has been done. But it does leave me wondering what the point of the Institutional Repository is. In practice it seems to be a single point of weakness allowing “publishers” to do the maximum amount of damage with a single attack.

But part of me thinks the thing to do is take the accepted manuscript and format it myself in the exact same way as Nature did, and post that. Just because I can. Because the bottom line is that typesetting is the only actual service they offered Andy, Matt and me in exchange for our right to show our work to the world, and that is a trivial service.

The other outcome is that this hardens my determination never to send anything to Nature again. Now it’s not like my research program is likely to turn up tabloid-friendly results anyway, so this is a bit of a null resolution. But you never know: if I happen to stumble across sauropod feather impressions in an overlooked Wealden fossil, then that discovery is going straight to PeerJ, PLOS, BMC, F1000 Research, Frontiers or another open-access publisher, just like all my other work.

And that’s sheer self-interest at work there, just as much as it’s a statement. I will not let my best work be hidden from the world. Why would anyone?

Let’s finish with another outing for this meme-ready image.

Publishers ... You're doing it wrong

Update (four hours later)

David Mainwaring (on Twitter) and James Bisset (in the comment below) both pointed out that I’ve not seen an actual takedown request from NPG — just the takedown notification from my own library. I assumed that the library were doing this in response to hassle from NPG, but of course it’s possible that my own library’s Open Access Team is unilaterally trying to prevent access to the work of its university’s researchers.

I’ve emailed Lyn Duffy to ask for clarification. In the mean time, NPG’s Grace Baynes has tweeted:

So it looks like this may be even more bizarre than I’d realised.

Further bulletins as events warrant.

Update 2 (two more hours later)

OK, consensus is that I read this completely wrong. Matt’s comment below says it best:

I have always understood institutional repositories to be repositories for author’s accepted manuscripts, not for publisher’s formatted versions of record. By that understanding, if you upload the latter, you’re breaking the rules, and basically pitting the repository against the publisher.

Which is, at least, not a nice thing to do to the repository.

So the conclusion is: I was wrong, and there’s nothing to see here apart from me being embarrassed. That’s why I’ve struck through much of the text above. (We try not to actually delete things from this blog, to avoid giving a false history.)

My apologies to Lyn Duffy, who was just doing her job.

Update 3 (another hour later)

This just in from Lyn Duffy, confirming that, as David and James guessed, NPG did not send a takedown notice:

Dear Mike,

This PDF was removed as part of the standard validation work of the Open Access team and was not prompted by communication from Nature Publishing. We validate every full-text document that is uploaded to Pure to make sure that the publisher permits posting of that version in an institutional repository. Only after validation are full-text documents made publicly available.

In this case we were following the regulations as stated in the Nature Publishing policy about confidentiality and pre-publicity. The policy says, ‘The published version — copyedited and in Nature journal format — may not be posted on any website or preprint server’. In the information for authors about ‘Other material published in Nature’ it says, ‘All articles for all sections of Nature are considered according to our usual conditions of publication’. We took this to mean that material such as correspondence have the same posting restrictions as other material published by Nature Publishing.

If we have made the wrong decision in this case and you do have permission from Nature Publishing to make the PDF of your correspondence publicly available via an institutional repository, we can upload the PDF to the record.

Kind regards,
Open Access Team


Here’s the text of the original notification email so search-engines can pick it up. (If you read the screen-grab above, you can ignore this.)

University of Bristol — Pure

Lyn Duffy has added a comment

Sharing: public databases combat mistrust and secrecy
Farke, A. A., Taylor, M. P. & Wedel, M. J. 22 Oct 2009 In : Nature. 461, 7267, p. 1053

Research output: Contribution to journal › Article

Lyn Duffy has added a comment 7/05/14 10:23

Dear Michael, Apologies for the delay in checking your record. It appears that the document you have uploaded alongside this record is the publishers own version/PDF and making this version openly accessible in Pure is prohibited by the publisher, as a result the document has been removed from the record. In this particular instance the publisher would allow you to make accessible the postprint version of the paper, i.e., the article in the form accepted for publication in the journal following the process of peer review. Please upload an acceptable version of the paper if you have one. If you have any questions about this please get back to us, or send an email directly to Kind regards, Lyn Duffy Library Open Access Team.

This morning sees the publication of the new Policy for open access in the post-2014 Research Excellence Framework from HEFCE, the Higher Education Funding Council for England. It sets out in detail HEFCE’s requirement that papers must be open-access to be eligible for the next (post-2014) Research Excellence Framework (REF).

Here is the core of it, quoted direct from the Executive Summary:

The policy states that, to be eligible for submission to the post-2014 REF, authors’ final peer-reviewed manuscripts must have been deposited in an institutional or subject repository on acceptance for publication. Deposited material should be discoverable, and free to read and download, for anyone with an internet connection […]  The policy applies to research outputs accepted for publication after 1 April 2016, but we would strongly urge institutions to implement it now.

There are lots of ifs, buts and maybes, but overall this is excellent news, and solid confirmation that the UK really is committed to an open-access transition. Before we go into those caveats, let’s take a moment to applaud the real, significant progress that this policy represents. For the first time ever, universities’ funding levels, and so individual academics’ careers, will be directly tied to the openness of their output. Congratulations to HEFCE!


Also commendable: the actual policy document is very carefully written, and includes details such as “Outputs whose text is encoded only as a scanned image do not meet the requirement that the text be searchable electronically.” It’s evident that a lot of careful thought has gone into this.

Now for those caveats:

The policy will not apply to monographs, book chapters, other long-form publications, working papers, creative or practice-based research outputs, or data.

This is a shame, but understandable, especially in the case of books. I would have hoped that chapters within edited volumes would have been included. But the main document notes that “Where a higher education institution (HEI) can demonstrate that it has taken steps towards enabling open access for outputs outside the scope of this definition, credit will be given in the research environment component of the post-2014 REF.”

Next disappointment:

The policy allows repositories to respect embargo periods set by publications. Where a publication specifies an embargo period, authors can comply with the policy by making a ‘closed’ deposit on acceptance. Closed deposits must be discoverable to anyone with an Internet connection before the full text becomes available for read and download (which will occur after the embargo period has elapsed). Closed deposits will be admissible to the REF.

I would of course have wanted all embargo periods to be eliminated, or at the very least capped at six months as in the old, pre-watering-down, RCUK policy. But that was too much to hope for in the political environment that publishers have somehow managed to create.

More positively, it’s a good sop that deposit must be made on acceptance — not when the embargo expires, or even on publication, but on acceptance. These “closed deposits” are like a formal promise of openness, with an automated implementation. We don’t have good experimental data on this, but it seems likely that this approach will result in much better compliance rates than just telling authors “you have to come back six to 24 months after publication and make a deposit”.
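
To show what such an automated implementation might look like, here is a minimal sketch in Python. It rests on assumptions that are mine, not HEFCE’s: that the embargo is counted in whole calendar months from the publication date, and that a repository simply checks the date each time someone requests the full text. The function names `add_months` and `is_full_text_open` are hypothetical.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    If the target month is shorter (for example 31 August plus six months),
    the day is clamped to the last day of that month.
    """
    total = start.month - 1 + months
    year = start.year + total // 12
    month = total % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def is_full_text_open(published: date, embargo_months: int, today: date) -> bool:
    """True once the assumed embargo period has elapsed.

    Before that date the deposit stays 'closed': the record is
    discoverable but the full text is not yet served.
    """
    return today >= add_months(published, embargo_months)
```

The point of the sketch is that the author acts once, at acceptance, and the repository does the rest; nobody has to remember to come back later.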

Third disappointment:

There are a number of exceptions to the various requirements that will be automatically allowed by the policy. These exceptions cover circumstances where deposit was not possible, or where open access to deposited material could not be achieved within the policy requirements. These exceptions will allow institutions to achieve near-total compliance, but the post-2014 REF will also include a mechanism for considering any other exceptional cases where an output could not otherwise meet the requirements.

The exceptions encourage weasel-wordage, of course, and some of the specific exceptions listed in Appendix C are particularly weak: “Author was unable to secure the use of a repository”, “Publication is print-only (no electronic version)”, and the lamentable “Publication does not offer a compliant green or gold option”, which really means “HEFCE authors should not be using this publication”.

But when you read into the details, this approach with specific exceptions is actually rather better than the alternative that had been on the table: a percentage-based target, where some specific proportion of REF submissions would need to be open access. Instead of saying “80% of submissions must be open access” (or some other percentage), HEFCE is saying that it wants them all to be open access except where a specific excuse is given. I’d like them to be much less accommodating with what excuses they’ll accept, but the important thing here is that they have set the default to open.

Now for the most regrettable part of the policy:

While we do not request that outputs are made available under any particular licence, we advise that outputs licensed under a Creative Commons Attribution Non-Commercial Non-Derivative (CC BY-NC-ND) licence would meet this requirement.

I won’t rehearse again all the reasons that Non-Commercial and No-Derivatives clauses are poison; I’ll just note that works published under this licence are not open access according to the original definition of that term, which allows us to “use [OA works] for any other lawful purpose, without financial, legal, or technical barriers”.

Yet even here, the general tenor of the policy is positive. While it accepts NC-ND, the policy adds that “where an HEI can demonstrate that outputs are presented in a form that allows re-use of the work, including via text-mining, credit will be given in the research environment component of the post-2014 REF”.

One last observation: HEFCE should be commended on having provided an excellent, detailed explanation of feedback they received to their consultations. As always, reading such documents can be frustrating because they necessarily contain some views very different from mine; but it’s useful to see the range of opinions laid out so explicitly.

No open-access policy document I’ve ever seen has been perfect, and this one is no exception. But overall, the HEFCE open-access policy is a significant and welcome step forward, and carries the promise of further positive moves in the future.


Yesterday I was at the Berlin 11 satellite conference for students and early-career researchers. It was a privilege to be part of a stellar line-up of speakers, including the likes of SPARC’s Heather Joseph, PLOS’s Cameron Neylon, and eLife’s Mark Patterson. But even more than these, there were two people who impressed me so much that I had to give in to my fannish tendencies and have photos taken with them. Here they are.


This is Jack Andraka, who at the age of fifteen invented a new test for pancreatic cancer that is 168 times faster, 1/26000 as expensive and 400 times more sensitive than the current diagnostic tests, and only takes five minutes to run.  Of course he’s grown up a bit since then — he’s sixteen now.

Right at the moment Jack’s not getting much science done because he’s sprinting from meeting to meeting. He came to us in Berlin literally straight from an audience with the Pope. He’s met Barack Obama in the Oval Office. And one of the main burdens of his talk is that he’s not such an outlier as he appears: there are lots of other brilliant kids out there who are capable of doing similarly groundbreaking work, if only they could get access to the published papers they need. (Jack was lucky: his parents are indulgent, and spent thousands of dollars on paywalled papers for him.)

Someone on Twitter noted that every single photo of Jack seems to show him, and the people he’s with, in thumbs-up pose. It’s true: and that is his infectious positivity at work. It’s energising as well as inspiring to be around him.

(Read Jack’s guest post at PLOS on Why Science Journal Paywalls Have to Go)

Here’s the other photo:


This is Bernard Rentier, who is rector of the University of Liège. To put it bluntly, he is the boss of the whole darned university — an academic of the very senior variety that I never meet; and of the vintage that, to put it kindly, can have a tendency to be rather conservative in approach, and cautious about open access.

With Bernard, not a bit of it. He has instituted a superb open-access policy at Liège — one that is now being taken up as the model for the whole of Belgium. Whenever members of the Liège faculty apply for anything — office space, promotions, grants, tenure — their case is evaluated by taking into account only publications that have been deposited in the university’s open-access repository, ORBi.

Needless to say, the compliance rate is superb — essentially 100% since the policy came in. As a result, Liège’s work is more widely used, cited, reused, replicated, rebutted and generally put to work. The world benefits, and the university benefits.

Bernard is a spectacular example of someone in a position of great power using that power for good. Meanwhile, at the other end of the scale, Jack is someone who, one would have thought, had no power at all. But in part because of work made available through the influence of people like Bernard, it turned out he had the power to make a medical breakthrough.

I came away from the satellite meeting very excited — in fact, by nearly all the presentations and discussions, but most especially by the range represented by Jack and Bernard. People at both ends of their careers; both of them not only promoting open access, but also doing wonderful things with it.

There’s no case against open access, and there never has been. But shifting the inertia of long-established traditions and protocols requires enormous activation energy. With advocates like Jack and Bernard, we’re generating that energy.

Onward and upward!