Open-access journalist Richard Poynder posted a really good interview today with the Gates Foundation’s Associate Officer of Knowledge & Research Services, Ashley Farley. I feel bad about picking on one fragment of it, but I really can’t let this bit pass:

RP: As you said, Gates-funded research publications must now have a CC BY licence attached. They must also be made OA immediately. Does this imply that the Gates foundation sees no role for green OA? If it does see a role for green OA what is that role?

AF: I wouldn’t say that the foundation doesn’t see value or a role for green open access. However, the policy requires immediate access, reuse and copyright arrangements that green open access does not necessarily provide.

Before I get into this, let me say again that I have enormous admiration for what Ashley Farley and the Gates Foundation are doing for open access, and for open scholarship more widely. But:

The (excellent) Gates policy requires immediate access, reuse and copyright arrangements that gold open access does not necessarily provide, either. It provides them only because the Gates Foundation has quite rightly twisted publishers’ arms, and said you can only have our APCs if you meet our requirements.

And if green open access doesn’t provide immediate access and reuse, then that is because funders have not twisted publishers’ arms to allow this.

It’s perfectly possible to have a Green OA repository in which all the deposited papers are available immediately and licensed using CC By. It’s perfectly possible for a funder, university or other body to have a green OA policy that mandates this.

But it’s true that no-one seems to have a green OA policy that does this.

Why not?

In a recent blog-post, Kevin Smith tells it like it is: legacy publishers are tightening their grip in an attempt to control scholarly communications. “The same five or six major publishers who dominate the market for scholarly journals are engaged in a race to capture the terms of and platforms for scholarly sharing”, says Smith. “This is a serious threat to academic freedom.”

People can legitimately have different ideas about precisely what it is that Elsevier intends to do with SSRN, now that it’s acquired it. But as we discuss the possible outcomes, we need to keep one principle in mind: it’s simply unrealistic to imagine that Elsevier, in controlling Mendeley and SSRN, will do anything other than what is best for Elsevier.

That’s not a criticism, or even a complaint. It’s a statement of what a for-profit corporation does. It’s in its nature. There’s no need for us to blame Elsevier for this, any more than we blame a fox when it eats a chicken. That’s what it does.

The appropriate response is simply to prevent any more of this kind of thing happening, by taking control of our own scholarly infrastructure.

The big problem with SSRN is the same as the big problem of Mendeley: being privately owned and for-profit, their owners were always going to be susceptible to a good enough offer. People starting private companies are looking to make money from them, and a corporation that comes along with a big offer is a difficult exit strategy to resist. When we entrusted preprints to SSRN, they were always vulnerable to being taken hostage, in a way that arXiv preprints are not.

Again: I am not blaming private companies’ owners for this. It’s in the nature of what a private company is. I recognise that and accept it. The thing is, I interpret it as damage and want to route around it.

So what is the solution?


It’s simple. We, the community, need to own our own infrastructure.

On one level, this is easy. We, the community, know how to do it. We have experience of good and bad infrastructure, and we know the difference. We have excellent, clearly articulated principles for open scholarly infrastructure. We have top quality software engineers, interaction designers, UI experts and more.

What we don’t have is funding. And that is crippling.

We can’t build and maintain community-owned infrastructure without funding; and (to a first approximation anyway) no-one is funding it. It’s truly disgraceful that even such a crucial piece of infrastructure as arXiv is constantly struggling for funding. arXiv serves about a million articles per week, and is the primary source of publications in many scientific subfields, yet every year it struggles to bring in the less than a million dollars it costs to run. It’s ridiculous that the Gates Foundation or someone hasn’t come along with a few tens of millions of dollars and set up a long-term endowment to make arXiv secure.
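As a rough back-of-envelope check on that endowment figure, here is a sketch of the arithmetic. The 4% annual payout rate is purely my assumption for illustration, not a number from arXiv or from any funder, and the annual cost is simply the upper bound quoted above.

```python
# Back-of-envelope endowment arithmetic (illustrative only).
# ASSUMPTION: a 4% annual payout rate, a hypothetical figure chosen for illustration.
# The annual cost is the upper bound quoted in the text ("less than a million dollars").


def required_endowment(annual_cost: float, payout_rate: float) -> float:
    """Return the endowment whose yearly payout covers annual_cost."""
    if payout_rate <= 0:
        raise ValueError("payout_rate must be positive")
    return annual_cost / payout_rate


ANNUAL_COST_USD = 1_000_000   # upper bound taken from the text
ASSUMED_PAYOUT_RATE = 0.04    # hypothetical

endowment = required_endowment(ANNUAL_COST_USD, ASSUMED_PAYOUT_RATE)
print(f"Endowment needed: ${endowment:,.0f}")
```

Under that assumption the answer comes out at about 25 million dollars, which sits comfortably inside "a few tens of millions". A more cautious payout rate would push the figure up, but not out of that range.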

And when even something as proven as arXiv struggles for funding, what chance does anything else have?

The problem seems to be this: funders have a blind spot when it comes to funding infrastructure. That’s why we have no UK national repository; it’s why there is no longer an independent subject repository for social sciences; it’s why the two main preprint archives for bio-medicine (PeerJ Preprints and BioRxiv) are privately owned, and potentially vulnerable to the offer-you-can’t-refuse from Elsevier or one of the other legacy publishers in the oligopoly(*).


When you think about funders — RCUK, Wellcome, NIH, Gates, all of them — they are great at funding research; and terrible at funding the infrastructure that allows it to have actual benefit. Most funders even seem to have specific policies that they won’t fund infrastructure; those that don’t, simply lack a way to apply for infrastructure funding. It’s a horribly short-sighted approach, and we’re seeing its inevitable fruit in Elsevier’s accumulation of infrastructure.

We’ll look back at funding bodies in 10 or 20 years and say their single biggest mistake was failing to see the need to fund infrastructure.

Please, funders. Fix this. Make whatever changes you need to make, to ensure that the scholarly community owns and controls its own preprint archives, subject repositories, aggregators, text-mining tools, citation graphs, metrics tools and what have you. We’ve already seen what happens when we cede control of the scholarly record to corporations: spiralling prices, poor quality product, arbitrary barriers, and the retardation of all progress. Let’s not make the same mistake again with infrastructure.

 

 


(*) Actually, I don’t believe PeerJ’s owners would sell their preprint server to Elsevier for any amount of money, and the same may be true of BioRxiv for all I know; I’ve never spoken with its owners. But who can tell what might happen?

A quick note to say that I got an email today — the University of Bristol Staff Bulletin — announcing some extremely welcome news:

[Image: the bulletin item announcing the news]

(Admittedly it was only the third item on the bulletin, coming in just after “Staff Parking – application deadline Friday 18 September”, but you can’t have everything.)

This is excellent, and the nitty-gritty details are encouraging, too. Although HEFCE recently wound back its own policy, as a transition-period concession, to requiring deposit only at the time of publication, Bristol has quite properly gone with the more rigorous requirement that accepted manuscripts be deposited at the time of acceptance. This is wise for the university — it’s future-proofed against HEFCE’s eventual move back towards the deposit-on-acceptance policy that it wanted — and it’s good for the wider world, too.

Hurrah!

 

You know what’s wrong with scholarly publishing?

Wait, scrub that question. We’ll be here all day. Let me jump straight to the chase and tell you the specific problem with scholarly publishing that I’m thinking of.

There’s nowhere to go to find all open-access papers, to download their metadata, to access it via an open API, to find out what’s new, to act as a platform for the development of new tools. Yes, there’s PubMed Central, but that’s only for work funded by the NIH. Yes, there’s Google Scholar, but that has no API, and at any moment could go the way of Google Wave and Google Reader when Google loses interest.

Instead, we have something like 4000 repositories out there, balkanised by institution, by geographical region, and by subject area. They have different UIs, different underlying data models, different APIs (if any). They’re built on different software platforms. It’s a jungle out there!


As researchers, we don’t need 4000 repos. You know what we need? One Repo.

Hey! That would be a good name for a project!

I’ve mentioned before how awesome and pro-open my employers, Index Data, are. (For those who are not regular readers, I’m a palaeontologist only in my spare time. By day, I’m a software engineer.) Now we’re working on an index of green/gold OA publishing. Metadata of every article across every repository and publisher. We want it to be complete, in the sense that we will be going aggressively for the long tail as opposed to focusing on some region or speciality, or things that are easily harvestable by OAI-PMH or other standards. We want it to be of a high, consistent quality in terms of metadata. We want it to be up to date. And most importantly, we want it to be fully open for all and any kind of re-use, by any other actor. This will include downloadable data files, OAI-PMH access, search-retrieve web services, embeddable widgets and more. We also envisage a Linked Data representation with a CRUD interface that allows third parties to contribute supplemental information, entity reconciliation, tagging, etc.
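For the repositories that do speak OAI-PMH, the harvesting step is the easy part. Here is a minimal sketch of it, and it is only a sketch: it is not Index Data's actual harvester, the repository URL and the sample record are made up, and the function names are mine. It builds a `ListRecords` request URL and pulls a few Dublin Core fields, plus the resumption token used for paging, out of a response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Standard namespaces for OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# HYPOTHETICAL response from an imaginary repository, for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example article</dc:title>
          <dc:creator>Example, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, resumption_token=None):
    """Build a ListRecords URL; a resumption token replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (records, next_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("./oai:ListRecords/oai:record", NS):
        records.append({
            "identifier": rec.findtext("./oai:header/oai:identifier", default="", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
            "creators": [c.text or "" for c in rec.findall(".//dc:creator", NS)],
        })
    token = root.findtext("./oai:ListRecords/oai:resumptionToken", default="", namespaces=NS)
    return records, (token.strip() or None)


records, next_token = parse_list_records(SAMPLE_RESPONSE)
print(records, next_token)
print(list_records_url("https://repo.example.org/oai", next_token))
```

The long tail is exactly the set of sources where nothing this tidy is available, which is why the standards-based case above is the least interesting part of the job.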

Instead of 4000 fragments, one big, meaty chunk of data.


Because we at Index Data have spent the last ten years helping aggregators and publishers and others get access to difficult-to-access information through all kinds of crazy mechanisms, we have a unique combination of the skills, the tools, and the desire to pursue this venture.

So The One Repo is born. At the moment, we have:

  • Harvesting set up for an initial set of 20 repositories.
  • A demonstrator of one possible UI.
  • A whitepaper describing the motivation and some of the technical aspects.
  • A blog about the project’s progress.
  • An advisory board of some of the brightest, most experienced and wisest people in the world of open access.

We’ve been flying under the radar for the last month and a bit. Now we’re ready for the world to know what we’re up to.

The One Repo is go!

Provoked by Mike Eisen’s post today, The inevitable failure of parasitic green open access, I want to briefly lay out the possible futures of scholarly publishing as I see them. There are two: one based on what we now think of as Gold OA, and one on what we think of as Green OA.

Eisen is of course quite right that the legacy publishers only ever gave their blessing to Green OA (self-archiving) so long as they didn’t see it as a threat, so the end of that blessing isn’t a surprise. (I think it’s this observation that Richard Poynder misread as “an OA advocate legitimising Elsevier’s action”!) It was inevitable that this blessing would be withdrawn as Green started to become more mainstream — and that’s exactly what we’ve seen, with Elsevier responding to the global growth in Green OA mandates with a regressive new policy that has rightly attracted the ire of everyone who’s looked closely at it.

So I agree with him that what he terms “parasitic Green OA” — self-archiving alongside the established journal system — is ultimately doomed. The bottom line is that while we as a community continue to give control of our work to the legacy publishers — follow closely here — legacy publishers will control our work. We know that these corporations’ interests are directly opposed to those of authors, science, customers, libraries, and indeed everyone but themselves. So leaving them in control of the scholarly record is unacceptable.

What are our possible futures?


We may find that in ten years’ time, all subscription journals are gone (perhaps except for a handful of boutique journals that a few people like, just as a few people prefer the sound of vinyl over CDs or MP3s).

We may find that essentially all new scholarship is published in open-access journals such as those of BioMed Central, PLOS, Frontiers and PeerJ. That is a financially sustainable path, in that publishers will be paid for the services they provide through APCs. (No doubt, volunteer-run and subsidised zero-APC journals will continue to thrive alongside them, as they do today.)

We may even find that some of the Gold OA journals of the future are run by organisations that are presently barrier-based publishers. I don’t think it’s impossible that some rump of Elsevier, Springer et al. will survive the coming subscription-journals crash, and go on to compete on the level playing-field of Gold OA publishing. (I think they will struggle to compete, and certainly won’t be able to make anything like the kind of money they do now, but that’s OK.)

This is the Gold-OA future that Mike Eisen is pinning his hopes on — and which he has done as much as anyone alive to bring into existence. I would be very happy with that outcome.


While I agree with Eisen that what he terms “parasitic Green” can’t last — legacy publishers will stamp down on it as soon as it starts to be truly useful — I do think there is a possible Green-based future. It just doesn’t involve traditional journals.

One of the striking things about the Royal Society’s recent Future of Scholarly Scientific Communication meetings was that during the day-two breakout session, so many of the groups independently came up with more or less the same proposal. The system that Dorothy Bishop expounded in the Guardian after the second meeting is also pretty similar — and since she wasn’t at the first meeting, I have to conclude that she also came up with it independently, further corroborating the sense that it’s an approach whose time has come.

(In fact, I started drafting an SV-POW! post myself at that meeting describing the system that our break-out group came up with. But that was before all the other groups revealed their proposals, and it became apparent that ours was part of a blizzard, rather than a unique snowflake.)

Here are the features characterising the various systems that people came up with. (Not all of these features were in all versions of the system, but they all cropped up more than once.)

  • It’s based around a preprint archive: as with arXiv, authors can publish manuscripts there after only basic editorial checks: is this a legitimate attempt at scholarship, rather than spam or a political opinion?
  • Authors solicit reviews, as we did for our Barosaurus preprint, and interested others can offer unsolicited reviews.
  • Reviewers assign numeric scores to manuscripts as well as giving opinions in prose.
  • The weight given to review scores is affected by the reputation of reviewers.
  • The reputation of reviewers is affected by other users’ judgements about their comments, and also by their reputation as authors.
  • A stable user reputation emerges using a pagerank-like feedback algorithm.
  • Users can acquire reputation by authoring, reviewing or both.
  • Manuscripts have a reputation based on their score.
  • There is no single moment of certification, when a manuscript is awarded a “this is now peer-reviewed” bit.
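To make the reputation features above a bit more concrete, here is one possible reading of them as code. None of the groups specified an algorithm, so everything here is my own assumption: the idea that users "endorse" each other's review comments, the damping factor, the user names and the scores are all hypothetical, and reputation earned as an author is left out entirely. The first function is a plain PageRank-style power iteration over endorsements; the second weights review scores by reviewer reputation.

```python
def reputations(endorsements, damping=0.85, iterations=100):
    """PageRank-style reputation from a dict of user -> list of users they endorse.

    ASSUMPTION: "endorsing" stands in for other users' judgements about a
    reviewer's comments. The damping value is the conventional PageRank default,
    not anything proposed at the meeting.
    """
    users = sorted(set(endorsements) | {v for vs in endorsements.values() for v in vs})
    n = len(users)
    rep = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        new = {u: (1.0 - damping) / n for u in users}
        for u in users:
            # Users who endorse nobody spread their weight evenly over everyone.
            targets = endorsements.get(u) or users
            share = damping * rep[u] / len(targets)
            for t in targets:
                new[t] += share
        rep = new
    return rep


def weighted_score(review_scores, rep):
    """Reputation-weighted mean of numeric review scores (reviewer -> score)."""
    total_weight = sum(rep[r] for r in review_scores)
    return sum(rep[r] * s for r, s in review_scores.items()) / total_weight


# HYPOTHETICAL users: alice and bob both endorse carol; carol endorses alice.
rep = reputations({"alice": ["carol"], "bob": ["carol"], "carol": ["alice"]})

# HYPOTHETICAL reviews of one manuscript: carol scores it 8, bob scores it 2.
score = weighted_score({"carol": 8, "bob": 2}, rep)
print(rep, score)
```

In this toy example carol ends up with the highest reputation and bob the lowest, so the manuscript's weighted score lands much nearer carol's 8 than the unweighted average of 5. Whether that is the behaviour we actually want is one of the questions a real system would have to settle.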

I think it’s very possible that, instead of the all-Gold future outlined above, we’ll land up with something like this. Not every detail will work out the way I suggested here, of course, but we may well get something along these lines, where the emphasis is on very rapid initial publication and continuously acquired reputation, and not on a mythical and misleading “this paper is peer-reviewed” stamp.

(There are a hundred questions to be asked and answered about such systems: do we want one big system, or a network of many? If the latter, how will they share reputation data? How will the page-rank-like reputation algorithm work? Will it need to be different in different fields of scholarship? I don’t want to get sidetracked by such issues at this point, but I do want to acknowledge that they exist.)

Is this “Green open access”? It’s not what we usually mean by the term; but in as much as it’s about scholars depositing their own work in archives, yes, it’s Green OA in a broader sense.

(I think some confusion arises because we’ve got into the habit of calling deposited manuscripts “preprints”. That’s a misnomer on two counts: they’re not printed, and they needn’t be pre-anything. Manuscripts in arXiv may go on to be published in journals, but that’s not necessary for them to be useful in advancing scholarship.)


So where now? We have two possible open-access futures, one based on open-access publishing and one based on open-access self-archiving. For myself, I would be perfectly happy with either of these futures — I’m not particularly clear in my own mind which is best, but they’re both enormously better than what we have today.

A case can be made that the Green-based future is maybe a better place to arrive, but that the Gold-based future makes for an easier transition. It doesn’t require researchers to do anything fundamentally different from what they do today, only to do it in open-access journals; whereas the workflow in the Green-based approach outlined above would be a more radical departure. (Ironically, this is the opposite of what has often been said in the past: that the advantage of Green is that it offers a more painless upgrade path for researchers not sold on the importance of OA. That’s only true so long as Green is, in Eisen’s terms, “parasitic” — that is, so long as the repositories contain only second-class versions of papers that have been published conventionally behind paywalls.)

In my own open-access advocacy, then, I’m always unsure whether to push Gold or Green. In my Richard Poynder interview, when asked “What should be the respective roles of Green and Gold OA?” I replied:

This actually isn’t an issue that I get very excited about: Open is so much more important than Green or Gold. I suppose I slightly prefer Gold in that it’s better to have one single definitive version of each article; but then we could do that with Green as well if only we’d stop thinking of it as a stopgap solution while the “real” article remains behind paywalls.

Two and a half years on, I pretty much stand by that (and also by the caveats regarding the RCUK policy’s handling of Gold and Green that followed this quote in the interview).

But I’m increasingly persuaded that the variety of Green OA that we only get by the grace and favour of the legacy publishers is not a viable long-term strategy. Elsevier’s new regressive policy was always going to come along eventually, and it won’t be the last shot fired in this war. If Green is going to win the world, it will be by pulling away from conventional journals and establishing itself as a valid mode of publication in its own right. (Again, much as arXiv has done.)


Here’s my concern, though. Paul Royster’s response to Eisen’s post was “Distressing to see the tone and rancor of OA advocates in disagreement. My IR is a “parasite”? Really?” Now, I think that comment was based on a misunderstanding of Eisen’s post (and maybe only on reading the title) but the very fact that such a misunderstanding was possible should give us pause.

Richard Poynder’s reading later in the same thread was also cautionary: “Elsevier will hope that the push back will get side-tracked by in-fighting … I think it will take comfort if the OA movement starts in-fighting instead of pushing back.”

Folks, let’s not fall for that.

We all know that Stevan Harnad, among many others, is committed to Green; and that Mike Eisen, among many others, has huge investment in Gold. We can, and should, have rigorous discussions about the strengths and weaknesses of both approaches. We should expect that OA advocates who share the same goal but have different backgrounds will differ over tactics, and sometimes differ robustly.

But there’s a world of difference between differing robustly and differing rancorously. Let’s all (me included) be sure we stay on the right side of that line. Let’s keep it clear in our minds who the enemy is: not people who want to use a different strategy to free scholarship, but those who want to keep it locked up.

And here ends my uncharacteristic attempt to position myself as The Nice, Reasonable One in this discussion — a role much better suited to Peter Suber or Stephen Curry, but it looks like I got mine written first :-)

Somehow this seems to have slipped under the radar: National Science Foundation announces plan for comprehensive public access to research results. They put it up on 18 March, two whole months ago, so our apologies for not having said anything until now!

This is the NSF’s rather belated response to the OSTP memo on Open Access, back in February 2013. This memo required all Federal agencies that spend more than $100 million on research and development each year to develop OA policies, broadly in line with the existing one of the NIH which gave us PubMed Central. Various agencies have been turning up with policies, but for those of us in palaeo, the NSF’s the big one: I imagine it funds more palaeo research than all the others put together.

So far, so awesome. But what exactly is the new policy? The press release says papers must “be deposited in a public access compliant repository and be available for download, reading and analysis within one year of publication”, but says nothing about what repository should be used. It’s lamentable that a full year’s embargo has been allowed, but at least the publishers’ CHORUS land-grab hasn’t been allowed to hobble the whole thing.

There’s a bit more detail here, but again it’s oddly coy about where the open-access works will be placed: it just says they must be “deposited in a public access compliant repository designated by NSF”. The executive summary of the actual plan also refers only to “a designated repository”.

Only in the full 31-page plan itself does the detail emerge. From page 5:

In the initial implementation, NSF has identified the Department of Energy’s PAGES (Public Access Gateway for Energy and Science) system as its designated repository and will require NSF-funded authors to upload a copy of their journal articles or juried conference paper to the DOE PAGES repository in the PDF/A format, an open, non-proprietary standard (ISO 19005-1:2005). Either the final accepted version or the version of record may be submitted. NSF’s award terms already require authors to make available copies of publications to the Cognizant Program Officers as part of the current reporting requirements. As described more fully in Sections 7.8 and 8.2, NSF will extend the current reporting system to enable automated compliance.

Future expansions, described in Section 7.3.1, may provide additional repository services. The capabilities offered by the PAGES system may also be augmented by services offered by third parties.

So what is good and bad about this?

Good. It makes sense to me that they’re re-using an existing system rather than wasting resources and increasing fragmentation by building one of their own.

Bad. It’s a real shame that they mandate the use of PDF, “the hamburger that we want to turn back into a cow”. It’s a terrible format for automated analysis, greatly inferior to the JATS XML format used by PubMed Central. I don’t understand this decision at all.
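To show what I mean about automated analysis, here is a minimal sketch of pulling structured metadata out of JATS-style XML. The sample article is a hand-written, hypothetical fragment (real JATS files carry far more markup), and the function name is mine; only the element names such as `article-title`, `contrib`, `surname`, `given-names` and `abstract` come from JATS itself.

```python
import xml.etree.ElementTree as ET

# HYPOTHETICAL, heavily trimmed JATS-style article, for illustration only.
SAMPLE_JATS = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>A hypothetical sauropod paper</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name><surname>Example</surname><given-names>Ann</given-names></name>
        </contrib>
        <contrib contrib-type="author">
          <name><surname>Sample</surname><given-names>Bob</given-names></name>
        </contrib>
      </contrib-group>
      <abstract><p>Placeholder abstract text.</p></abstract>
    </article-meta>
  </front>
</article>"""


def extract_metadata(jats_xml):
    """Return title, author names and abstract text from a JATS-style article."""
    root = ET.fromstring(jats_xml)
    meta = root.find("./front/article-meta")
    # itertext() flattens any inline markup (italics etc.) inside the title.
    title = "".join(meta.find("./title-group/article-title").itertext()).strip()
    authors = []
    for contrib in meta.findall("./contrib-group/contrib[@contrib-type='author']"):
        given = contrib.findtext("./name/given-names", default="")
        surname = contrib.findtext("./name/surname", default="")
        authors.append(f"{given} {surname}".strip())
    abstract = " ".join(
        "".join(p.itertext()).strip() for p in meta.findall("./abstract/p")
    )
    return {"title": title, "authors": authors, "abstract": abstract}


metadata = extract_metadata(SAMPLE_JATS)
print(metadata)
```

With XML, the title, authors and abstract are labelled as such, so a dozen lines of standard-library code will find them. With a PDF, the same job means guessing from fonts and positions on the page.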

Then on page 9:

In the initial implementation, NSF has identified the DOE PAGES system to support managing journal articles and juried conference papers. In the future, NSF may add additional partners and repository services in a federated system.

I’m not sure where this points. In an ideal world, it would mean some kind of unifying structure between PAGES and PubMed Central and whatever other repositories the various agencies decide to use.

Anyone else have thoughts?

Update from Peter Suber, later that day

Over on Google+, Peter Suber comments on this post. With his permission, I reproduce his observations here:

My short take on the policy’s weaknesses:

  • will use Dept of Energy PAGES, which at least for DOE is a dark archive pointing to live versions at publisher web sites
  • plans to use CHORUS (p. 13) in addition to DOE PAGES
  • requires PDF
  • silent on open licensing
  • only mentions reuse for data (pp. v, 18), not articles, and only says it will explore reuse
  • silent on reuse for articles even tho it has a license (p. 10) authorizing reuse
  • silent on the timing of deposits

I agree with you that a 12 month embargo is too long. But that’s the White House recommended default. So I blame the White House for this, not NSF.

To be more precise, PAGES favors publisher-controlled OA in one way, and CHORUS does it in another way. Both decisions show the effect of publisher lobbying on the NSF, and its preference for OA editions hosted by publishers, not OA editions hosted by sites independent of publishers.

So all in all, the NSF policy is much less impressive than I’d initially thought and hoped.

Just a quick post today, to refute an incorrect idea about open access that has unfortunately been propagated from time to time. That is the idea that if (say) PLOS were acquired by a barrier-based publisher such as Taylor and Francis, then its papers could be hidden behind paywalls and effectively lost to the world. For example, in Glyn Moody’s article The Open Access Schism, Heather Morrison is quoted as follows:

A major concern about the current move towards CC-BY is that it might allow re-enclosure by companies […] This is a scenario suggested by assistant professor in the School of Information Studies at the University of Ottawa Heather Morrison. As she explains, “There is nothing in the CC BY license that would stop a business from taking all of the works, with attribution, and selling them under a more restrictive license—not only a more restrictive CC-type license (STM’s license is a good indication of what could happen here), but even behind a paywall, then buying out the OA publisher and taking down the OA content.”

This is flatly incorrect.

Reputable open-access publishers not only publish papers on their own sites but also place them in third-party archives, precisely to guard against doomsday scenarios. If (say) PeerJ were made an offer they couldn’t refuse by Elsevier, then the new owners could certainly shut down the PeerJ site; but there’s nothing they could do about the copies of PeerJ articles on PubMed Central, in CLOCKSS and elsewhere. And of course everyone who already has copies of the articles would always be free to distribute them in any way, including posting complete archives on their own websites.

Let’s not accept this kind of scaremongering.

 

Last night, I did a Twitter interview with Open Access Nigeria (@OpenAccessNG). To make it easy to follow in real time, I created a list whose only members were me and OA Nigeria. But because Twitter lists posts in reverse order, and because each individual tweet is encumbered with so much chrome, it’s rather an awkward way to read a sustained argument.

So here is a transcript of those tweets, only lightly edited. They are in bold; I am in regular font. Enjoy!

So @MikeTaylor Good evening and welcome. Twitterville wants to meet you briefly. Who is Mike Taylor?

In real life, I’m a computer programmer with Index Data, a tiny software house that does a lot of open-source programming. But I’m also a researching scientist — a vertebrate palaeontologist, working on sauropods: the biggest and best of the dinosaurs. Somehow I fit that second career into my evenings and weekends, thanks to a very understanding wife (Hi, Fiona!) …

As of a few years ago, I publish all my dinosaur research open access, and I regret ever having let any of my work go behind paywalls. You can find all my papers online, and read much more about them on the blog that I co-write with Matt Wedel. That blog is called Sauropod Vertebra Picture of the Week, or SV-POW! for short, and it is itself open access (CC By).

Sorry for the long answer, I will try to be more concise with the next question!

Ok @MikeTaylor That’s just great! There’s been so much noise around twitter, the orange colour featuring prominently. What’s that about?

Actually, to be honest, I’m not really up to speed with open-access week (which I think is what the orange is all about). I found a while back that I just can’t be properly on Twitter, otherwise it eats all my time. So these days, rather selfishly, I mostly only use Twitter to say things and get into conversations, rather than to monitor the zeitgeist.

That said, orange got established as the colour of open access a long time ago, and is enshrined in the logo:

[Image: the open-access logo]

In the end I suppose open-access week doesn’t hit my buttons too strongly because I am trying to lead a whole open-access life.

… uh, but thanks for inviting me to do this interview, anyway! :-)

You’re welcome @MikeTaylor. So what is open access?

Open Access, or OA, is the term describing a concept so simple and obvious and naturally right that you’d hardly think it needs a name. It just means making the results of research freely available on the Internet for anyone to read, remix and otherwise use.

You might reasonably ask, why is there any other kind of published research other than open access? And the only answer is, historical inertia. For reasons that seemed to make some kind of sense at the time, the whole research ecosystem has got itself locked into this crazy equilibrium where most published research is locked up where almost no-one can see it, and where even the tiny proportion of people who can read published works aren’t allowed to make much use of them.

So to answer the question: the open-access movement is an attempt to undo this damage, and to make the research world sane.

Are there factors perpetuating this inertia you talked about?

Oh, so many factors perpetuating the inertia. Let me list a few …

  1. Old-school researchers who grew up when it was hard to find papers, and don’t see why young whippersnappers should have it easier
  2. Old-school publishers who have got used to making profits of 30-40% of turnover (they get content donated to them, then charge subscriptions)
  3. University administrators who make hiring/promotion/tenure decisions based on which old-school journals a researcher’s papers are in.
  4. Feeble politicians who think it’s important to keep the publishing sector profitable, even at the expense of crippling research.

I’m sure there are plenty of others who I’ve overlooked for the moment. I always say regarding this that there’s plenty of blame to go round.

(This, by the way, is why I called the current situation an equilibrium. It’s stable. Won’t fix itself, and needs to be disturbed.)

So these publishers who put scholarly articles behind paywalls online, do they pay the researchers for publishing their work?

HAHAHAHAHAHAHAHAHAHA!

Oh, sorry, please excuse me while I wipe the tears of mirth from my eyes. An academic publisher? Paying an author? Hahahahaha! No.

Not only do academic publishers never pay authors, in many cases they also levy page charges — that is, they charge the authors. So they get paid once by the author, in page-charges, then again by all the libraries that subscribe to read the paywalled papers. Which of course is why, even with their gross inefficiencies, they’re able to make these 30-40% profit margins.

So @MikeTaylor why do many researchers continue to take their work to these restricted access publishers and what can we do about it?

There are a few reasons that play into this together …

Part of it is just habit, especially among more senior researchers who’ve been using the same journals for 20 or 30 years.

But what’s more pernicious is the tendency of academics — and even worse, academic administrators — to evaluate research not by its inherent quality, but by the prestige of the journal that publishes it. It’s just horrifyingly easy for administrators to say “He got three papers out that year, but they were in journals with low Impact Factors.”

Which is wrong-headed on so many levels.

First of all, they should be looking at the work itself, and making an assessment of how well it was done: rigour, clarity, reproducibility. But it’s much easier just to count citations, and say “Oh, this has been cited 50 times, it must be good!” But of course papers are not always cited because they’re good. Sometimes they’re cited precisely because they’re so bad! For example, no doubt the profoundly flawed Arsenic Life paper has been cited many times — by people pointing out its numerous problems.

But wait, it’s much worse than that! Lazy or impatient administrators won’t count how many times a paper has been cited. Instead they will use a surrogate: the Impact Factor (IF), which is a measure not of papers but of journals.

Roughly, the IF measures the average number of citations received by papers that are published in the journal. So at best it’s a measure of journal quality (and a terrible measure of that, too, but let’s not get into that). The real damage is done when the IF is used to evaluate not journals, but the papers that appear in them.
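To make that concrete, here is a minimal sketch using made-up citation counts for ten papers in one imaginary journal. The numbers are purely illustrative, and the plain average is a simplification (the real Impact Factor is calculated over a two-year window), but it shows why a journal-level average says very little about any individual paper in that journal.

```python
from statistics import mean, median

# Hypothetical citation counts for ten papers in one imaginary journal.
# One heavily cited paper sits alongside nine ordinary ones.
citations = [0, 0, 1, 1, 2, 2, 3, 4, 7, 180]

# An IF-style figure: the average number of citations per paper.
journal_average = mean(citations)

# What a typical paper in the journal actually receives.
typical_paper = median(citations)

# How many papers fall short of the journal-level average.
papers_below_average = sum(1 for c in citations if c < journal_average)

print(f"Journal average: {journal_average}")
print(f"Median paper:    {typical_paper}")
print(f"Papers below the average: {papers_below_average} of {len(citations)}")
```

In this toy data set a single outlier drags the average far above what nine of the ten papers achieve, which is exactly the problem with reading a journal's number as if it described each paper.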

And because that’s so widespread, researchers are often desperate to get their work into journals that have high IFs, even if they’re not OA. So we have an idiot situation where a selfish, rational researcher is best able to advance her career by doing the worst thing for science.

(And BTW, counter-intuitively, the number of citations an individual paper receives is NOT correlated significantly with the journal’s IF. Bjorn Brembs has discussed this extensively, and also shows that IF is correlated with retraction rate. So in many respects the high-IF journals are actually the worst ones you can possibly publish your work in. Yet people feel obliged to.)

*pant* *pant* *pant* OK, I had better stop answering this question, and move on to the next. Sorry to go on so long. (But really! :-) )

This is actually all so enlightening. You just criticised Citation Index along with Impact Factor but OA advocates tend to hold up a higher Citation Index as a reason to publish Open Access. What do you think regarding this?

I think that’s realpolitik. To be honest, I am also kind of pleased that the PLOS journals have pretty good Impact Factors: not because I think the IFs mean anything, but because they make those journals attractive to old-school researchers.

In the same way, it is a well-established fact that open-access articles tend to be cited more than paywalled ones — a lot more, in fact. So in trying to bring people across into the OA world, it makes sense to use helpful facts like these. But they’re not where the focus is.

But the last thing to say about this is that even though raw citation-count is a bad measure of a paper’s quality, it is at least badly measuring the right thing. Evaluating a paper by its journal’s IF is like judging someone by the label on their clothes.

So @MikeTaylor Institutions need to stop evaluating research papers based on where they are published? Do you know of any doing it right?

I’m afraid I really don’t know. I’m not privy to how individual institutions do things.

All I know is, in some countries (e.g. France) abuse of IF is much more strongly institutionalised. It’s tough for French researchers.

What are the various ways researchers can make their work available for free online?

Brilliant, very practical question! There are three main answers. (Sorry, this might go on a bit …)

First, you can post your papers on preprint servers. The best known one is arXiv, which now accepts papers from quite a broad subject range. For example, a preprint of one of the papers I co-wrote with Matt Wedel is freely available on arXiv. There are various preprint servers, including arXiv for physical sciences, bioRxiv, PeerJ Preprints, and SSRN (Social Science Research Network).

You can put your work on a preprint server whatever your subsequent plans are for it — even if (for some reason) it’s going to a paywall. There are only a very few journals left that follow the “Ingelfinger rule” and refuse to publish papers that have been preprinted.

So preprints are option #1. Number 2 is Gold Open Access: publishing in an open-access journal such as PLOS ONE, a BMC journal or eLife. As a matter of principle, I now publish all my own work in open-access journals, and I know lots of other people who do the same — ranging from amateurs like me, via early-career researchers like Erin McKiernan, to lab-leading senior researchers like Michael Eisen.

There are two potential downsides to publishing in an OA journal. One, we already discussed: the OA journals in your field may not be the most prestigious, so depending on how stupid your administrators are you could be penalised for using an OA journal, even though your work gets cited more than it would have done in a paywalled journal.

The other potential reason some people might want to avoid using an OA journal is because of Article Processing Charges (APCs). Because OA publishers have no subscription revenue, one common business model is to charge authors an APC for publishing services instead. APCs can vary wildly, from $0 up to $5000 in the most extreme case (a not-very-open journal run by the AAAS), so they can be offputting.

There are three things to say about APCs.

First, remember that lots of paywalled journals demand page charges, which can cost more!

But second, please know that more than half of all OA journals actually charge no APC at all. They run on different models. For example in my own field, Acta Palaeontologica Polonica and Palaeontologia Electronica are well respected OA journals that charge no APC.

And the third thing is APC waivers. These are very common. Most OA publishers have it as a stated goal that no-one should be prevented from publishing with them by lack of funds for APCs. So for example PLOS will nearly always give a waiver when requested. Likewise Ubiquity, and others.

So there are lots of ways to have your work appear in an OA journal without paying for it to be there.

Anyway, all that was about the second way to make your work open access. #1 was preprints, #2 is “Gold OA” in OA journals …

And #3 is “Green OA”, which means publishing in a paywalled journal, but depositing a copy of the paper in an open repository. The details of how this works can be a bit complicated: different paywall-based publishers allow you to do different things, e.g. it’s common to say “you can deposit your peer-reviewed, accepted but unformatted manuscript, but only after 12 months”.

Opinions vary as to how fair or enforceable such rules are. Some OA advocates prefer Green. Others (including me) prefer Gold. Both are good.

See this SV-POW! post on the practicalities of negotiating Green OA if you’re publishing behind a paywall.

So to summarise:

  1. Deposit preprints
  2. Publish in an OA journal (getting a fee waiver if needed)
  3. Deposit postprints

I’ve written absolutely shedloads on these subjects over the last few years, including this introductory batch. If you only read one of my pieces about OA, make it this one: The parable of the farmers & the Teleporting Duplicator.

Last question – Do restricted access publishers pay remuneration to peer reviewers?

I know of no publisher that pays peer reviewers. But actually I am happy with that. Peer-review is a service to the community. As soon as you encumber it with direct financial incentives, things get more complicated and there’s more potential for conflict of interest. What I do is, I only perform peer-reviews for open-access journals. And I am happy to put that time/effort in knowing the world will benefit.

And so we bring this edition to a close. We say a big thanks to our special guest @MikeTaylor who’s been totally awesome and instructive.

Thanks, it’s been a privilege.

[NOTE: see the updates at the bottom. In summary, there’s nothing to see here and I was mistaken in posting this in the first place.]

Elsevier’s War On Access was stepped up last year when they started contacting individual universities to prevent them from letting the world read their research. Today I got this message from a librarian at my university:

[Screen-grab of the takedown notification email; the full text is in the Appendix below]

The irony that this was sent from the Library’s “Open Access Team” is not lost on me. Added bonus irony: this takedown notification pertains to an article about how openness combats mistrust and secrecy. Well. You’d almost think NPG wants mistrust and secrecy, wouldn’t you?

It’s sometimes been noted that by talking so much about Elsevier on this blog, we can appear to be giving other barrier-based publishers a free ride. If we give that impression, it’s not deliberate. By initiating this takedown, Nature Publishing Group has identified itself as yet another so-called academic publisher that is in fact an enemy of science.

So what next? Anyone who wants a PDF of this (completely trivial) letter can still get one very easily from my own web-site, so in that sense no damage has been done. But it does leave me wondering what the point of the Institutional Repository is. In practice it seems to be a single point of weakness allowing “publishers” to do the maximum amount of damage with a single attack.

But part of me thinks the thing to do is take the accepted manuscript and format it myself in the exact same way as Nature did, and post that. Just because I can. Because the bottom line is that typesetting is the only actual service they offered Andy, Matt and me in exchange for our right to show our work to the world, and that is a trivial service.

The other outcome is that this hardens my determination never to send anything to Nature again. Now it’s not like my research program is likely to turn up tabloid-friendly results anyway, so this is a bit of a null resolution. But you never know: if I happen to stumble across sauropod feather impressions in an overlooked Wealden fossil, then that discovery is going straight to PeerJ, PLOS, BMC, F1000 Research, Frontiers or another open-access publisher, just like all my other work.

And that’s sheer self-interest at work there, just as much as it’s a statement. I will not let my best work be hidden from the world. Why would anyone?

Let’s finish with another outing for this meme-ready image.

[Image: “Publishers ... You’re doing it wrong”]

Update (four hours later)

David Mainwaring (on Twitter) and James Bisset (in the comment below) both pointed out that I’ve not seen an actual takedown request from NPG — just the takedown notification from my own library. I assumed that the library were doing this in response to hassle from NPG, but of course it’s possible that my own library’s Open Access Team is unilaterally trying to prevent access to the work of its university’s researchers.

I’ve emailed Lyn Duffy to ask for clarification. In the meantime, NPG’s Grace Baynes has tweeted:

So it looks like this may be even more bizarre than I’d realised.

Further bulletins as events warrant.

Update 2 (two more hours later)

OK, consensus is that I read this completely wrong. Matt’s comment below says it best:

I have always understood institutional repositories to be repositories for author’s accepted manuscripts, not for publisher’s formatted versions of record. By that understanding, if you upload the latter, you’re breaking the rules, and basically pitting the repository against the publisher.

Which is, at least, not a nice thing to do to the repository.

So the conclusion is: I was wrong, and there’s nothing to see here apart from me being embarrassed. That’s why I’ve struck through much of the text above. (We try not to actually delete things from this blog, to avoid giving a false history.)

My apologies to Lyn Duffy, who was just doing her job.

Update 3 (another hour later)

This just in from Lyn Duffy, confirming that, as David and James guessed, NPG did not send a takedown notice:

Dear Mike,

This PDF was removed as part of the standard validation work of the Open Access team and was not prompted by communication from Nature Publishing. We validate every full-text document that is uploaded to Pure to make sure that the publisher permits posting of that version in an institutional repository. Only after validation are full-text documents made publicly available.

In this case we were following the regulations as stated in the Nature Publishing policy about confidentiality and pre-publicity. The policy says, ‘The published version — copyedited and in Nature journal format — may not be posted on any website or preprint server’ (http://www.nature.com/authors/policies/confidentiality.html). In the information for authors about ‘Other material published in Nature’ it says, ‘All articles for all sections of Nature are considered according to our usual conditions of publication’ (http://www.nature.com/nature/authors/gta/others.html#correspondence). We took this to mean that material such as correspondence have the same posting restrictions as other material published by Nature Publishing.

If we have made the wrong decision in this case and you do have permission from Nature Publishing to make the PDF of your correspondence publicly available via an institutional repository, we can upload the PDF to the record.

Kind regards,
Open Access Team

Appendix

Here’s the text of the original notification email so search-engines can pick it up. (If you read the screen-grab above, you can ignore this.)

University of Bristol — Pure

Lyn Duffy has added a comment

Sharing: public databases combat mistrust and secrecy
Farke, A. A., Taylor, M. P. & Wedel, M. J. 22 Oct 2009 In : Nature. 461, 7267, p. 1053

Research output: Contribution to journal › Article

Lyn Duffy has added a comment 7/05/14 10:23

Dear Michael, Apologies for the delay in checking your record. It appears that the document you have uploaded alongside this record is the publishers own version/PDF and making this version openly accessible in Pure is prohibited by the publisher, as a result the document has been removed from the record. In this particular instance the publisher would allow you to make accessible the postprint version of the paper, i.e., the article in the form accepted for publication in the journal following the process of peer review. Please upload an acceptable version of the paper if you have one. If you have any questions about this please get back to us, or send an email directly to open-access@bristol.ac.uk Kind regards, Lyn Duffy Library Open Access Team.

This morning sees the publication of the new Policy for open access in the post-2014 Research Excellence Framework from HEFCE, the Higher Education Funding Council for England. It sets out in detail HEFCE’s requirement that papers must be open-access to be eligible for the next (post-2014) Research Excellence Framework (REF).

Here is the core of it, quoted direct from the Executive Summary:

The policy states that, to be eligible for submission to the post-2014 REF, authors’ final peer-reviewed manuscripts must have been deposited in an institutional or subject repository on acceptance for publication. Deposited material should be discoverable, and free to read and download, for anyone with an internet connection […]  The policy applies to research outputs accepted for publication after 1 April 2016, but we would strongly urge institutions to implement it now.

There are lots of ifs, buts and maybes, but overall this is excellent news, and solid confirmation that the UK really is committed to an open-access transition. Before we go into those caveats, let’s take a moment to applaud the real, significant progress that this policy represents. For the first time ever, universities’ funding levels, and so individual academics’ careers, will be directly tied to the openness of their output. Congratulations to HEFCE!

Also commendable: the actual policy document is very carefully written, and includes details such as “Outputs whose text is encoded only as a scanned image do not meet the requirement that the text be searchable electronically.” It’s evident that a lot of careful thought has gone into this.

Now for those caveats:

The policy will not apply to monographs, book chapters, other long-form publications, working papers, creative or practice-based research outputs, or data.

This is a shame, but understandable, especially in the case of books. I would have hoped that chapters within edited volumes would have been included. But the main document notes that “Where a higher education institution (HEI) can demonstrate that it has taken steps towards enabling open access for outputs outside the scope of this definition, credit will be given in the research environment component of the post-2014 REF.”

Next disappointment:

The policy allows repositories to respect embargo periods set by publications. Where a publication specifies an embargo period, authors can comply with the policy by making a ‘closed’ deposit on acceptance. Closed deposits must be discoverable to anyone with an Internet connection before the full text becomes available for read and download (which will occur after the embargo period has elapsed). Closed deposits will be admissible to the REF.

I would of course have wanted all embargo periods to be eliminated, or at the very least capped at six months as in the old, pre-watering-down, RCUK policy. But that was too much to hope for in the political environment that publishers have somehow managed to create.

More positively, it’s a good sop that deposit must be made on acceptance — not when the embargo expires, or even on publication, but on acceptance. These “closed deposits” are like a formal promise of openness, with an automated implementation. We don’t have good experimental data on this, but it seems likely that this approach will result in much better compliance rates than just telling authors “you have to come back six to 24 months after publication and make a deposit”.
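As a rough illustration of what that “automated implementation” could look like inside a repository, here is a minimal sketch. Everything in it is an assumption made for illustration: the `ClosedDeposit` class and its field names are hypothetical, and the embargo is assumed to be counted in calendar months from the publication date, which the policy text quoted above does not spell out.

```python
import calendar
from dataclasses import dataclass
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month,
    so 31 August plus six months gives 28 (or 29) February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


@dataclass
class ClosedDeposit:
    """Hypothetical record for a manuscript deposited on acceptance."""
    accepted: date        # acceptance date, which is when the deposit is made
    published: date       # publication date (assumed start of the embargo)
    embargo_months: int   # 0 means the publisher sets no embargo

    def opens_on(self) -> date:
        """Date on which the full text becomes free to read and download."""
        return add_months(self.published, self.embargo_months)

    def full_text_visible(self, today: date) -> bool:
        """The record is discoverable throughout; only the full text is gated."""
        return today >= self.opens_on()


deposit = ClosedDeposit(
    accepted=date(2016, 5, 3),
    published=date(2016, 8, 31),
    embargo_months=6,
)
print(deposit.opens_on())
print(deposit.full_text_visible(date(2017, 2, 27)))
```

The point of the design is that the author acts once, at acceptance, and the repository does the rest: nobody has to remember to come back when the embargo runs out.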

Third disappointment:

There are a number of exceptions to the various requirements that will be automatically allowed by the policy. These exceptions cover circumstances where deposit was not possible, or where open access to deposited material could not be achieved within the policy requirements. These exceptions will allow institutions to achieve near-total compliance, but the post-2014 REF will also include a mechanism for considering any other exceptional cases where an output could not otherwise meet the requirements.

The exceptions encourage weasel-wordage, of course, and some of the specific exceptions listed in Appendix C are particularly weak: “Author was unable to secure the use of a repository”, “Publication is print-only (no electronic version)”, and the lamentable “Publication does not offer a compliant green or gold option”, which really means “HEFCE authors should not be using this publication”.

But when you read into the details, this approach with specific exceptions is actually rather better than the alternative that had been on the table: a percentage-based target, where some specific proportion of REF submissions would need to be open access. Instead of saying “80% of submissions must be open access” (or some other percentage), HEFCE is saying that it wants them all to be open access except where a specific excuse is given. I’d like them to be much less accommodating with what excuses they’ll accept, but the important thing here is that they have set the default to open.

Now for the most regrettable part of the policy:

While we do not request that outputs are made available under any particular licence, we advise that outputs licensed under a Creative Commons Attribution Non-Commercial Non-Derivative (CC BY-NC-ND) licence would meet this requirement.

I won’t rehearse again all the reasons that Non-Commercial and No-Derivatives clauses are poison; I’ll just note that works published under this licence are not open access according to the original definition of that term, which allows us to “use [OA works] for any other lawful purpose, without financial, legal, or technical barriers”.

Yet even here, the general tenor of the policy is positive. While it accepts NC-ND, the policy adds that “where an HEI can demonstrate that outputs are presented in a form that allows re-use of the work, including via text-mining, credit will be given in the research environment component of the post-2014 REF”.

One last observation: HEFCE should be commended on having provided an excellent, detailed explanation of feedback they received to their consultations. As always, reading such documents can be frustrating because they necessarily contain some views very different from mine; but it’s useful to see the range of opinions laid out so explicitly.

No open-access policy document I’ve ever seen has been perfect, and this one is no exception. But overall, the HEFCE open-access policy is a significant and welcome step forward, and carries the promise of further positive moves in the future.