Public access to results of NSF-funded research

May 19, 2015

Somehow this seems to have slipped under the radar: National Science Foundation announces plan for comprehensive public access to research results. They put it up on 18 March, two whole months ago, so our apologies for not having said anything until now!

This is the NSF’s rather belated response to the OSTP memo on Open Access, back in February 2013. This memo required all Federal agencies that spend more than $100 million on research and development each year to develop OA policies, broadly in line with the existing one of the NIH, which gave us PubMed Central. Various agencies have been turning up with policies, but for those of us in palaeo, the NSF’s the big one: I imagine it funds more palaeo research than all the others put together.

So far, so awesome. But what exactly is the new policy? The press release says papers must “be deposited in a public access compliant repository and be available for download, reading and analysis within one year of publication”, but says nothing about what repository should be used. It’s lamentable that a full year’s embargo has been allowed, but at least the publishers’ CHORUS land-grab hasn’t been allowed to hobble the whole thing.

There’s a bit more detail here, but again it’s oddly coy about where the open-access works will be placed: it just says they must be “deposited in a public access compliant repository designated by NSF”. The executive summary of the actual plan also refers only to “a designated repository”.

Only in the full 31-page plan itself does the detail emerge. From page 5:

In the initial implementation, NSF has identified the Department of Energy’s PAGES (Public Access Gateway for Energy and Science) system as its designated repository and will require NSF-funded authors to upload a copy of their journal articles or juried conference paper to the DOE PAGES repository in the PDF/A format, an open, non-proprietary standard (ISO 19005-1:2005). Either the final accepted version or the version of record may be submitted. NSF’s award terms already require authors to make available copies of publications to the Cognizant Program Officers as part of the current reporting requirements. As described more fully in Sections 7.8 and 8.2, NSF will extend the current reporting system to enable automated compliance.

Future expansions, described in Section 7.3.1, may provide additional repository services. The capabilities offered by the PAGES system may also be augmented by services offered by third parties.

So what is good and bad about this?

Good. It makes sense to me that they’re re-using an existing system rather than wasting resources and increasing fragmentation by building one of their own.

Bad. It’s a real shame that they mandate the use of PDF, “the hamburger that we want to turn back into a cow”. It’s a terrible format for automated analysis, greatly inferior to the JATS XML format used by PubMed Central. I don’t understand this decision at all.

Then on page 9:

In the initial implementation, NSF has identified the DOE PAGES system to support managing journal articles and juried conference papers. In the future, NSF may add additional partners and repository services in a federated system.

I’m not sure where this points. In an ideal world, it would mean some kind of unifying structure between PAGES and PubMed Central and whatever other repositories the various agencies decide to use.

Anyone else have thoughts?

Update from Peter Suber, later that day

Over on Google+, Peter Suber comments on this post. With his permission, I reproduce his observations here:

My short take on the policy’s weaknesses:

  • will use Dept of Energy PAGES, which at least for DOE is a dark archive pointing to live versions at publisher web sites
  • plans to use CHORUS (p. 13) in addition to DOE PAGES
  • requires PDF
  • silent on open licensing
  • only mentions reuse for data (pp. v, 18), not articles, and only says it will explore reuse
  • silent on reuse for articles even tho it has a license (p. 10) authorizing reuse
  • silent on the timing of deposits

I agree with you that a 12 month embargo is too long. But that’s the White House recommended default. So I blame the White House for this, not NSF.

To be more precise, PAGES favors publisher-controlled OA in one way, and CHORUS does it in another way. Both decisions show the effect of publisher lobbying on the NSF, and its preference for OA editions hosted by publishers, not OA editions hosted by sites independent of publishers.

So all in all, the NSF policy is much less impressive than I’d initially thought and hoped.

12 Responses to “Public access to results of NSF-funded research”

  1. David Wojick Says:

    I track the US Public Access program at http://insidepublicaccess.com/. Much of the NSF policy and program has yet to be defined, or even funded, so the plan is largely a discussion document. What it ultimately looks like may depend on how it is funded internally.

    DOE PAGES is designed to use CHORUS and Fred Dylla (father of CHORUS) says NSF intends to use it in the fullness of time. Even DOE is having trouble using it in these early stages. One technical challenge is communicating the publisher’s license terms at the article level. The Feds only claim a Federal use license to the accepted manuscript, which does not include reuse.

    NSF says it might also use SHARE down the road, which would mean pointing users to articles in institutional repositories. This is unlikely in my view because SHARE is not being designed to collect funder data in a systematic fashion, making it relatively useless for the Public Access program. SHARE began as a response to Public Access but has since wandered away, as it were, becoming instead a general notification service for so-called “research events.”

  2. Mike Taylor Says:

    Thanks, David, this is very helpful.

  3. David Wojick Says:

    My pleasure, Mike. Happy to answer questions about NSF or any of the Federal agency Public Access programs, no two of which are alike. Bit of a mess really, building a bunch of complex new systems with no new money to fund them.

  4. Mike Taylor Says:

    “Bit of a mess really, building a bunch of complex new systems with no new money to fund them.”

    Hence my most fundamental question about all this. Since PubMed Central is already there, why don’t they just buy a couple more servers, rename it GovMed Central or something, and have all the agencies use that?

  5. David Wojick Says:

    I coined the term PubFed Central for that solution. Cost is one reason and jealousy another, but the big one is that DOE decided that linking to the publishers was a better science policy. There was a big fight over this during the lengthy interagency working group process that preceded the OSTP memo.

    As things stand all the HHS Department health agencies are going to use PMC, plus NASA says it will. But the NASA plan has not been funded and PMC is expensive so time will tell. NSF has gone with DOE and together they fund most of the Federal physical and computer science research, making PAGES the portal for those fields. USDA is trying to build its own system, while DOD is adapting its DTIC system, but neither has been funded. Word has it that USDA will do a CHORUS pilot and DOD may too.

    This dance is a long way from over, in fact it is just beginning.

  6. Mike Taylor Says:

    How ridiculous.

    Thanks for the inside track, though.

  7. David Wojick Says:

    I predicted this mess when the OSTP memo first came out —
    http://scholarlykitchen.sspnet.org/2013/02/25/confusions-in-the-ostp-oa-policy-memo-three-monsters-and-a-gorilla/. Each agency is a kingdom.

    Mind you I prefer PAGES to PMC. I was a consultant to DOE during the interagency working group battles so helped formulate their policy.

  8. David Wojick Says:

    What exactly do you find ridiculous, Mike?

  9. Mike Taylor Says:

    The thing I find ridiculous is that a dozen agencies whose manuscript-archiving requirements are essentially identical can’t just all use the same archive.

  10. David Wojick Says:

    Why don’t all the universities use the same archive? For the same reasons. The agencies with big library and document collection systems are using those systems. Plus there are big policy differences among the agencies. OSTP could have funded a government-wide system but they chose to fund nothing, hence the mess.

