June 11, 2015
We as a community often ask ourselves how much it should cost to publish an open-access paper. (We know how much it does cost, roughly: typically $3000 with a legacy publisher, or an average of $900 with a born-open publisher, or nothing at all for many journals.)
We know that peer-review is essentially free to publishers, being donated free by scholars. We know that most handling editors also work for free or for peanuts. We know that hosting things on the Web is cheap (“publishing [in this sense] is just a button”).
Publishers have costs associated with rejecting manuscripts — checking that they’re by real people at real institutions, scanning for obvious pseudo-scholarship, etc. But let’s ignore those costs for now, as being primarily for the benefit of the publishers rather than the author. (When I pay a publisher an APC, they’re not serving me directly by running plagiarism checks.)
The tendency of many discussions I’ve been involved with has been that the main technical contribution of publishers is the process that is still, for historical reasons, known as “typesetting”: that is, the transformation of the manuscript from an opaque form like an MS-Word file (or indeed a stack of hand-written sheets) into a semantically rich representation such as JATS XML. From there, actual typesetting into HTML or a pretty PDF can be largely automated.
So: what does it cost to typeset a manuscript?
First data point: I have heard that Kaveh Bazargan’s River Valley Technologies (the typesetter that PeerJ and many more mainstream publishers use) charges between £3.50 and £9 per page, including XML, graphics, PDF generation and proof correction.
Second data point: in a Scholarly Kitchen post that Kent Anderson intended as a criticism of PubMed Central but which in fact makes a great case for what good value it provides, he quotes an email from Kent A. Smith, a former Deputy Director of the NLM:
Under the % basis I am using here $47 per article. John [Mullican, a program analyst at NCBI] and I looked at this yesterday and based the number on a sampling of a few months billings. It consists on the average of about $34-35 per tagged article plus $10-11 for Q/A plus administrative fees of $2-3, where applicable.
Using the quoted figure of $47 per PMC article and the £6.25 midpoint of River Valley’s range of per-page prices (= $9.68 per page), that would be consistent with typical PMC articles being a bit under five pages long. The true length is probably somewhat greater (maybe twice that or more), but this seems to be at least in the same ballpark.
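For anyone who wants to check that back-of-envelope sum, here is a minimal Python sketch. The per-article and per-page figures are the ones quoted above; the exchange rate is not stated anywhere in the sources, so it is simply backed out from the £6.25 and $9.68 figures and should be treated as an assumption. The variable names are mine.

```python
# Figures quoted in the post.
PMC_COST_PER_ARTICLE_USD = 47.0
RVT_LOW_GBP, RVT_HIGH_GBP = 3.50, 9.00
RVT_MIDPOINT_USD = 9.68  # the post's dollar value for the midpoint price

# Midpoint of River Valley's per-page range, in pounds.
midpoint_gbp = (RVT_LOW_GBP + RVT_HIGH_GBP) / 2

# Assumption: the USD-per-GBP rate implied by the two midpoint figures.
implied_usd_per_gbp = RVT_MIDPOINT_USD / midpoint_gbp

# Article length that would make the PMC figure match the midpoint price.
implied_pages = PMC_COST_PER_ARTICLE_USD / RVT_MIDPOINT_USD

# The same calculation at the two ends of River Valley's range.
pages_at_low_price = PMC_COST_PER_ARTICLE_USD / (RVT_LOW_GBP * implied_usd_per_gbp)
pages_at_high_price = PMC_COST_PER_ARTICLE_USD / (RVT_HIGH_GBP * implied_usd_per_gbp)

print(f"midpoint: GBP {midpoint_gbp:.2f} per page")
print(f"implied article length at midpoint: {implied_pages:.2f} pages")
print(f"implied length across the range: "
      f"{pages_at_high_price:.2f} to {pages_at_low_price:.2f} pages")
```

Across the whole of River Valley's quoted range, the implied length runs from a little over three pages to a little under nine, so the "same ballpark" reading does not depend much on picking the midpoint.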
Third data point: Charles H. E. Ault, in a comment on that Scholarly Kitchen post, wrote:
As a production director at a small-to-middling university press that publishes no journals, I’m a bit reluctant to jump into this fray. But I must say that I am astonished at how much PMC is paying for XML tagging. Most vendors looking for the small amount of business my press can offer (say, maybe 10,000 pages a year at most) charge considerably less than $0.50 per page for XML tagging. Assuming a journal article is about 30 pages long, it should cost no more than $15 for XML tagging. Add another few bucks for quality assurance, and you might cross the $20 threshold. Does PMC have to pay a federally mandated minimum rate, like bridge construction projects? Where can I submit a bid?
I find the idea of 50-cent-per-page typesetting hard to swallow — it’s more than an order of magnitude cheaper than the River Valley/PMC level, and I’d like to know more about Ault’s operation. Is what they’re doing really comparable with what the others are doing?
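To put rough numbers on that gap, here is a small Python sketch using only figures quoted above. Ault says "considerably less than $0.50" per page, so the $0.50 used here is an upper bound on his price, and the 30-page article length is his assumption, not a measured value. The variable names are mine.

```python
import math

# Figures quoted in the post and in Ault's comment.
AULT_PER_PAGE_USD = 0.50      # upper bound: "considerably less than $0.50"
AULT_PAGES_PER_ARTICLE = 30   # Ault's assumed article length
RVT_MIDPOINT_USD = 9.68       # River Valley midpoint per page, in dollars
PMC_PER_ARTICLE_USD = 47.0    # PMC cost per article

# Ault's per-article tagging cost on his own assumptions.
ault_cost_per_article = AULT_PER_PAGE_USD * AULT_PAGES_PER_ARTICLE

# How many times dearer the River Valley midpoint is, per page.
per_page_ratio = RVT_MIDPOINT_USD / AULT_PER_PAGE_USD
orders_of_magnitude = math.log10(per_page_ratio)

# What PMC's figure would come to per page IF its articles were 30 pages long.
pmc_per_page_if_30_pages = PMC_PER_ARTICLE_USD / AULT_PAGES_PER_ARTICLE

print(f"Ault per article: ${ault_cost_per_article:.2f}")
print(f"River Valley / Ault per page: {per_page_ratio:.1f}x "
      f"({orders_of_magnitude:.2f} orders of magnitude)")
print(f"PMC per page at 30 pages: ${pmc_per_page_if_30_pages:.2f}")
```

The per-page gap against River Valley comes out at roughly 19-fold. But if PMC articles really did run to Ault's 30 pages, the $47 figure would be about $1.57 per page, only around three times Ault's rate. So a lot hangs on typical article length, which none of the sources actually state.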
Are there other estimates out there?
May 26, 2015
Provoked by Mike Eisen’s post today, The inevitable failure of parasitic green open access, I want to briefly lay out the possible futures of scholarly publishing as I see them. There are two: one based on what we now think of as Gold OA, and one on what we think of as Green OA.
Eisen is of course quite right that the legacy publishers only ever gave their blessing to Green OA (self-archiving) so long as they didn’t see it as a threat, so the end of that blessing isn’t a surprise. (I think it’s this observation that Richard Poynder misread as “an OA advocate legitimising Elsevier’s action”!) It was inevitable that this blessing would be withdrawn as Green started to become more mainstream — and that’s exactly what we’ve seen, with Elsevier responding to the global growth in Green OA mandates with a regressive new policy that has rightly attracted the ire of everyone who’s looked closely at it.
So I agree with him that what he terms “parasitic Green OA” — self-archiving alongside the established journal system — is ultimately doomed. The bottom line is that while we as a community continue to give control of our work to the legacy publishers — follow closely here — legacy publishers will control our work. We know that these corporations’ interests are directly opposed to those of authors, science, customers, libraries, and indeed everyone but themselves. So leaving them in control of the scholarly record is unacceptable.
What are our possible futures?
We may find that in ten years’ time, all subscription journals are gone (perhaps except for a handful of boutique journals that a few people like, just as a few people prefer the sound of vinyl over CDs or MP3s).
We may find that essentially all new scholarship is published in open-access journals such as those of BioMed Central, PLOS, Frontiers and PeerJ. That is a financially sustainable path, in that publishers will be paid for the services they provide through APCs. (No doubt, volunteer-run and subsidised zero-APC journals will continue to thrive alongside them, as they do today.)
We may even find that some of the Gold OA journals of the future are run by organisations that are presently barrier-based publishers. I don’t think it’s impossible that some rump of Elsevier, Springer et al. will survive the coming subscription-journals crash, and go on to compete on the level playing-field of Gold OA publishing. (I think they will struggle to compete, and certainly won’t be able to make anything like the kind of money they do now, but that’s OK.)
This is the Gold-OA future that Mike Eisen is pinning his hopes on — and which he has done as much as anyone alive to bring into existence. I would be very happy with that outcome.
While I agree with Eisen that what he terms “parasitic Green” can’t last — legacy publishers will stamp down on it as soon as it starts to be truly useful — I do think there is a possible Green-based future. It just doesn’t involve traditional journals.
One of the striking things about the Royal Society’s recent Future of Scholarly Scientific Communication meetings was that during the day-two breakout session, so many of the groups independently came up with more or less the same proposal. The system that Dorothy Bishop expounded in the Guardian after the second meeting is also pretty similar — and since she wasn’t at the first meeting, I have to conclude that she also came up with it independently, further corroborating the sense that it’s an approach whose time has come.
(In fact, I started drafting an SV-POW! post myself at that meeting describing the system that our break-out group came up with. But that was before all the other groups revealed their proposals, and it became apparent that ours was part of a blizzard, rather than a unique snowflake.)
Here are the features characterising the various systems that people came up with. (Not all of these features were in all versions of the system, but they all cropped up more than once.)
- It’s based around a preprint archive: as with arXiv, authors can publish manuscripts there after only basic editorial checks: is this a legitimate attempt at scholarship, rather than spam or a political opinion?
- Authors solicit reviews, as we did for our Barosaurus preprint, and interested others can offer unsolicited reviews.
- Reviewers assign numeric scores to manuscripts as well as giving opinions in prose.
- The weight given to review scores is affected by the reputation of reviewers.
- The reputation of reviewers is affected by other users’ judgements about their comments, and also by their reputation as authors.
- A stable user reputation emerges using a pagerank-like feedback algorithm.
- Users can acquire reputation by authoring, reviewing or both.
- Manuscripts have a reputation based on their score.
- There is no single moment of certification, when a manuscript is awarded a “this is now peer-reviewed” bit.
I think it’s very possible that, instead of the all-Gold future outlined above, we’ll land up with something like this. Not every detail will work out the way I suggested here, of course, but we may well get something along these lines, where the emphasis is on very rapid initial publication and continuously acquired reputation, and not on a mythical and misleading “this paper is peer-reviewed” stamp.
(There are a hundred questions to be asked and answered about such systems: do we want one big system, or a network of many? If the latter, how will they share reputation data? How will the page-rank-like reputation algorithm work? Will it need to be different in different fields of scholarship? I don’t want to get sidetracked by such issues at this point, but I do want to acknowledge that they exist.)
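To make the "pagerank-like feedback algorithm" in the feature list a little more concrete, here is a minimal Python sketch of one way it could work. Nothing here comes from the meeting proposals themselves: the endorsement graph, the function names, the damping factor of 0.85 and the idea of weighting review scores by reviewer reputation in a simple average are all assumptions made purely for illustration.

```python
def reputations(endorsements, damping=0.85, iterations=100):
    """Pagerank-style reputation (hypothetical scheme).

    endorsements[a][b] is how strongly user a rates user b's papers
    and reviews. Reputation flows from endorser to endorsed.
    """
    users = sorted(set(endorsements) |
                   {b for out in endorsements.values() for b in out})
    n = len(users)
    rep = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Everyone gets a small baseline, as in PageRank's random jump.
        new = {u: (1.0 - damping) / n for u in users}
        for a in users:
            out = endorsements.get(a, {})
            total = sum(out.values())
            if total == 0:
                # A user who endorses nobody spreads their weight evenly.
                for u in users:
                    new[u] += damping * rep[a] / n
            else:
                for b, weight in out.items():
                    new[b] += damping * rep[a] * weight / total
        rep = new
    return rep


def manuscript_score(reviews, rep):
    """Reputation-weighted mean of (reviewer, numeric score) pairs."""
    total_weight = sum(rep[reviewer] for reviewer, _ in reviews)
    return sum(rep[reviewer] * score for reviewer, score in reviews) / total_weight


# Toy data: alice and bob endorse each other; carol endorses only bob.
endorsements = {
    "alice": {"bob": 1.0},
    "bob": {"alice": 1.0},
    "carol": {"bob": 1.0},
}
rep = reputations(endorsements)
score = manuscript_score([("bob", 8.0), ("carol", 2.0)], rep)
print(rep)
print(score)
```

In this toy example bob ends up with the highest reputation and carol, whom nobody endorses, with the lowest, so the manuscript's score lands much nearer bob's 8 than the unweighted mean of 5. A real system would have to deal with gaming, field differences and sharing reputation between archives, none of which this sketch attempts.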
Is this “Green open access”? It’s not what we usually mean by the term; but in as much as it’s about scholars depositing their own work in archives, yes, it’s Green OA in a broader sense.
(I think some confusion arises because we’ve got into the habit of calling deposited manuscripts “preprints”. That’s a misnomer on two counts: they’re not printed, and they needn’t be pre-anything. Manuscripts in arXiv may go on to be published in journals, but that’s not necessary for them to be useful in advancing scholarship.)
So where now? We have two possible open-access futures, one based on open-access publishing and one based on open-access self-archiving. For myself, I would be perfectly happy with either of these futures — I’m not particularly clear in my own mind which is best, but they’re both enormously better than what we have today.
A case can be made that the Green-based future is maybe a better place to arrive, but that the Gold-based future makes for an easier transition. It doesn’t require researchers to do anything fundamentally different from what they do today, only to do it in open-access journals; whereas the workflow in the Green-based approach outlined above would be a more radical departure. (Ironically, this is the opposite of what has often been said in the past: that the advantage of Green is that it offers a more painless upgrade path for researchers not sold on the importance of OA. That’s only true so long as Green is, in Eisen’s terms, “parasitic” — that is, so long as the repositories contain only second-class versions of papers that have been published conventionally behind paywalls.)
In my own open-access advocacy, then, I’m always unsure whether to push Gold or Green. In my Richard Poynder interview, when asked “What should be the respective roles of Green and Gold OA?” I replied:
This actually isn’t an issue that I get very excited about: Open is so much more important than Green or Gold. I suppose I slightly prefer Gold in that it’s better to have one single definitive version of each article; but then we could do that with Green as well if only we’d stop thinking of it as a stopgap solution while the “real” article remains behind paywalls.
Two and a half years on, I pretty much stand by that (and also by the caveats regarding the RCUK policy’s handling of Gold and Green that followed this quote in the interview).
But I’m increasingly persuaded that the variety of Green OA that we only get by the grace and favour of the legacy publishers is not a viable long-term strategy. Elsevier’s new regressive policy was always going to come along eventually, and it won’t be the last shot fired in this war. If Green is going to win the world, it will be by pulling away from conventional journals and establishing itself as a valid mode of publication in its own right. (Again, much as arXiv has done.)
Here’s my concern, though. Paul Royster’s response to Eisen’s post was “Distressing to see the tone and rancor of OA advocates in disagreement. My IR is a “parasite”? Really?” Now, I think that comment was based on a misunderstanding of Eisen’s post (and maybe only on reading the title) but the very fact that such a misunderstanding was possible should give us pause.
Richard Poynder’s reading later in the same thread was also cautionary: “Elsevier will hope that the push back will get side-tracked by in-fighting … I think it will take comfort if the OA movement starts in-fighting instead of pushing back.”
Folks, let’s not fall for that.
We all know that Stevan Harnad, among many others, is committed to Green; and that Mike Eisen, among many others, has a huge investment in Gold. We can, and should, have rigorous discussions about the strengths and weaknesses of both approaches. We should expect that OA advocates who share the same goal but have different backgrounds will differ over tactics, and sometimes differ robustly.
But there’s a world of difference between differing robustly and differing rancorously. Let’s all (me included) be sure we stay on the right side of that line. Let’s keep it clear in our minds who the enemy is: not people who want to use a different strategy to free scholarship, but those who want to keep it locked up.
And here ends my uncharacteristic attempt to position myself as The Nice, Reasonable One in this discussion — a role much better suited to Peter Suber or Stephen Curry, but it looks like I got mine written first :-)
May 19, 2015
Somehow this seems to have slipped under the radar: National Science Foundation announces plan for comprehensive public access to research results. They put it up on 18 March, two whole months ago, so our apologies for not having said anything until now!
This is the NSF’s rather belated response to the OSTP memo on Open Access, back in January 2013. This memo required all Federal agencies that spend $100 million in research and development each year to develop OA policies, broadly in line with the existing one of the NIH which gave us PubMed Central. Various agencies have been turning up with policies, but for those of us in palaeo, the NSF’s the big one — I imagine it funds more palaeo research than all the others put together.
So far, so awesome. But what exactly is the new policy? The press release says papers must “be deposited in a public access compliant repository and be available for download, reading and analysis within one year of publication”, but says nothing about what repository should be used. It’s lamentable that a full year’s embargo has been allowed, but at least the publishers’ CHORUS land-grab hasn’t been allowed to hobble the whole thing.
There’s a bit more detail here, but again it’s oddly coy about where the open-access works will be placed: it just says they must be “deposited in a public access compliant repository designated by NSF”. The executive summary of the actual plan also refers only to “a designated repository”.
Only in the full 31-page plan itself does the detail emerge. From page 5:
In the initial implementation, NSF has identified the Department of Energy’s PAGES (Public Access Gateway for Energy and Science) system as its designated repository and will require NSF-funded authors to upload a copy of their journal articles or juried conference paper to the DOE PAGES repository in the PDF/A format, an open, non-proprietary standard (ISO 19005-1:2005). Either the final accepted version or the version of record may be submitted. NSF’s award terms already require authors to make available copies of publications to the Cognizant Program Officers as part of the current reporting requirements. As described more fully in Sections 7.8 and 8.2, NSF will extend the current reporting system to enable automated compliance.
Future expansions, described in Section 7.3.1, may provide additional repository services. The capabilities offered by the PAGES system may also be augmented by services offered by third parties.
So what is good and bad about this?
Good. It makes sense to me that they’re re-using an existing system rather than wasting resources and increasing fragmentation by building one of their own.
Bad. It’s a real shame that they mandate the use of PDF, “the hamburger that we want to turn back into a cow”. It’s a terrible format for automated analysis, greatly inferior to the JATS XML format used by PubMed Central. I don’t understand this decision at all.
Then on page 9:
In the initial implementation, NSF has identified the DOE PAGES system to support managing journal articles and juried conference papers. In the future, NSF may add additional partners and repository services in a federated system.
I’m not sure where this points. In an ideal world, it would mean some kind of unifying structure between PAGES and PubMed Central and whatever other repositories the various agencies decide to use.
Anyone else have thoughts?
Over on Google+, Peter Suber comments on this post. With his permission, I reproduce his observations here:
My short take on the policy’s weaknesses:
- will use Dept of Energy PAGES, which at least for DOE is a dark archive pointing to live versions at publisher web sites
- plans to use CHORUS (p. 13) in addition to DOE PAGES
- requires PDF
- silent on open licensing
- only mentions reuse for data (pp. v, 18), not articles, and only says it will explore reuse
- silent on reuse for articles even tho it has a license (p. 10) authorizing reuse
- silent on the timing of deposits
I agree with you that a 12 month embargo is too long. But that’s the White House recommended default. So I blame the White House for this, not NSF.
To be more precise, PAGES favors publisher-controlled OA in one way, and CHORUS does it in another way. Both decisions show the effect of publisher lobbying on the NSF, and its preference for OA editions hosted by publishers, not OA editions hosted by sites independent of publishers.
So all in all, the NSF policy is much less impressive than I’d initially thought and hoped.
In response to my post Copyright from the lens of reality and other rebuttals of his original post, Elsevier’s General Counsel Mark Seeley has provided a lengthy comment. Here’s my response (also posted as a comment on the original article, but I’m waiting for it to be moderated.)
Hi, Mark, thanks for engaging. You write:
With respect to the societal bargain, I would simply note that, in my view, the framers believed that by providing rights they would encourage creative works, and that this benefits society as a whole.
Here, at least, we are in complete agreement. Where we part company is that in my view the Eldred v. Ashcroft decision (essentially that copyright terms can be increased indefinitely) was a travesty of the original intent of copyright, and clearly intended for the benefit of copyright holders rather than that of society in general. (I further note in passing that those copyright holders are only rarely the creative people, but rights-holding corporations whose creative contribution is negligible.)
[Journal] services and competencies need to be supported through a business model, however, and in the mixed economy that we have at the moment, this means that many journals will continue to need subscription and purchase models.
This is a circular argument. It comes down to “we use restrictive copyright on scholarly works at present, so we therefore need to continue to do so”. In fact, this is not an argument at all, merely an assertion. If you want it to stick, you need to demonstrate that the present “mixed economy” is a good thing, something that is very far from evident.
The alternatives to a sound business model rooted in copyright are in my view unsustainable. I worry about government funding, patronage from foundations, or funding by selling t-shirts—I am not sure that these are viable, consistent or durable. Governments and foundations can change their priorities, for example.
If governments and foundations decide to stop funding research, we’re all screwed, and retention of copyright on the papers we’re no longer able to research and write will be the least of our problems. The reality is that virtually everyone in research is already dependent on governments and foundations for the 99% of their funding that covers all the work before the final step of publication. Taking the additional step of relying on those same sources for the last 1% of funding is eminently sensible.
On Creative Commons licences, I don’t think we have any material disagreement.
Now we come to the crucial question of copyright terms (already alluded to via Eldred v. Ashcroft above). You contend:
Copyright law was most likely an important spur for the author or publisher to produce and distribute the work [that is now in the public domain] in the first place.
In principle, I agree, as of course did the framers of the US Constitution and other lawmakers that have passed copyright laws. But as you will well know, the US’s original copyright act of 1790, which stated its purpose as “encouragement of learning”, offered a term of 14 years, with an optional renewal of a further 14 years if the author was still alive at the end of the initial term. This 14-year term was considered quite sufficient to incentivise the creation of new works. The intent of the present law seems to be that authors who have been dead for 70 years still need to receive royalties for their works, and in the absence of such royalties would not have created in the first place. This is self-evident nonsense. No author in the history of the world ever said “I would have written a novel if I’d continued to receive royalties until 70 years after my death, but since royalties will only last 28 years I’m not going to bother”.
But — and this can’t be stated strongly enough — even if there were some justification for the present ridiculous copyright terms in the area of creative works, it would still say nothing whatsoever about the need to copyright scientific writing. No scientific researcher ever wrote a paper who would not have written it in the absence of copyright. That’s what we’re talking about here. One of the tragedies of copyright is that it’s been extruded from a domain where it has some legitimate purpose into a domain where it has none.
The Budapest Open Access Initiative said it best and most clearly: “the only role for copyright in this domain [scholarly research] should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited”. (And several of the BOAI signatories have expressed regret over even the controlling-integrity-of-the-work part of this.)
See also David Roberts’ response to Seeley’s posting.
May 7, 2015
This post is a response to Copyright from the lens of a lawyer (and poet), posted a couple of days ago by Elsevier’s General Counsel, Mark Seeley. Yes, I am a slave to SIWOTI syndrome. No, I shouldn’t be wasting my time responding to this. Yes, I ought to be working on that exciting new manuscript that we SV-POW!er Rangers have up and running. But but but … I can’t just let this go.
Copyright from the lens of a lawyer (and poet) is a defence of Elsevier’s practice of having copyright encumber scientific publishing. I tried to read it in the name of fairness. It didn’t go well. The very first sentence is wrong:
It is often said that copyright law is about a balance of interests and communities, creators and users, and ultimately society as a whole.
No. Copyright is not a balance between competing interests; it’s a bargain that society makes. We, the people, give up some rights in exchange for incentivising creative people to make new work, because that new work is of value to society. To quote the US constitution’s helpful clause, copyrights exist “To promote the Progress of Science and useful Arts”: not for authors, but for wider society. And certainly not for publishers who coerce authors to donate copyright!
(To be fair to Seeley, he did hedge by writing “It is often said that copyright law is about a balance”. That is technically true. It is often said; it’s just wrong.)
Well, that’s three paragraphs on the first sentence of Elsevier’s defence of copyright. I suppose I’d better move on.
The STM journal publishing sector is constantly adjusting to find the right balance between researcher needs and the journal business model, as refracted through copyright.
Wrong wrong wrong. We don’t look for a balance between researchers’ needs (i.e. science) and the journal business model. Journals are there to serve science. That’s what they’re for.
Then we have the quote from Mark Fischer:
I submit that society benefits when the best creative spirits can be full-time creators and not part-timers doing whatever else (other than writing, composing, painting, etc.) they have to do to pay the rent.
This may be true. But it is totally irrelevant to scholarly copyright. That should hardly need pointing out, but here it is for those hard of thinking. Scholars make no money from the copyright in the work they do, because (under the Elsevier model) they hand that copyright over to the publisher. Their living comes in the form of grants and salaries, not royalties.
Ready for the next one?
The alternatives to a copyright-based market for published works and other creative works are based on near-medieval concepts of patronage, government subsidy […]
Woah! Governments subsidising research and publication is “near-medieval”? And there we were thinking it was by far the most widespread model. Silly us. We were all near-medieval all this time.
Someone please tell me this is a joke.
Moving swiftly on …
Loud advocates for “copyright reform” suggest that the copyright industries have too much power […] My comparatively contrarian view is that this ignores the enormous creative efforts and societal benefits that arise from authoring and producing the original creative work in the first place: works that identify and enable key scientific discoveries, medical treatments, profound insights, and emotionally powerful narratives and musical experiences.
Wait, wait. Are we now saying that … uh, the only reason we get scientific discoveries and medical treatment is because … er … because of copyright? Is that it? That can’t be it. Can it?
Copyright has no role in enabling this. None.
In fact, it’s worse than that. The only role of copyright in modern scholarly publishing is to prevent societal benefits arising from scientific and medical research.
The article then wanders off into an (admittedly interesting) history of Seeley’s background as a poet, and as a publisher of literary magazines. The conclusion of this section is:
Of course creators and scientists want visibility […] At the very least, they’d like to see some benefit and support from their work. Copyright law is a way of helping make that happen.
This article continues to baffle. The argument, if you want to dignify it with that name, seems to be:
- poets like copyright
- => we copyright other people’s science
- => … profit!
Well, that was incoherent. But never mind: finally we come to part of the article that makes sense:
- There is the “idea-expression” dichotomy — that copyright protects expression but not the fundamental ideas expressed in a copyright work.
This is correct, of course. That shouldn’t be cause for comment, coming from a copyright lawyer, but the point needs to be made because the last time an Elsevier lawyer blogged, she confused plagiarism with copyright violation. So in that respect, this new blog is a step forward.
But then the article takes a sudden left turn:
The question of the appropriateness of copyright, or “authors’ rights,” in the academic field, particularly with respect to research journal articles, is sometimes controversial. In a way quite similar to poets, avant-garde literary writers and, for that matter, legal scholars, research academics do not rely directly on income from their journal article publishing.
Er, wait, what? So you admit that scholarly authors do not benefit from copyright in their articles? We all agree, then, do we? Then … what was the first half of the article supposed to be about?
And in light of this, what on earth are we to make of this:
There is sometimes a simplistic “repugnance” about the core publishing concept that journal publishers request rights from authors and in return sell or license those rights to journal subscribers or article purchasers.
Seeley got that much right! (Apart from the mystifyingly snide use of “simplistic” and the inexplicable scare-quotes.) The question is why he considers this remotely surprising. Why would anyone not find such a system repugnant? (That was a rhetorical question, but here’s the answer anyway: because they make a massive profit from it. That is the only reason.)
Well, we’re into the final stretch. The last paragraph:
Some of the criticism of the involvement of commercial publishing and academic research is simply prejudice, in my view;
Yes. Some of us are irrationally prejudiced against a system where, having laboriously created new knowledge, it’s then locked up behind a paywall. It’s like the irrational prejudice some coal-miners have against the idea of the coal they dig up being immediately buried again.
And finally, this:
Some members of the academic community […] base their criticism on idealism.
Isn’t that odd? I have never understood why some people consider “idealism” to be a criticism. I accept it as high praise. People who are not idealists have nothing to base their pragmatism on. They are pragmatic, sure, but to what end?
So what are we left with? What is Seeley’s article actually about? It’s very hard to pick out a coherent thread. If there is one, it seems to be this: copyright is helpful for some artists, so it follows that scholarly authors should donate their copyright to for-profit publishers. That is a consequence that, to my mind, does not follow particularly naturally from the hypothesis.
[Today’s live-blog is brought to you by Yvonne Nobis, science librarian at Cambridge, UK. Thanks, Yvonne! — Mike.]
Session 1 — The Journal Article: is the end in sight?
Slightly late start due to trains – !
Just arrived to hear Aileen Fyfe (University of St Andrews) saying that something similar to journal articles will be needed for ‘quite some time’.
Steven Hall, IOP.
The article still fulfils its primary role — the registration, dissemination, certification and archiving of scholarly information. The Journal Article still provides a fixed point — and researchers still see the article as a critical part of research — although it is now evolving into something much more fluid.
Steve then outlined some of the initiatives that IOP have implemented. Examples include the development of thesauri — every article is ‘semantically fingerprinted’. No particular claims are made for IOP innovation — some are broad industry initiatives — but demonstrate how the journal article has evolved.
(Personal bias: as a librarian I like the IOP journal and ebook offering!) IOP have worked with RIN on a study of researcher behaviour in the physical sciences, looking at the impact of new technology on researchers. Primary conclusion: researchers in the physical sciences are conservative and, oddly, see the journal article as the most important method of communicating research. (This seems at odds with use of arXiv?)
Mike Brady discusses the ‘floribunda’ of the 19th century scholarly publishing environment.
Sally Shuttleworth (Oxford) questions the move from the gentleman scholar to the publishing machinery of the 21st century and wonders if there will be a resurgence due to citizen science?
Tim Smith (CERN) proposes that change is being technologically driven.
Stuart Taylor (Royal Society publishing) agrees with Steve that there is a disconnect between reality and outlandish speculations about what should be in place, and the ‘bells and whistles’ that publishers are adding into the mix that are not used.
Cameron Neylon: the web gives us the ability to separate content from display, and this gives us a huge opportunity. Many of us in this room did predict the death of the article several years ago … (This was premature!)
Herman Hauser makes the valid point that it is well nigh impossible for a researcher now to understand the breadth of a whole field.
Ginny Barbour raises the question of incentives (the article still being the accepted de facto standard). The point was also raised that perhaps this meeting should be repeated with an audience 30 years younger…
No panel comment on this point. However, I fear what many would say is that this meeting represents the apex of a pyramid, where these discussions have occurred for years in other conferences (for example, the various science online and force meetings) and have driven both innovation (novel publishing models) and the creation of tools.
I asked (predictably enough) about use of arXiv, and was slightly surprised at the response to the RIN study.
Steve Hall: ‘science publishers are service providers’ — if scientific communities become clear about what they want, we can provide such services — but coherent thinking needs to underwrite this. Steve also questions the incentives put in place for researchers to publish in certain high impact journals and how this is damaging.
Steve Hall: arXiv won’t allow publishers on their governing bodies, and interestingly librarians (take note!) should be engaging with the storage of the data!
Aileen, in conclusion, asks how the plurality of modes of communication we had in the 18th and 19th centuries got closed down to purely journals. The issue of learned societies and their relationship with commercial agencies is often a cause for concern…
Session 2 How might scientists communicate in the future?
The role of the speakers is to catalyse discussion amongst ourselves…
Anita de Waard (Elsevier)
350 years ago science was an individual enterprise. Although there are now many large collaborations, much scientific discussion still takes place at a peer-to-peer level.
How do we unify the needs of the collective and individual scientists?
We need to create the systems of knowledge management that work for scientists, publishers and librarians.
Quotes John Perry Barlow: ‘Let us endeavour to build systems that allow a kid in Mali who wants to learn about proteomics to not be overwhelmed by the irrelevant and the untrue’ (It would be cruel to mention various issues with the Journal of Proteomics last year…)
The problem is that the paper is the overarching modus operandi. Citations to data are often citations to pictures. We need better ways of citing and connecting knowledge. ‘Papers are stories that persuade with data’, says Anita. She argues we need better ways of citing claims, and constructing chains of evidence that can be traced to their source.
For this we need tools and to build habits of citing evidence into all aspects of our educational system (starting at kindergarten)!
Another problem is that data cannot be found or integrated (this in my view is something that the academic community should be tackling, not out-sourcing, which is the way I see this going…)
An understanding needs to evolve that science is a collective endeavour.
Anita is now covering scientific software (‘scientific software sucks’ is the quote attributed to Ben Goldacre yesterday) — it compares unfavourably to Amazon … not sure how true this is?
Anita is very dismissive of scientific software as inadequate, though often code is written for a particular purpose. (My view is that this is not something that can easily be commercially outsourced. High energy physics, anyone?)
Mark Hahnel, FigShare
(FigShare was built as a way for Mark to curate/publish his own research.)
Mark opens with policies from different funders (at Cambridge we are feeling the effect of these already) for data mandates — especially EPSRC: all digital outputs from funded research now must be made available.
Mark talks around the Open Academic Tidal Wave — sorry not a great link but the only one I can find (thanks Lou Woodley): and we are at level 4 of this.
Mark surveyed publishers about what they see as the future of publishing in 2020, and they replied ‘Version control on papers, data incorporated within the article’, but the technology is there already; he uses the example of F1000 Research.
Mike Brady: It’s as well Imelda Marcos was not a scientist — following on from Anita’s claims that software for buying shoes is more fit for purpose than scientific software!
Herman Hauser: willing to fund things that help with an ‘evidence engine’ to avoid repeats of the MMR fiasco!
David Colquhoun: science is not the same as buying shoes! Refreshingly cynical.
Wendy Hall stresses the importance of linking information — every publisher should have a semantically linked website (and on the science of buying shoes).
Comment from the floor: Getting more data into repositories may not be exciting but is essential. Mark agrees — once the data is there you can do things with it, such as building apps to extract what you need.
Richard Sever (Cold Spring Harbor Laboratory Press) with a great quote: “The best way to store genomic data is in DNA.”
Mike Taylor: when we discuss how data is associated with papers we must ensure that this is ‘open’, this includes the APIs, to avoid repeating the ‘walled garden of silos’ in which we find ourselves now.
Question of electronic access in the future (Dave Garner) — how do we future-proof science? Very valid — we can’t access material from 1980s floppy disks!
Anita: data is entwined with software and we need to preserve these executable components. The issues return to citation, data citation and incentives again, which have been a pervasive theme over the last couple of days.
Cameron Neylon: we need to move to a situation where we can publish data itself, and this can be an incremental process, not the current binary ‘publish or not publish’ situation (which of course comes back to incentives).
In summary, Mark questions timescales, and Anita wonders how the Royal Society can bring these topics to the world?
Time for lunch, and now over to Matthew Dovey to continue this afternoon (alongside Steven Hall, another of my former colleagues)!