Live-blog: the Future of Scholarly Scientific Communication, part 2
May 5, 2015
I’ll try to live-blog the first day of part 2 of the Royal Society’s Future of Scholarly Scientific Communication meeting, as I did for the first day of part 1. We’ll see how it goes.
Session 1: the reproducibility problem
Chair: Alex Halliday, vice-president of the Royal Society
Introduction to reproducibility. What it means, how to achieve it, what role funding organisations and publishers might play.
For an introduction/overview, see #FSSC – The role of openness and publishers in reproducible research.
Michele Dougherty, planetary scientist
It’s very humbling being at this meeting, when it’s so full of people who have done astonishing things. For example, Dougherty discovered an atmosphere around one of Saturn’s moons by an innovative use of magnetic field data. So many awesome people.
Her work largely involves very long-term projects based on planetary probes, e.g. the Cassini-Huygens probe. It’s going to be interesting to hear what can be said about reproducibility of experiments that take decades and cost billions.
“The best science output you can obtain is as a result of collaboration with lots of different teams.”
Application of reproducibility here is about making the data from the probes available to the scientific community — and the general public — so that the result of analysis can be reproduced. So not experimental replication.
Such data often has a proprietary period (essentially an embargo) before its public release, partly because it’s taken 20 years to obtain and the team that did this should get the first crack at it. But it all has to be made publicly available.
Dorothy Bishop, chair of Academy of Medical Sciences group on replicability
The Royal Society is very much not the first to be talking about replicability — these discussions have been going on for years.
About 50% of studies in Bishop’s field are capable of replication. Numbers are even worse in some fields. Replication of drug trials is particularly important, as false results kill people.
Journals cause awful problems with impact-chasing: e.g. high-impact journals will publish sexy-looking autism studies with tiny samples, which no reputable medical journal would publish.
Statistical illiteracy is very widespread. Authors can give the impression of being statistically aware but in a superficial way.
Too much HARKing going on (Hypothesising After Results Known — searching a dataset for anything that looks statistically significant in the shallow p < 0.05 sense.)
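[An aside on why HARKing is so treacherous: if you trawl a dataset testing many hypotheses, the chance that *something* clears the p < 0.05 bar by luck alone grows quickly. A minimal sketch of the arithmetic (the function name is mine, for illustration):]

```python
# Probability of at least one false positive when testing k independent
# hypotheses on pure noise, each at the p < 0.05 significance threshold.
def familywise_false_positive_rate(k, alpha=0.05):
    # Each test independently avoids a false positive with probability
    # (1 - alpha); the chance that at least one slips through is the
    # complement of all k tests staying clean.
    return 1 - (1 - alpha) ** k

for k in (1, 5, 20):
    print(k, round(familywise_false_positive_rate(k), 3))
# 1  -> 0.05
# 5  -> 0.226
# 20 -> 0.642
```

So a researcher who quietly tests twenty things on noise has roughly a 64% chance of finding a "significant" result to report.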
“It’s just assumed that people doing research know what they are doing. Often that’s just not the case.”
… many more criticisms of how the journal system encourages bad research. They’re coming much faster than I can type them. This is a storming talk; I hope a recording will be made available.
Employers are also to blame for prioritising expensive research proposals (= large grants) over good ones.
All of this causes non-replicable science.
Lots of great stuff here that I just can’t capture, sorry. Best follow the tweet stream for the fast-moving stuff.
One highlight: Pat Brown thinks it’s not necessarily a problem if lots of statistically underpowered studies are performed, so long as they’re recognised as such. Dorothy Bishop politely but emphatically disagrees: they waste resources, and produce results that are not merely useless but actively wrong and harmful.
David Colhoun comments from the floor: while physical sciences consider “significant results” to be five sigmas (p < 0.000001), biomed is satisfied with slightly less than two sigmas (p < 0.05) which really should be interpreted only as “worth another look”.
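[For readers who want to check the sigma-to-p conversion behind that comment, here is a minimal sketch using only the standard library. Note the physics "five sigma" discovery convention is usually one-tailed, which gives about 2.9 × 10⁻⁷, comfortably under the 10⁻⁶ quoted above; two sigmas two-tailed gives the familiar ≈0.046:]

```python
import math

def one_tailed_p(sigma):
    # Probability of a Gaussian result at least `sigma` standard
    # deviations above the mean.
    return 0.5 * math.erfc(sigma / math.sqrt(2))

def two_tailed_p(sigma):
    # Probability of a result at least `sigma` standard deviations
    # from the mean in either direction.
    return math.erfc(sigma / math.sqrt(2))

for s in (2, 3, 5):
    print(f"{s} sigma: one-tailed p = {one_tailed_p(s):.2e}, "
          f"two-tailed p = {two_tailed_p(s):.2e}")
```

Running this shows that two sigmas corresponds to p ≈ 0.046 (two-tailed), barely under the biomedical 0.05 threshold, while five sigmas is about five orders of magnitude stricter.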
Dorothy Bishop on publishing data, and authors’ reluctance to do so: “It should be accepted as a cultural norm that mistakes in data do happen, rather than shaming people who make data open.”
Nothing to report :-)
Session 2: what can be done to improve reproducibility?
Iain Hrynaszkiewicz, head of data, Nature
In an analysis of retractions of papers in PubMed Central, 2/3 were due to fraud and 20% due to error.
Access to methods and data is a prerequisite for replicability.
Pre-registration, sharing of data, reporting guidelines all help.
“Open access is important, but it’s only part of the solution. Openness is a means to an end.”
Hrynaszkiewicz says text-miners are a small minority of researchers. [That is true now, but I and others are confident this will change rapidly as the legal and technical barriers are removed: it has to, since automated reading is the only real solution to the problem of keeping up with an exponentially growing literature. — Ed.]