The R2R debate, part 1: opening statement in support

February 27, 2020

This Monday and Tuesday, I was at the R2R (Researcher to Reader) conference at BMA House in London. It’s the first time I’ve been to this, and I was there at the invitation of my old sparring partner Rick Anderson, who was organizing this year’s debate, on the proposition “The venue of its publication tells us nothing useful about the quality of a paper”.

I was one half of the team arguing in favour of the proposition, along with Toby Green, currently managing director at Coherent Digital and previously head of publishing at the OECD for twenty years. Our opponents were Pippa Smart, publishing consultant and editor of Learned Publishing; and Niall Boyce, editor of The Lancet Psychiatry.

I’m going to blog three of the four statements that were made. (The fourth, that of Niall Boyce, is not available, as he spoke from handwritten notes.) I’ll finish this series with a fourth post summarising how the debate went, and discussing what I now think about the proposition.

But now, here is the opening statement for the proposition, co-written by Toby and me, and delivered by him.

The backs of the heads of the four R2R debaters as we watch the initial polling on the proposition. From left to right: me, Toby, Pippa, Niall.


What is the most significant piece of published research in recent history? One strong candidate is a paper called “Ileal-lymphoid-nodular hyperplasia, non-specific colitis, and pervasive developmental disorder in children” published in 1998. It was written by Andrew Wakefield et al., and postulated a link between the MMR vaccine and autism. This article became the launching point for the anti-vax movement, which has resulted in (among other things) 142,000 deaths from measles in 2019 alone. It has also contributed to the general decline of trust in expertise and the rise of fake news.

This article is now recognised as “not just poor science, [but] outright fraud” (BMJ). It was eventually retracted — but it did take its venue of publication 12 years to do so. Where did it appear? In The Lancet, one of the world’s most established and prestigious medical journals, its prestige quantified by a stellar Impact Factor of 59.1.

How could such a terrible paper be published by such a respected journal? Because the venue of its publication tells us nothing useful about the quality of a paper.

Retractions from prestigious venues are not restricted to rogues like Wakefield. Last month, Nobel Prize winner Frances Arnold said she was “bummed” to have to retract her 2019 paper on enzymatic synthesis of beta-lactams because the results were not reproducible. “Careful examination of the first author’s lab notebook then revealed missing contemporaneous entries and raw data for key experiments,” she explained. I.e. “oops, we prepared the paper sloppily, sooorry!”

Prof. Arnold is the first woman to be elected to all three National Academies in the USA and has been lauded by institutions as diverse as the White House, the BBC and the Vatican. She even appeared as herself in the TV series The Big Bang Theory. She received widespread praise for being so open about having to retract this work — yet what does it say of the paper’s venue of publication, Science? Plainly the quality of this paper was not in the least assured by its venue of publication. Or to put it another way, the venue of its publication tells us nothing useful about the quality of a paper.

If we’re going to talk about high- and low-prestige venues, we’ll need a ranking system of some sort. The obvious ranking system is the Impact Factor — which, as Clarivate says, “can be used to provide a gross approximation of the prestige of journals”. Love it or hate it, the IF has become ubiquitous, and we will reluctantly use it here as a proxy for journal prestige.

So, then: what does “quality” really mean for a research paper? And how does it relate to journal prestige?

One answer would be that a paper’s quality is to do with its methodological soundness: adherence to best practices that make its findings reliable and reproducible. One important aspect of this is statistical power: are enough observations made, and are the correlations significant enough and strong enough for the results to carry weight? We would hope that all reputable journals would consider this crucially important. Yet Brembs et al. (2013) found no association between statistical power and journal impact factor. So it seems the venue of its publication tells us nothing useful about the quality of a paper.

Or perhaps we can define “quality” operationally, something like how frequently a paper is cited — more being good, less being less good, right? Astonishingly, given that Impact Factor is derived from citation counts, Lozano et al. (2012) showed that the citation count of an individual paper is correlated only very weakly with the Impact Factor of the journal it’s published in — and that correlation has been growing yet weaker since 1990, as the rise of the WWW has made discovery of papers easier irrespective of their venue. In other words, the venue of its publication tells us nothing useful about the quality of a paper.

We might at this point ask ourselves whether there is any measurable aspect of individual papers that correlates strongly with the Impact Factor of the journal they appear in. There is: Fang et al. (2012) showed that Impact Factor has a highly significant correlation with the number of retractions for fraud or suspected fraud. Wakefield’s paper has been cited 3336 times — did the Lancet know what it was doing by delaying this paper’s retraction for so long?[1] So maybe the venue of its publication does tell us something about the quality of a paper!

Imagine if we asked 1000 random scholars to rank journals on a “degree of excellence” scale. Science and The Lancet would, I’m sure you’ll agree — like Liverpool’s football team or that one from the “great state of Kansas” recently celebrated by Trump — be placed in the journal Premier League. Yet the evidence shows — both from anecdote and hard data — that papers published in these venues are at least as vulnerable to error, poor experimental design and even outright fraud as those in less exalted venues.

But let’s look beyond journals — perhaps we’ll find a link between quality and venue elsewhere.

I’d like to tell you two stories about another venue of publication, this time, the World Bank.

In 2016, the Bill & Melinda Gates Foundation pledged $5BN to fight AIDS in Africa. Why? Well, it was all down to someone at the World Bank having the bright idea to take a copy of their latest report on AIDS in Africa to Seattle and pitch the findings and recommendations directly to Mr Gates. I often tell this story as an example of impact. I think we can agree that the quality of this report must have been pretty high. After all, it unlocked $5BN for a good cause. But, of course, you’re thinking — D’oh! It’s a World Bank report, it must be high-quality. Really?

Consider also this story: in 2014, headlines like this lit up around the world: “Literally a Third of World Bank Policy Reports Have Never, Ever Been Read Online, By Anyone” (Slate) and “World Bank learns most PDFs it produces go unread” (Sydney Morning Herald). These headlines were triggered by a working paper, written by two economists from the World Bank and published on its website. The punchline? They were wrong; the paper was very wrong. Like Prof. Arnold’s paper, they were “missing contemporaneous entries and raw data” — in this case, data from the World Bank’s official repository. They’d pulled the data from an old repository. If they had also used data from the Bank’s new repository, they’d have found that every Bank report, however niche, had been downloaded many times. How do I know? Because I called the one guy who would know the truth, the Bank’s Publisher, Carlos Rossel, and once he’d calmed down, he told me.

So, we have two reports from the same venue: one plainly exhibiting a degree of excellence, the other painfully embarrassing (and, by the way, it still hasn’t been retracted).

Now, I bet you’re thinking, the latter is a working paper, therefore it hasn’t been peer-reviewed and so it doesn’t count. Well, the AIDS in Africa report wasn’t “peer reviewed” either — in the sense we all understand — but that didn’t stop Gates reaching for his Foundation’s wallet. What about all the preprints being posted on bioRxiv and elsewhere about the coronavirus: do they “not count”? This reminds me of a lovely headline when CERN’s paper on the discovery of the Higgs boson finally made it into a journal some months after the results had been revealed at a packed seminar, and weeks after the paper had been posted on arXiv: “Higgs boson discovery passes peer review, becomes actual science”. Quite apart from the irony expressed by the headline writer, here’s a puzzler for you. Was the quality of this paper assured by finally being published in a journal (with an impact factor one-tenth of Science’s), or when it was posted on arXiv, or when it was presented at a seminar? Which venue assured the quality of this work?

Of course, none of them did because the venue of its publication tells us nothing about the quality of the paper. The quality is inherent in the paper itself, not in the venue where it is made public.

The Wakefield paper’s lack of quality was also inherent in the paper itself; the fact that it was published in The Lancet (and is still available on more than seventy websites) did not make it high quality. Or to put it another way, the venue of its publication tells us nothing useful about the quality of a paper.

So what are different venues good for? Today’s scholarly publishing system is still essentially the same as the one that Oldenburg et al. started in the 17th century. This system evolved in an era when publishing costs were significant and grew with increased dissemination (increased demand meant higher print and delivery costs). This meant that editors had to make choices to keep costs under control — to select what to publish and what to reject. The selection criteria varied: some used geography to segment the market (The Chinese Journal of X, The European Journal of Y); some set up societies (Operational Research Society Journal); and others segmented the market by discipline (The International Journal of Neurology). These were genuinely useful distinctions to make, helping guide authors, readers and librarians to solutions for their authoring, reading and archiving needs.

Most journals pretend to use quality as a criterion to select within their niche — but isn’t it funny that there isn’t a Quality Journal of Chemistry or a Higher-Quality Journal of Physics? The real reasons for selection and rejection are of course to do with building brands and meeting business targets in terms of the number of pages published. If quality were the overarching criterion, why don’t journals fluctuate in output each year, like the wine harvest? Down when there’s a poor season and up when the sun shines?

If quality were the principal reason for acceptance and rejection, why is it absent from the list of the most common reasons for rejection? According to Editage, one of the most common reasons is that the paper didn’t fit the aims and scope of the journal — not that the paper is of poor quality. The current publishing process isn’t a system for weeding out weak papers from prestige journals, leaving them with only the best. It’s a system for sorting stuff into “houses”, which is as opaque, unaccountable and random as the Sorting Hat which confronted Harry Potter at Hogwarts. This paper to the Journal of Hufflepuff; that one to the Journal of Slytherin!

So the venue of its publication can tell us useful things about a paper: its geographical origin, its field of study, the society that endorses it. The one thing it can’t tell us is anything useful about the quality of a paper.


[1] We regret this phrasing. We asked “did the Lancet know what it was doing” in the usual colloquial sense of implying a lack of competence (“he doesn’t know what he’s doing”); but as Niall Boyce rightly pointed out, it can be read as snidely implying that The Lancet knew exactly what it was doing, and deliberately delayed the retraction in order to accumulate more citations. For avoidance of doubt, that is not what we meant; we apologise for not having written more clearly.


We were of course not able to give references during the debate. But since our statement included several citations, we can remedy that deficiency here.

6 Responses to “The R2R debate, part 1: opening statement in support”

  1. […] I told you all about the Researcher to Reader (R2R) conference and its debate on the proposition “The venue of its […]

  2. Fair Miles Says:

    Oh, this is so nice to read! Yeah, yeah: it’s the warmth and comfort of my information bubble… I know. I don’t care. But it is so nicely argued and structured, having carefully selected those punches just barely over the belt and repeatedly stating the proposed statement as mantra… 🤭
    I am sure it was pleasurable to listen/read not only for those sharing the proposition beforehand, like me, but also for everyone not against enough who struggles with words to express their thoughts and feelings. Great!

  3. Mike Taylor Says:

    Thank you! Writing this was a satisfying exercise in collaboration, and the repeated use of the (admittedly not very catchy) catchphrase emerged only late in the process.

    I hope you will similarly enjoy my shorter response in Part 3!

  4. […] its publication tells us nothing useful about the quality of a paper”. I’ve already posted Toby Green’s opening statement for the proposition and Pippa Smart’s opening statement against […]

  5. […] been a while, but to be fair the world has caught fire since I first started posting about the Research to Reader conference. Stay safe, folks. Don’t meet people. Stay indoors; […]
