March 22, 2017
The previous post (Every attempt to manage academia makes it worse) has been a surprise hit, and is now by far the most-read post in this blog’s nearly-ten-year history. It evidently struck a chord with a lot of people, and I’ve been surprised — amazed, really — at how nearly unanimously people have agreed with it, both in the comments here and on Twitter.
But I was brought up short by this tweet from Thomas Koenig:
That is the question, isn’t it? Why do we keep doing this?
I don’t know enough about the history of academia to discuss the specific route we took to the place we now find ourselves in. (If others do, I’d be fascinated to hear.) But I think we can fruitfully speculate on the underlying problem.
Let’s start with the famous true story of the Hanoi rat epidemic of 1902. In a town overrun by rats, the authorities tried to reduce the population by offering a bounty on rat tails. Enterprising members of the populace responded by catching live rats, cutting off their tails to collect the bounty, then releasing the rats to breed, so more tails would be available in future. Some people even took to breeding rats for their tails.
Why did this go wrong? For one very simple reason: because the measure optimised for was not the one that mattered. What the authorities wanted to do was reduce the number of rats in Hanoi. For reasons that we will come to shortly, the proxy that they provided an incentive for was the number of rat tails collected. These are not the same thing — optimising for the latter did not help the former.
The badness of the proxy measure applies in two ways.
First, consider those who caught rats, cut their tails off and released them. They stand as counter-examples to the assumption that harvesting a rat-tail is equivalent to killing the rat. The proxy was bad because it assumed a false equivalence. It was possible to satisfy the proxy without advancing the actual goal.
Second, consider those who bred rats for their tails. They stand as counter-examples to the assumption that killing a rat is equivalent to decreasing the total number of live rats. Worse, if the breeders released their de-tailed captive-bred progeny into the city, their harvests of tails not only didn’t represent any decrease in the feral population, they represented an increase. So the proxy was worse than neutral because satisfying it could actively harm the actual goal.
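For anyone who likes to see an argument as arithmetic, here is a minimal sketch in Python of the three behaviours just described. Everything in it is an assumption made for illustration: the starting population of 1000, the effort of 100, and the names `Outcome` and `apply_strategy` are invented here and are not historical data. It only shows that the three strategies earn the same bounty while moving the real goal in three different directions.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    tails_collected: int  # the proxy the authorities paid for
    feral_rats: int       # the quantity they actually cared about


def apply_strategy(strategy: str, feral_rats: int, effort: int) -> Outcome:
    """Return proxy and goal after one citizen spends `effort` on a strategy.

    All numbers are illustrative assumptions, not historical figures.
    """
    if strategy == "kill":
        # Catch and kill: each rat caught yields one tail and one fewer rat.
        caught = min(effort, feral_rats)
        return Outcome(tails_collected=caught, feral_rats=feral_rats - caught)
    if strategy == "dock_and_release":
        # Cut off the tail and release the rat: tails rise, population unchanged.
        caught = min(effort, feral_rats)
        return Outcome(tails_collected=caught, feral_rats=feral_rats)
    if strategy == "breed_and_release":
        # Breed in captivity, harvest tails, release the de-tailed rats.
        return Outcome(tails_collected=effort, feral_rats=feral_rats + effort)
    raise ValueError(f"unknown strategy: {strategy}")


START = 1000  # hypothetical feral population
results = {
    name: apply_strategy(name, START, effort=100)
    for name in ("kill", "dock_and_release", "breed_and_release")
}

for name, outcome in results.items():
    print(f"{name:18s} tails={outcome.tails_collected:4d} "
          f"feral_rats={outcome.feral_rats:5d}")
```

In this toy model the bounty office sees 100 tails in every case, and so cannot tell the citizen who removed 100 rats from the one who added 100.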
So far, so analogous to the perverse academic incentives we looked at last time. Where this gets really interesting is when we consider why the Hanoi authorities chose such a terribly counter-productive proxy for their real goal. Recall their object was to reduce the feral rat population. There were two problems with that goal.
First, the feral rat population is hard to measure. It’s so much easier to measure the number of tails people hand in. A metric is seductive if it’s easy to measure. In the same way, it’s appealing to look for your dropped car-keys under the street-lamp, where the light is good, rather than over in the darkness where you dropped them. But it’s equally futile.
Second — and this is crucial — it’s hard to properly reward people for reducing the feral rat population because you can’t tell who has done what. If an upstanding citizen leaves poison in the sewers and kills a thousand rats, there’s no way to know what he has achieved, and to reward him for it. The rat-tail proxy is appealing because it’s easy to reward.
The application of all this to academia is pretty obvious.
First, the things we really care about are hard to measure. The reason we do science — or, at least, the reason societies fund science — is to achieve breakthroughs that benefit society. That means important new insights, findings that enable new technology, ways of creating new medicines, and so on. But all these things take time to happen. It’s difficult to look at what a lab is doing now and say “Yes, this will yield valuable results in twenty years”. Yet that may be what is required: trying to evaluate it using a proxy of how many papers it gets into high-IF journals this year will most certainly militate against its doing careful work with long-term goals.
Second, we have no good way to reward the right individuals or labs. What we as a society care about is the advance of science as a whole. We want to reward the people and groups whose work contributes to the global project of science — but those are not necessarily the people who have found ways to shine under the present system of rewards: publishing lots of papers, shooting for the high-IF journals, skimping on sample-sizes to get spectacular results, searching through big data-sets for whatever correlations they can find, and so on.
In fact, when a scientist who is optimising for what gets rewarded slices up a study into multiple small papers, each with a single sensational result, and shops them around Science and Nature, all they are really doing is breeding rats.
If we want people to stop behaving this way, we need to stop rewarding them for it. (Side-effect: when people are rewarded for bad behaviour, people who behave well get penalised, lose heart, and leave the field. They lose out, and so does society.)
Q. “Well, that’s great, Mike. What do you suggest?”
A. Ah, ha ha, I’d been hoping you wouldn’t bring that up.
No one will be surprised to hear that I don’t have a silver bullet. But I think the place to start is by being very aware of the pitfalls of the kinds of metrics that managers (including us, when wearing certain hats) like to use. Managers want metrics that are easy to calculate, easy to understand, and quick to yield a value. That’s why articles are judged by the impact factor of the journal they appear in: the calculation of the article’s worth is easy (copy the journal’s IF out of Wikipedia); it’s easy to understand (or, at least, it’s easy for people to think they understand what an IF is); and best of all, it’s available immediately. No need for any of that tedious waiting around five years to see how often the article is cited, or waiting ten years to see what impact it has on the development of the field.
Wise managers (and again, that means us when wearing certain hats) will face up to the unwelcome fact that metrics with these desirable properties are almost always worse than useless. Coming up with better metrics, if we’re determined to use metrics at all, is real work and will require an enormous educational effort.
One thing we can usefully do, whenever considering a proposed metric, is actively consider how it can and will be hacked. Black-hat it. Invest a day imagining you are a rational, selfish researcher in a regime that uses the metric, and plan how you’re going to exploit it to give yourself the best possible score. Now consider whether the course of action you mapped out is one that will benefit the field and society. If not, dump the metric and start again.
Q. “Are you saying we should get rid of metrics completely?”
A. Not yet; but I’m open to the possibility.
Given metrics’ terrible track-record of hackability, I think we’re now at the stage where the null hypothesis should be that any metric will make things worse. There may well be exceptions, but the burden of proof should be on those who want to use them: they must show that they will help, not just assume that they will.
And what if we find that every metric makes things worse? Then the only rational thing to do would be not to use any metrics at all. Some managers will hate this, because their jobs depend on putting numbers into boxes and adding them up. But we’re talking about the progress of research to benefit society, here.
We have to go where the evidence leads. Dammit, Jim, we’re scientists.
March 17, 2017
I’ve been on Twitter since April 2011 — nearly six years. A few weeks ago, for the first time, something I tweeted broke the thousand-retweets barrier. And I am really unhappy about it. For two reasons.
First, it’s not my own content — it’s a screen-shot of Table 1 from Edwards and Roy (2017):
And second, it’s so darned depressing.
The problem is a well-known one, and indeed one we have discussed here before: as soon as you try to measure how well people are doing, they will switch to optimising for whatever you’re measuring, rather than putting their best efforts into actually doing good work.
In fact, this phenomenon is so very well known and understood that it’s been given at least three different names by different people:
- Goodhart’s Law is most succinct: “When a measure becomes a target, it ceases to be a good measure.”
- Campbell’s Law is the most explicit: “The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”
- The Cobra Effect refers to the way that measures taken to improve a situation can directly make it worse.
As I say, this is well known. There’s even a term for it in social theory: reflexivity. And yet we persist in doing idiot things that can only possibly have this result:
- Assessing school-teachers on the improvement their kids show in tests between the start and end of the year (which obviously results in their doing all they can to depress the start-of-year tests).
- Assessing researchers by the number of their papers (which can only result in slicing into minimal publishable units).
- Assessing them — heaven help us — on the impact factors of the journals their papers appear in (which feeds the brand-name fetish that is crippling scholarly communication).
- Assessing researchers on whether their experiments are “successful”, i.e. whether they find statistically significant results (which inevitably results in p-hacking and HARKing).
What’s the solution, then?
I’ve been reading the excellent blog of economist Tim Harford for a while. That arose from reading his even more excellent book The Undercover Economist (Harford 2007), which gave me a crash-course in the basics of how economies work, how markets help, how they can go wrong, and much more. I really can’t say enough good things about this book: it’s one of those that I feel everyone should read, because the issues are so important and pervasive, and Harford’s explanations are so clear.
In a recent post, Why central bankers shouldn’t have skin in the game, he makes this point:
The basic principle for any incentive scheme is this: can you measure everything that matters? If you can’t, then high-powered financial incentives will simply produce short-sightedness, narrow-mindedness or outright fraud. If a job is complex, multifaceted and involves subtle trade-offs, the best approach is to hire good people, pay them the going rate and tell them to do the job to the best of their ability.
I think that last part is pretty much how academia used to be run a few decades ago. Now I don’t want to get all misty-eyed and rose-tinted and nostalgic — especially since I wasn’t even involved in academia back then, and don’t know from experience what it was like. But could it be … could it possibly be … that the best way to get good research and publications out of scholars is to hire good people, pay them the going rate and tell them to do the job to the best of their ability?
[Read on to Why do we manage academia so badly?]
- Edwards, Marc A., and Siddhartha Roy. 2017. Academic Research in the 21st Century: Maintaining Scientific Integrity in a Climate of Perverse Incentives and Hypercompetition. Environmental Engineering Science 34(1):51-61.
- Harford, Tim. 2007. The Undercover Economist. Abacus (Little, Brown). 384 pages. [Amazon US, Amazon UK]
Here is a nicely formatted full-page version of the Edwards and Roy table, for you to print out and stick on all the walls of your university. My thanks to David Roberts for preparing it.
My talk on copyright, from the University of Manchester’s “Open Knowledge in Higher Education” course
January 5, 2017
Back in February last year, I had the privilege of giving one of the talks in the University of Manchester’s PGCert course “Open Knowledge in Higher Education”. I took the subject “Should science always be open?”
My plan was to give an extended version of a talk I’d given previously at ESOF 2014. But the sessions before mine raised all sorts of issues about copyright, and its effect on scholarly communication and the progress of science, and so I found myself veering off piste. The first eight and a half minutes are as planned; from there, I go off on an extended tangent. Well. See what you think.
The money quote (starting at 12m10s): “What is copyright? It’s a machine for preventing the creation of wealth.”
October 26, 2016
What’s holding back infrastructure development?
“The real problem, of course, as always, is not the technical one, it’s the social one. How do you persuade people to turn away from the brands that they’ve become comfortable with?
We really are only talking about brands, the value of publishing in, say, a big name journal rather than publishing in a preprint repository. It is nothing to do with the value of the research that gets published. It’s like buying a pair of jeans that are ten times as expensive as the exact same pair of jeans in Marks and Spencer because you want to get the ones that have an expensive label. Now ask why we’re so stupid that we care about the labels.”
Read the full interview here.
August 31, 2016
As explained in careful detail over at Stupid Patent of the Month, Elsevier has applied for, and been granted, a patent for online peer-review. The special sauce that persuaded the US Patent Office that this is a new invention is cascading peer review — an idea so obvious and so well-established that even The Scholarly Kitchen was writing about it as a commonplace in 2010.
Well. What can this mean?
A cynic might think that this is the first step an untrustworthy company would take preparatory to filing a lot of time-wasting and resource-sapping nuisance lawsuits against its smaller, faster-moving competitors. They certainly have previous in the courts: remember that they have brought legal action against their own customers as well as threatening Academia.edu and of course trying to take Sci-Hub down.
Elsevier representatives are talking this down: Tom Reller has tweeted “There is no need for concern regarding the patent. It’s simply meant to protect our own proprietary waterfall system from being copied” — which would be fine, had their proprietary waterfall system not been itself copied from the ample prior art. Similarly, Alicia Wise has said on a public mailing list “People appear to be suggesting that we patented online peer review in an attempt to own it. No, we just patented our own novel systems.” Well. Let’s hope.
But Cathy Wojewodzki, on the same list, asked the key question:
I guess our real question is Why did you patent this? What is it you hope to market or control?
We await a meaningful answer.
June 20, 2016
Back in mid-April, when I (Mike) was at the OSI2016 conference, I was involved in the “Moral Dimensions of Open” group. (It was in preparation for this that I wrote the Moral Dimensions series of posts here on SV-POW!.)
Like all the other groups, ours was tasked with making a presentation to the plenary session, taking questions and feedback, and presenting a version 2 on the final day. Here’s the title page that I contributed.
Each group was also asked to write a short paper summarising their discussions and conclusions, with all the papers to be published openly. The resulting papers are now available: sixteen of them in all. And among them is Ansolabehere et al. (2016), “The Moral Dimensions of Open”, of which I am one of nine authors. (There were ten authors of the presentation: for some reason, Ryan Merkley is not on the paper.)
As you can imagine in a group that contained open-access advocates, human rights activists, representatives of both old-school and new-wave publishers, agriculturalists and more, agreement was far from unanimous, and it was quite a rocky road to arriving at a form of the paper that we could all live with. In this case, the standard note that was added to all the papers is very appropriate:
This document reflects the combined input of the authors listed here (in alphabetical order by last name) as well as contributions from other OSI2016 delegates. The findings and recommendations expressed herein do not necessarily reflect the opinions of the individual authors listed here, nor their agencies, trustees, officers, or staff.
Is this the moral-dimensions paper I would have written? No, it’s not. Being a nine-way collaboration, it pulls in too many directions to have as clear a through-line as I’d like; and it’s arguably a bit mealy-mouthed in places. But overall, I am pretty happy with it. I think it makes some important points, and makes them reasonably well given the sometimes clumsy prose that you always get when something is written by committee.
Anyway, I think it’s worth a read.
By the way, I’d like to place on record my thanks to Cheryl Ball of West Virginia University, who did the bulk of the heavy lifting in putting together both the presentation and the paper. While everyone in the group contributed ideas and many contributed prose, Cheryl dug in and did the actual work. Really, she deserves to be lead author on this paper — and would be, but for the alphabetical-order convention.
- Ansolabehere, Karina, Cheryl Ball, Medha Devare, Tee Guidotti, Bill Priedhorsky, Wim van der Stelt, Mike Taylor, Susan Veldsman and John Willinsky. 2016. The Moral Dimensions of Open. Open Scholarship Initiative Proceedings 1 (5 pages). doi:10.13021/G8SW2G
In this short series on the moral dimensions of open (particularly open access), we’ve considered why this is important, the argument that zero marginal cost should result in zero price, the idea that the public has a right to read what it paid for, the very high profit margins of scholarly publishers, and the crucial observation that science advances best and fastest when we can build on each other’s work with minimal friction. I’d like to bring the series to a close by asking this question: if we want change, who is responsible for bringing it about?
Often, those most committed to open-access ideals are students and early-career researchers. But we may feel that those just starting out on their careers are the ones with most to lose (or with the least to gain) if they make pro-open stands such as only publishing their work in open-access journals, or agitating for change at their institutions.
Perhaps the responsibility lies with those who have already acquired positions in academia? There are two problems with that. One is that even an academic who has a job wants to present the best possible case for promotion — and, when it’s available, for tenure. The other is that even those who are fully secure and happy in their posts do much of their work in collaboration with Ph.D students and postdocs, and may feel that they owe it to those younger collaborators not to make their paths more difficult by insisting on open access.
Perhaps, then, the responsibility for change lies with senior academics who hold influential administrative roles, having graduated past the point of doing their own research? These are the people with the most power to bring about change, and with the least likelihood of losing out. Yet these people earned their roles by excelling under the old system of paywalled papers and journal prestige as a surrogate for evaluating quality. Is it reasonable to expect these people to turn against the very system that gave them such success?
And we can hardly expect the turkeys who work for legacy publishers to vote for Christmas.
It turns out that everyone, no matter what their career stage or what their role in the world of scholarly communication, has a legitimate reason to say “No, it shouldn’t be my responsibility”.
And that being so, there is only one possible answer to the question “Who should take responsibility?” That answer is, “I should”. Whoever I am.
From my own unique position on the fringes of academia, I take responsibility to do what little I can to bring about the changes that the world needs in how science is communicated. From his position as a postdoc, Jon Tennant does what he can. From her first academic job, Erin McKiernan does what she can. From his relatively secure academic post, Matt Wedel does what he can. From his position running a highly visible and successful lab, Mike Eisen does what he can. And in his powerful role as Rector of the University of Liège, Bernard Rentier does what he can.
It’s simply no use any one of us shrugging and saying “What can I do?” At the same time, it’s also true that, for most of us, what we can do is not very much. But the crucial truth is that by each of us doing what we can, we have done great work over the last decade in pushing towards the world we now live in: where open access is no longer seen as a fringe concern of naive idealists, but is the model used by the world’s biggest and most cited academic journal, where it’s required by 500 university policies and national policies in the USA, UK and many other countries, and where I am proud to say that my own discipline of vertebrate palaeontology now seems to happen primarily in open-access journals.
So we can give ourselves a pat on the back. Go right ahead, do it now — I’ll wait.
But there is an enormous amount still to do. Gold open access is absurdly overpriced. Green open access remains subject to delays, deliberately imposed by reprehensible embargoes. The obsession with journal rank continues. Open data policies remain rare, and are not well enforced. Barrier-based publishing continues to dominate by volume of published papers. Text and data mining initiatives are repeatedly stymied by publishers who bar access even to subscribers. Much of what is published as “open access” is under restrictive licences that pointlessly prohibit many ways of using the work. And there are myriad other related issues still to be resolved, such as the wastefulness of traditional pre-publication peer-review.
How can we fix all these problems?
The same way we got to where we are now with open access. By each one of us doing what we can to advance sane, efficient, inexpensive, moral means of scientific communication in whatever role we find ourselves. No one of us can fix this. But every one of us can make a contribution.
This blog is nine years old. Since Matt and I are both still enjoying it, there’s no reason to think it won’t still be going in another nine years. Strange as it is to imagine SV-POW! in 2025, I hope I can look forward to writing then in an environment where scholarly paywalls are seen as anachronistic and laughable, where publication is faster and more transparent, where data is routinely re-used, and where researchers are evaluated according to the quality of their work, not according to the brand-name they attach to it.
 By the way, I might note that the OA advocates I’ve known as students all seem to have gone on to good postdocs, and the OA advocates I’ve known as postdocs all seem to have gone on to find jobs in academia. I’m not sure what to make of that observation, but I’ll just leave it here.
 It’s certainly true that the most useful descriptive papers are now always in OA journals, where there are no arbitrary limits on length or number of illustrations, or colour fees.