The Less Wrong Metric (LWM): towards a not wholly inadequate way of quantifying the value of research

January 26, 2016

I said last time that my new paper on Better ways to evaluate research and researchers proposes a family of Less Wrong Metrics, or LWMs for short, which I think would at least be an improvement on the present ubiquitous use of impact factors and H-indexes.

What is an LWM? Let me quote the paper:

The Altmetrics Manifesto envisages no single replacement for any of the metrics presently in use, but instead a palette of different metrics laid out together. Administrators are invited to consider all of them in concert. For example, in evaluating a researcher for tenure, one might consider H-index alongside other metrics such as number of trials registered, number of manuscripts handled as an editor, number of peer-reviews submitted, total hit-count of posts on academic blogs, number of Twitter followers and Facebook friends, invited conference presentations, and potentially many other dimensions.

In practice, it may be inevitable that overworked administrators will seek the simplicity of a single metric that summarises all of these.

This is a key problem of the world we actually live in. We often bemoan the fact that people evaluating research will apparently do almost anything other than actually read the research. (To paraphrase Dave Barry, these are important, busy people who can’t afford to fritter away their time in competently and diligently doing their job.) There may be good reasons for this; there may only be bad reasons. But what we know for sure is that, for good reasons or bad, administrators often do want a single number. They want it so badly that they will seize on the first number that comes their way, even if it’s as horribly flawed as an impact factor or an H-index.

What to do? There are two options. One is to change the way these overworked administrators function, to force them to read papers and consider a broad range of metrics: in other words, to change human nature. Yeah, it might work. But it’s not where the smart money is.

So perhaps the way to go is to give these people a better single number. A less wrong metric. An LWM.

Here’s what I propose in the paper.

In practice, it may be inevitable that overworked administrators will seek the simplicity of a single metric that summarises all of these. Given a range of metrics x1, x2, …, xn, there will be a temptation to simply add them all up to yield a “super-metric”, x1 + x2 + … + xn. Such a simply derived value will certainly be misleading: no-one would want a candidate with 5,000 Twitter followers and no publications to appear a hundred times stronger than one with an H-index of 50 and no Twitter account.

A first step towards refinement, then, would weight each of the individual metrics using a set of constant parameters k1, k2, …, kn to be determined by judgement and experiment. This yields another metric, k1·x1 + k2·x2 + … + kn·xn. It allows the down-weighting of less important metrics and the up-weighting of more important ones.

However, even with well-chosen ki parameters, this better metric has problems. Is it really a hundred times as good to have 10,000 Twitter followers than 100? Perhaps we might decide that it’s only ten times as good – that the value of a Twitter following scales with the square root of the count. Conversely, in some contexts at least, an H-index of 40 might be more than twice as good as one of 20. In a search for a candidate for a senior role, one might decide that the value of an H-index scales with the square of the value; or perhaps it scales somewhere between linearly and quadratically – with H-index^1.5, say. So for full generality, the calculation of the “Less Wrong Metric”, or LWM for short, would be configured by two sets of parameters: factors k1, k2, …, kn, and exponents e1, e2, …, en. Then the formula would be:

LWM = k1·x1^e1 + k2·x2^e2 + … + kn·xn^en

So that’s the idea of the LWM — and you can see now why I refer to this as a family of metrics. Given n metrics that you’re interested in, you pick 2n parameters to combine them with, and get a number that to some degree measures what you care about.
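For readers who think better in code, here is a minimal Python sketch of that formula. It is only an illustration: the function name `lwm`, the two made-up candidates, and every factor and exponent below are assumptions chosen to echo the examples in the quoted passage, not values taken from the paper. It also assumes all metric values are non-negative, since a negative value raised to a fractional exponent is not a real number.

```python
from typing import Sequence


def lwm(values: Sequence[float],
        factors: Sequence[float],
        exponents: Sequence[float]) -> float:
    """Return k1*x1**e1 + k2*x2**e2 + ... + kn*xn**en."""
    if not (len(values) == len(factors) == len(exponents)):
        raise ValueError("values, factors and exponents must have the same length")
    if any(x < 0 for x in values):
        # Assumption of this sketch: metrics are counts or scores, never negative.
        raise ValueError("metric values must be non-negative")
    return sum(k * x ** e for x, k, e in zip(values, factors, exponents))


# Hypothetical candidates; the metrics are (H-index, Twitter followers).
candidate_a = [50, 0]      # H-index of 50, no Twitter account
candidate_b = [0, 5000]    # no publications, 5,000 followers

# All factors and exponents set to 1: this is just the naive sum x1 + ... + xn.
naive_k, naive_e = [1.0, 1.0], [1.0, 1.0]

# Illustrative parameters only: H-index scaled as x**1.5,
# followers scaled as the square root and down-weighted.
tuned_k, tuned_e = [1.0, 0.1], [1.5, 0.5]

for name, candidate in (("A", candidate_a), ("B", candidate_b)):
    print(name,
          "naive:", lwm(candidate, naive_k, naive_e),
          "tuned:", lwm(candidate, tuned_k, tuned_e))
```

With the naive parameters, candidate B scores a hundred times higher than candidate A, which is exactly the misleading result described above. With the illustrative tuned parameters, the ordering flips. Nothing about these particular numbers is special; the point is only that the 2n parameters, not the formula, carry the judgement.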

(How do you choose your 2n parameters? That’s the subject of the next post. Or, as before, you can skip ahead and read the paper.)

14 Responses to “The Less Wrong Metric (LWM): towards a not wholly inadequate way of quantifying the value of research”


  1. […] Next time I’ll talk about the LWM and how to calculate it. Those of you who are impatient might want to read the actual paper first! […]

  2. protohedgehog Says:

    Reblogged this on Green Tea and Velociraptors and commented:
    Mike Taylor on the incredibly important topic of research and researcher evaluation.


  3. This might be of some value. Also, I am working on something like this, but it is data-driven, using Principal Component Analysis to see what the dataset says we should weight and which variables are important :)

    http://blogs.lse.ac.uk/impactofsocialsciences/2015/10/08/we-need-informative-metrics-how-to-make-metrics-better/

  4. Mike Taylor Says:

    Sounds like you have read ahead to the “Choosing the parameters for the Less Wrong Metric” section of my paper … the bit I’ve not blogged about yet :-)

    Fantastic that someone is already working on implementing ideas similar to the ones I outline here. I choose to interpret this convergence as an example of great minds thinking alike. Yeah, that’s it.

  5. StupendousMan Says:

    (This was also posted as a comment to your previous entry, but it looks like the discussion has moved here. Apologies for the duplication).

    Mike,

    I’m with you on the inappropriateness of the current methods for judging the worthiness of publications. They should be replaced. The only points in their favor are that they are based on quantitative items that are easily measured by machines (citation count, for example).

    Unfortunately, the new criteria you propose, in the yellow box on page 5 of your article, raise a new set of problems. They are subjective, requiring a human to read the paper and make a judgement. “How significant is this result?” “How clearly is this written?” In order to use these metrics, the community will have to identify a group of experts who will read every paper and score them in these categories. That’s (number one) a lot of work for no pay and (number two) dependent on the whims of each reviewer.

    I guess one might claim that peer reviewers are _already_ reading every paper which makes it into the refereed literature, so if we could just get those reviewers to fill out a scoresheet in addition to writing their reports, and then share all those scores, we would have the required data. Is that your idea?

    Not trying to discourage you and your quest, just trying to figure out how it might be put into practice.

  6. Mike Taylor Says:

    Well, Stupendous Man, the LWM is not so much a metric as a schema for generating a summary metric, based on whatever data you’re able to obtain. Some may be historical, some may be predictive, some may be the result of expert judgement. (Clarity of writing could even be assessed by computer programmes.)

    You could make an LWM using only what you term “objective” measures, but I think that’s a bit misleading, because when you count, for example, how many times a paper has been cited, you’re counting how many times another author has made the subjective decision to cite this one. In other words, more of our metrics are ultimately subjective than we’re willing to admit to ourselves.

    Anyway, don’t take the yellow boxes too seriously: they are lists of suggestions, not a cast-iron definition of the LWM.


  7. Have you got a link for the wonderful Dave Barry quotation?
    “these are important, busy people who can’t afford to fritter away their time in competently and diligently doing their job”

  8. Mike Taylor Says:

    It’s from his 1986 book Claw Your Way to the Top: How to Become the Head of a Major Corporation in Roughly a Week. On page 33:

    This procedure [of career advancement] is all well and good for most people. But you are not “most people”. You are a highly motivated individual who wants to be on the fast track, and you cannot afford to fritter away valuable time working diligently and competently at your job.


  9. […] remember that in the last installment (before Matt got distracted and wrote about archosaur urine), I proposed a general schema for […]


  10. The problem with this proposal is, of course, that it can give any answer you want, by arbitrary choice from the infinitude of possible values of the weights.

  11. Mike Taylor Says:

    Indeed, David; hence the next post, which discusses how to choose the parameters.


  12. […] Mike introduced his new paper and described the scope and importance of the problem. Then in the next post, he introduced the idea of the LWM, or Less Wrong Metric, and the basic mathematical framework for […]

  13. Pandelis Says:

    Hello Mike, in 2005 we proposed a method to use a neuro-fuzzy system to calculate a single internationality index out of several related parameters. The problem of research quality, at least as you approach it in this post, is very similar. Perhaps you find the paper interesting: https://zenodo.org/record/45541


  14. […] and the introduction of LWM (Less Wrong Metrics) by Mike Taylor. You can find the posts here, here, here, and […]

