Yet more uninformed noodling on the future of scientific publishing and that kind of thing
June 16, 2009
Sorry to keep dumping all these off-topic thoughts on you all, but I got an email from Matt today in which he suggested that there should be some system of giving people credit for particularly insightful blog comments. (This came up for the obvious reason that SV-POW! readers tend to leave unusually brilliant comments, as well as having excellent reading taste and being remarkably good looking.) That led me into the following sequence of thoughts, which I thought were worth blogging — not least in the hope that we can learn something from the comments.
But first, here is that photo of another fused atlas-axis complex that you ordered (seriously, what’s up with these things?):
And now, on with the uninformed noodling:
As things stand at this point, we have a hierarchy of sciency documents. At the top (which we’ll call level 1) come papers. The reputation of papers is largely determined by formal pre-publication reviews (which we will therefore classify as level 2) — and, increasingly, also by blog posts about the paper, which are also level 2. Classic peer-reviews are only ever seen by the editor and the author of the original paper; once they have been absorbed into the paper they’re critiquing, they disappear forever, which is a crying shame. But the other kind of level-2 literature, the blog post, has a life of its own: and so it gets commented on by blog-comments (level 3). Each level gives validity to the level above.
More important, documents at each level also give validity to each other. The most important case is that when one paper favourably discusses another, or refers to its authority, it gives the latter a credibility boost (which is why it’s such a sod that no-one cites any of my papers); similarly, our SV-POW! posts also get a credibility boost when they’re discussed on Tetrapod Zoology or Blog Around the Clock (and I just repaid the compliment by linking back to them).
(At present, all of this is done in a messy qualitative way, with no numbers attached, except occasionally in the case of pre-publication reviews. That’s a shame: if, for example, blog commenters allocated the posts a score out of ten, then we could use some kind of average score as a quality filter: to ameliorate rigging, I’d suggest discarding the highest and lowest 10% of awarded scores, and averaging the remainder.)
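(For concreteness, here is a quick Python sketch of the kind of trimmed-mean filter I have in mind; the scores are invented, and the 10% trim is just the figure suggested above:)

```python
def trimmed_mean(scores, trim_fraction=0.10):
    """Average a list of 0-10 scores after discarding the highest and
    lowest 10%, to blunt the effect of rigged or spite votes."""
    if not scores:
        return None
    ranked = sorted(scores)
    cut = int(len(ranked) * trim_fraction)
    kept = ranked[cut:len(ranked) - cut] or ranked  # keep everything if the list is tiny
    return sum(kept) / len(kept)

# One troll zero and one sock-puppet ten get thrown away before averaging:
print(trimmed_mean([0, 6, 7, 7, 8, 8, 8, 9, 9, 10]))  # 7.75
```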
Now the problem: blog comments are right at the bottom of the pile: who is going to rate them? I’m certainly not going to spend any time on that.
OK, so suppose we ignore the arbitrary allocation of levels: papers, reviews, blog posts and comments are all just considered as documents, and all can discuss each other. (Clearly reviews will necessarily discuss papers more often than papers discuss blog comments, but that is a convention added to the system I am about to describe, not a precondition for it.) Each document has a reputation, which we will quantify as a single real number. Documents start with some arbitrary small reputation — probably 0.0 or 1.0, and it probably doesn't much matter what it is. When any document discusses, cites or links to another — whether it's a paper, a review, a blog post or a comment — the linkee's reputation is boosted by some proportion: 10%, say, of the linker's reputation. Of course this change in the linkee's reputation causes a trickle-down change of 1% in the reputation of the documents that it links to, 0.1% in the reputation of the documents they link to, and so on. Reputations will change frequently and irregularly, and will be near-impossible to calculate exactly, but that's fine — they should be easy to approximate, and that's good enough.
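To make that concrete, here is a toy Python sketch of the propagation rule. The documents and links are entirely made up, and I have simply iterated the rule until the numbers settle down, which is one crude way of doing the approximation I mentioned:

```python
# Toy sketch: each document starts with a small base reputation, and every
# link hands the linkee 10% of the linker's current reputation.  Iterating
# to a fixed point gives the trickle-down effect for free (10% of 10% = 1%,
# and so on).  Documents and links here are invented for illustration.
links = {
    "paper_A":   ["paper_B"],              # paper A cites paper B
    "blog_post": ["paper_A"],              # a blog post discusses paper A
    "comment_1": ["blog_post"],            # a comment praises the post
    "comment_2": ["blog_post", "paper_B"],
    "paper_B":   [],
}

BASE, BOOST = 1.0, 0.10

reputation = {doc: BASE for doc in links}
for _ in range(50):  # plenty of iterations for the numbers to converge
    updated = {doc: BASE for doc in links}
    for linker, linkees in links.items():
        for linkee in linkees:
            updated[linkee] += BOOST * reputation[linker]
    reputation = updated

for doc, rep in sorted(reputation.items(), key=lambda item: -item[1]):
    print(f"{doc:10} {rep:.3f}")
```

Much-linked documents end up with the highest numbers, and a link from a high-reputation document is worth more than a link from an obscure one, which is exactly the behaviour we want.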
In this way, we get a nice solid score that we can use to decide what’s worth reading and what isn’t — the cream will naturally rise to the top. Hiring committees can throw away impact factors, and instead just add up the reputation scores of their candidates’ publications (either in the strict sense of the word, or including blog posts, reviews and/or comments). By the way, one of the positive effects of this would be that people like Darren and Jerry Harris would get some reward from their sterling reviewing efforts.
Sounds awesome? Here's something even more awesome: we already have that system, more or less. Yes indeed: the reputation-propagation algorithm I described is, in general outline, the same thing that Google does in the algorithm it calls PageRank(tm)(r)(lol)(ymmv). We can — and already do — use Google's notion of reputation as a guide to finding what's worth reading, and we can tell that it works well in practice because SV-POW! posts rank so highly :-)
So that’s it! We can all stop worrying, just Google for stuff we’re interested in, and read whatever pops up at the top of the list!
Are you convinced? I hope not, because this idea has at least three huge problems.
1. What counts? (Yes, that again.) Google-ranking works well for blog posts, because they are web pages, and Google can spider web pages. But that leaves out reviews, because they are typically not published at all, let alone as web pages. It leaves out comments, because they are appended to the end of blog posts rather than being pages in their own right, with their own PageRank. And, worst of all, it pretty much leaves out the papers themselves — because there is, in general, no single web page which is The Place where a particular paper lives. For non-open papers that aren't hosted on the author's page or elsewhere, there is no page at all. In short, reviews are not published, comments are not whole pages and papers are not single pages, so none of them is properly page-rankable.
2. All links count as positive reputation — there are no negative citations. So a document that says "Taylor, Wedel and Naish 2009 was talking a lot of nonsense about sauropod neck posture" would still count as a point in our favour, even though it meant the exact opposite. Of course, this is not a new problem: both PageRank and the Impact Factor suffer from it, and it doesn't seem to be a killer for either of them. The only fix would be to invite authors (of papers, reviews, blog posts and comments) to explicitly score some or all of the other documents they mention — and I doubt people will be keen to do that unless the mechanism can be made very non-intrusive.
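If authors were willing to attach explicit scores, the propagation sketch above would only need a small tweak: let each link carry a score between -1 and +1 and multiply it in, so that a hostile citation subtracts reputation instead of adding it. Again, a purely hypothetical sketch with invented documents:

```python
# Hypothetical variant: each link carries an explicit score in [-1, +1], so a
# "this paper is nonsense" citation (score -1) costs the linkee reputation.
signed_links = {
    "grumpy_review":  [("taylor_et_al_2009", -1.0)],  # negative citation
    "nice_blog_post": [("taylor_et_al_2009", +1.0)],  # positive citation
}

def propagate_signed(reputation, links, boost=0.10):
    """One propagation step: each link contributes
    boost * score * (linker's reputation) to the linkee."""
    updated = dict(reputation)
    for linker, linkees in links.items():
        for linkee, score in linkees:
            updated[linkee] = updated.get(linkee, 1.0) + boost * score * reputation.get(linker, 1.0)
    return updated

reps = {"taylor_et_al_2009": 1.0, "grumpy_review": 2.0, "nice_blog_post": 2.0}
print(propagate_signed(reps, signed_links))
# here the negative and positive citations exactly cancel each other out
```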
3. And here’s the killer: we wouldn’t, or shouldn’t, want Google to do this, even if it could overcome problems #1 and #2. Google is a private corporation, and we don’t want to hand over reputation management to any private commercial venture with an obligation to shareholders rather than scientists, and with a proprietary secret algorithm. If you doubt me, consider Thomson’s ownership of the Impact Factor and see where that’s got us. No doubt when Eugene Garfield came up with the idea of the Impact Factor, he was pretty excited about how — at last! — we would have an objective, reliable way to evaluate science. But the IF is not run by scientists; it’s run by a corporation. With hilarious results.
I have no idea what the conclusion to all this is. I didn’t have a clear idea where it was headed when I started writing it. But, much in the manner of Dirk Gently when employing his usual method of navigation, I may not have ended up where I intended to, but I’ve arrived somewhere interesting.
Your move: what have I failed to take into account?