Yet more uninformed noodling on the future of scientific publishing and that kind of thing

June 16, 2009

Sorry to keep dumping all these off-topic thoughts on you all, but I got an email from Matt today in which he suggested that there should be some system of giving people credit for particularly insightful blog comments.  (This came up for the obvious reason that SV-POW! readers tend to leave unusually brilliant comments, as well as having excellent reading taste and being remarkably good looking.)  That led me into the following sequence of thoughts, which I thought were worth blogging — not least in the hope that we can learn something from the comments.

But first, here is that photo of another fused atlas-axis complex that you ordered (seriously, what’s up with these things?):

Camarasaurus grandis YPM 1905, fused atlas and axis in right lateral view

And now, on with the uninformed noodling:

As things stand at this point, we have a hierarchy of sciency documents. At the top (which we’ll call level 1) come papers. The reputation of papers is largely determined by formal pre-publication reviews (which we will therefore classify as level 2) — and, increasingly, also by blog posts about the paper, which are also level 2. Classic peer-reviews are only ever seen by the editor and the author of the original paper; once they have been absorbed into the paper they’re critiquing, they disappear forever, which is a crying shame. But the other kind of level-2 literature, the blog post, has a life of its own: and so it gets commented on by blog-comments (level 3). Each level gives validity to the level above.

More important, documents at each level also give validity to each other. The most important case is that when one paper favourably discusses another, or refers to its authority, it gives the latter a credibility boost (which is why it’s such a sod that no-one cites any of my papers); similarly, our SV-POW! posts also get a credibility boost when they’re discussed on Tetrapod Zoology or Blog Around the Clock (and I just repaid the compliment by linking back to them).

(At present, all of this is done in a messy qualitative way, with no numbers attached, except occasionally in the case of pre-publication reviews. That’s a shame: if, for example, blog commenters allocated the posts a score out of ten, then we could use some kind of average score as a quality filter: to ameliorate rigging, I’d suggest discarding the highest and lowest 10% of awarded scores, and averaging the remainder.)
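For concreteness, here is a minimal sketch of that trimmed average in Python. The function name `trimmed_mean` and the sample scores are made up for illustration; the only thing taken from the suggestion above is the rule itself: sort the scores, throw away the top and bottom 10%, and average what is left.

```python
def trimmed_mean(scores, trim_fraction=0.10):
    """Average the scores after discarding the highest and lowest trim_fraction.

    trim_fraction=0.10 matches the 10% suggested in the post; the parameter
    name and the rounding-down behaviour are assumptions of this sketch.
    """
    if not scores:
        raise ValueError("need at least one score")
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(scores)
    # Number of scores to drop from EACH end, rounded down.
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# One suspiciously low score out of ten is discarded, along with one top score.
scores = [1, 7, 7, 8, 8, 8, 9, 9, 10, 10]
print(trimmed_mean(scores))  # 8.25, versus a plain mean of 7.7
```

Because the count is rounded down, a post with fewer than ten scores gets no trimming at all under this sketch, so the filter only starts to resist rigging once a post has attracted a reasonable number of votes.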

Now the problem: blog comments are right at the bottom of the pile: who is going to rate them? I’m certainly not going to spend any time on that.

OK, so suppose we ignore the arbitrary allocation of levels: papers, reviews, blog posts and comments are all just considered as documents, and all can discuss each other. (Clearly reviews will necessarily discuss papers more often than papers discuss blog comments, but that is a convention added to the system I am about to describe, not a precondition for it.) Each document has a reputation, which we will quantify as a single real number. Documents start with some arbitrary small reputation — probably 0.0 or 1.0, and it probably doesn’t much matter what it is. When any document discusses, cites or links to another — whether it’s a paper, a review, a blog post or a comment — that linkee’s reputation is boosted by some proportion: 10%, say, of the linker’s reputation. Now of course this change in linkee reputation causes a trickle-down change, worth 1% of the original linker’s reputation, in the documents that the linkee links to; and 0.1% in the documents they link to, and so on. Reputations will change frequently and irregularly, and will be near impossible to calculate accurately, but that’s fine — they should be easy to approximate, and that’s good enough.
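To show what "easy to approximate" might look like, here is a rough Python sketch of the scheme just described. Everything specific in it is an assumption made for illustration: the function name `propagate_reputation`, the representation of links as a dictionary, the fixed number of rounds, and the choice to start every document at 1.0. (I use 1.0 rather than 0.0 because, if boosts are a percentage of the linker's reputation, a world where everyone starts at zero stays at zero.)

```python
def propagate_reputation(links, base=1.0, boost=0.10, rounds=20):
    """Approximate reputations by repeatedly passing on `boost` of each
    linker's reputation to every document it links to.

    links: dict mapping each document to the list of documents it links to.
    base, boost and rounds are illustrative defaults, not tuned values.
    """
    # Collect every document that appears as a linker or a linkee.
    docs = set(links)
    for targets in links.values():
        docs.update(targets)

    reputation = {doc: base for doc in docs}
    for _ in range(rounds):
        # Recompute from scratch each round using last round's reputations.
        updated = {doc: base for doc in docs}
        for linker, targets in links.items():
            for linkee in targets:
                updated[linkee] += boost * reputation[linker]
        reputation = updated
    return reputation


# A comment discusses a blog post, which in turn discusses a paper.
rep = propagate_reputation({"comment": ["post"], "post": ["paper"]})
print(round(rep["post"], 2))   # 1.1  (base + 10% of the comment)
print(round(rep["paper"], 2))  # 1.11 (base + 10% of the post, i.e. 1% of the comment trickles down)
```

Note that this is deliberately naive. A document's boost is not divided among the documents it links to, so in a heavily interlinked set of documents the numbers could grow without limit rather than settling down; a real system would need some kind of normalisation.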

In this way, we get a nice solid score that we can use to decide what’s worth reading and what isn’t — the cream will naturally rise to the top. Hiring committees can throw away impact factors, and instead just add up the reputation scores of their candidates’ publications (either in the strict sense of the word, or including blog posts, reviews and/or comments). By the way, one of the positive effects of this would be that people like Darren and Jerry Harris would get some reward from their sterling reviewing efforts.

Sounds awesome? Here’s something even more awesome: we already have that system, more or less. Yes indeed: the reputation propagation algorithm I described is, in general outline, the same thing that Google does in the algorithm that it calls PageRank(tm)(r)(lol)(ymmv). We can — and already do — use Google’s notion of reputation as a guide to finding what’s worth reading, and we can tell that it works well in practice because SV-POW! posts rank so highly :-)

So that’s it! We can all stop worrying, just Google for stuff we’re interested in, and read whatever pops up at the top of the list!

Are you convinced? I hope not, because this idea has at least three huge problems.

1. What counts? (Yes, that again.) Google-ranking works well for blog posts, because they are web pages, and Google can spider web pages. But that leaves out reviews, because they are typically not published at all, let alone as web pages. And it leaves out comments, because they are appended to the end of blog posts rather than being pages in their own right, with their own PageRank. And, worst of all, it pretty much leaves out the papers themselves — because there is, in general, no one single web-page which is The Place a particular paper lives. For non-open papers that aren’t hosted on the author’s page or elsewhere, there is no page. In short, reviews are not published, comments are not whole pages and papers are not single pages, so none of them is properly page-rankable.

2. All links count as positive reputation — there are no negative citations. So a document that says “Taylor, Wedel and Naish 2009 was talking a lot of nonsense about sauropod neck posture” would still be a score in our favour, even though it meant the exact opposite. Of course, this is not a new problem: both PageRank and Impact Factors suffer from it, but it doesn’t seem to be a killer for either of them. The only fix for this would be to invite authors (of papers, reviews, blogs and comments) to explicitly score some or all of the other documents they mention — and I doubt people are going to be keen to do that unless the mechanism can be made very non-intrusive.

3. And here’s the killer: we wouldn’t, or shouldn’t, want Google to do this, even if they could overcome problems #1 and #2. Google is a private corporation, and we don’t want to hand over reputation management to any private commercial venture with an obligation to shareholders rather than scientists, and with a proprietary secret algorithm. If you doubt me, consider Thomson’s ownership of the Impact Factor and see where that’s got us. No doubt when Eugene Garfield came up with the idea of the Impact Factor, he was pretty excited about how — at last! — we would have an objective, reliable way to evaluate science. But IF is not run by scientists, it’s run by a corporation. With hilarious results.
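As an aside on problem 2: the explicit-scoring fix can at least be sketched, even if getting people to use it is the hard part. The Python below is a hypothetical illustration only. It assumes each link carries an author-supplied score between -1.0 (talking nonsense) and +1.0 (wholehearted endorsement), and it assumes that a document whose own reputation has gone negative carries no weight at all; the function name `propagate_signed` and the data layout are invented for the example.

```python
def propagate_signed(links, base=1.0, boost=0.10, rounds=20):
    """Reputation propagation where each link has an explicit score.

    links: dict mapping each linker to a list of (linkee, score) pairs,
           with score in [-1.0, 1.0]. This layout is an assumption.
    """
    docs = set(links)
    for targets in links.values():
        docs.update(linkee for linkee, _ in targets)

    reputation = {doc: base for doc in docs}
    for _ in range(rounds):
        updated = {doc: base for doc in docs}
        for linker, targets in links.items():
            # Assumption: a linker with negative reputation has no influence,
            # so its criticism cannot turn into an accidental boost.
            weight = max(reputation[linker], 0.0)
            for linkee, score in targets:
                updated[linkee] += boost * score * weight
        reputation = updated
    return reputation


# Two favourable mentions and one hostile one.
rep = propagate_signed({
    "critic": [("paper", -1.0)],
    "fan": [("paper", 1.0)],
    "fan2": [("paper", 1.0)],
})
print(round(rep["paper"], 2))  # 1.1: the hostile mention cancels one friendly one
```

Under plain unsigned linking, all three mentions would have pushed the paper's reputation up, which is exactly the problem described in point 2.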

I have no idea what the conclusion to all this is. I didn’t have a clear idea where it was headed when I started writing it. But, much in the manner of Dirk Gently when employing his usual method of navigation, I may not have ended up where I intended to, but I’ve arrived somewhere interesting.

Your move: what have I failed to take into account?

31 Responses to “Yet more uninformed noodling on the future of scientific publishing and that kind of thing”

  1. Matt Wedel Says:

    So that’s it! We can all stop worrying, just Google for stuff we’re interested in, and read whatever pops up at the top of the list!

    Are you convinced?

    Nope, and not for any of the reasons you listed, although they are all good. IMHO the killer is not that Google is a private corporation. It’s that nobody except _possibly_ Google would have the ability to impose this system on the web, and without some kind of global imposition a lot of people just won’t play. Or they’ll try to game the system. Even if everyone is honest, which we know better than to expect, some people will opt in and others will opt out (in the more realistic scenario where global imposition of the scheme is impossible).

    So we’ll have substituted one system of “respectable” insiders (who are respectable not because of where they publish so much as the fact that they participate in a system that makes “respectability” measurable) with unregulated outsiders. It’s a digital Maginot Line.

  2. Vertebrat Says:

    Is that Buffy the Brachiosaur Slayer? :)

    Reddit and its derivatives have up- and down-voting on comments, which provides an easy, unobtrusive way for readers to give feedback. I wish that were available on blogging platforms too.

    As for the problem that critical citations count positively: it seems to me that this should be a problem but isn’t in practice, because negative cites still attest to the importance of the work cited. Papers often cite work they disagree with, but they don’t usually cite uninteresting work they disagree with. Channels with lower barriers to publication will need to sort out the different kinds of cite in some other way. (Actually, I wish they were distinguished in papers, too – references are much more useful when you know which sources are recommended reading, which are cited only for credit, and which are cited negatively. Some old papers have annotated reference lists, but this useful custom seems to have disappeared as reference formats became formalized.)

    By the way, are Camarasaurus verts actually diagnosable to species, as the caption suggests?

  3. Matt Wedel Says:

    Long as I’m here, I should point out that numerical approaches to reputation are not without their critics, completely apart from the counterarguments listed in the post.

    OTOH, I don’t have any better ideas. Possibly because I’m not trying. Call it naive faith, but I think reputation will work itself out without a technical fix. When you think of a particular worker today, do you instantly recall their average IF or their H score, like you might a pro athlete’s stats? Or do you have an unquantifiable but nevertheless fairly accurate idea of “oh, yeah, she does good work” or “that guy is a frikkin’ tool”? Numeric reputation proxies are for bean-counters and always have been, and they suck, and they probably always will.

  4. Andy Says:

    Maybe the one small benefit of numerical measures of “impact” and so forth is that it can be quite useful for those just starting out in the field (especially in absence of good guidance or advice) or picking up from another field. After all, I can think back to some pretty awful books and papers (and DML posts) that I naively thought were hot science. . .

    Of course, any tools such as impact factor will be misused by “tools.”

  5. DDeden Says:

    yo doods, when i’m look’n fer reeeally big tetrapods, i just google “awesome”, it sends me right here to svpow every time! hakuna matata!

  6. Matt Wedel Says:

    Now I’m starting to feel the same despair–of keeping up with all the good stuff–that got us mentioned at Bioephemera last week.

    The idea of using Google’s PageRank algorithm on citations has already been the subject of a paper–on arXiv, appropriately enough.

    Some of the discussion points here and in the previous posts about “sub-paper” levels of publication are also raised in “Micropublication in Chemistry”, which is short and well worth reading.

    I found these thanks to Cameron Neylon, who is probably having great fun watching us blunder through stuff the rest of the world has been talking about for months.

  7. Mike Taylor Says:

    Matt worries that “we’ll have substituted one system of “respectable” insiders (who are respectable not because of where they publish so much as the fact that they participate in a system that makes “respectability” measurable) with unregulated outsiders.”

    And this would be a problem why?

    At the moment, we have “insiders” and “outsiders”. As I know as well as anyone, an outsider who cares enough can become an insider; but there are plenty of outsiders who, instead of sucking it up, writing manuscripts and submitting them to respected journals for review, prefer to invest their time in bitching and moaning about how there is an Ivory-Tower Conspiracy that’s out to prevent their ideas being heard. (I know because plenty of these people for some reason like to send their complaints to me.)

    If instead being an “insider” is a matter of taking the trouble to participate in some kind of numeric reputation system, then being an “outsider” will be obviously and visibly a choice — and not one that anyone can reasonably whine about. Suits me.

  8. Nick Gardner Says:

    “but there are plenty of outsiders who, instead of sucking it up, writing manuscripts and submitting them to respected journals for review, prefer to invest their time in bitching and moaning about how there is an Ivory-Tower Conspiracy that’s out to prevent their ideas being heard.”

    You know, I’m going to be honest here. All there really is to say to them is TOO BAD. If you aren’t publishing your ideas and you’re just sitting on them and whining about how no one believes you, then you aren’t doing science.

  9. Andy Says:

    For the comment referenced above, I’m going to have to side with Nick. I hear lots about the Ivory Tower Conspiracy from “outsiders” (non-bird/dino supporters, pterosaur theorists, commercial collectors, etc.). . .yet I also see that many of these folks *can* and *do* get their stuff published (and cited) with the appropriate persistence (and in some cases it’s just a matter of finding a willing journal – refer to Minotaurosaurus, for instance, or any number of papers and books by Feduccia, Rubin, and colleagues). And oftentimes, stuff is rejected from publication because it’s just not that good – e.g., a poor understanding of the literature, grossly inaccurate interpretations of morphology, unethical conduct, etc.

    To maybe redirect this a little, what are the characteristics of an outsider and an insider? What distinguishes the two? Getting people to listen to them? Foreknowledge of cool discoveries? Ease in publication or grantsmanship? Ease in getting a platform talk at SVP? Quality of research?

  10. Mike Taylor Says:

    If an “insider” is defined as “one who can get a platform talk at SVP”, I am screwed. I cannot get a talk slot at SVP, despite trying with both hardcore descriptive/comparative palaeo and big-picture evolutionary palaeobiology. (They wouldn’t even give me poster space this year, so my previous working hypothesis — that they like you to pay your dues by doing a poster before they let you loose on an audience — goes down in flames.)

    So let’s use a different definition of “insider” :-)

  11. Andy Says:

    Well, there are “insiders” and there are “Insiders.” Given examples like yours and others, I suspect it can be important to belong to the latter category.

    At any rate, let’s consider this definition: an insider is someone who actively engages in original research, participates in the peer review process at one or more levels (as author, reviewer, or editor), and who is associated with a recognized institution (museum, university, national park or monument, etc.).

  12. Mike Taylor Says:

    That is what I’ve understood by the term … if it’s a useful term at all, that is: my feeling is that it’s only ever used by people who consider themselves outsiders, and then with some bitterness.

  13. Casey Says:

    I think there is a place for citing blogs somehow, but I think the “pers. comm.” may be the best mechanism so far. If I read something of interest on a blog that’s new and relevant, I’d be inclined to write to that person to rediscuss the topic and how to make sure they’re treated appropriately. I don’t need a new mechanism, I don’t think. Many people are familiar with that process (and with examples of the lack thereof). Frankly, in the latter case, I think you have to take the punches and beat them straight up (i.e., with a nice published rebuke). There are certainly other avenues to pursue when this behavior gets out of hand and more unprofessional too. But let’s never forget the persistent whispering campaigns that are common among Vert Paleo folks (let alone other fields) that are of course, fun and informative and a good way to state your case too. Regarding appropriate citation/acknowledgment, often, perhaps too often, you’ll hear “oh, they did that to you too? lol, welcome to the club”.

    What i’ve always found interesting, aside from just the straight-up science, is the back story that always seems to go along with many publications/projects and their respective histories. I love stories about how stories came together. Which is why the Neck posture paper piqued my interest so much, because it seemed there was a lot more story than just the paper.

    Also, certain blogs do meet some level of respectability. I’m not sure if it’s this one ;) but I think that’s really a personal choice as to how well you chase down what is blogged about yourself. Unless you want a peer review process, and all of its problems, for blogs. Because indices and ranks etc. also always have problems. The same goes for real pubs from Insiders (with a capital I or not) that have big jobs, that don’t necessarily do the greatest work and consistently take advantage of other people. As noted previously on this and other blogs, and long before the blogosphere, just ’cause it’s pubbed doesn’t mean it’s necessarily good. The more informed a reader you are, the better decisions you can make. I like this blog because many of the posts are about anatomical process, such as how to interpret function: was this fossa for muscle, air, gland, fat etc. I don’t actually care much about sauropod vertebrae (sorry), but it’s the story that counts.

    I’m just going to assume that I count as an insider (that also got rejected from an oral at SVP–I checked the “if no oral, then nothing” box; I guess they meant it :). I have a univ. job, I write and get grants rejected, and I’m slowly but surely getting papers out, just like every other “insider” should, because our jobs depend on it, perhaps more so than “outsiders”. But I’ve never liked this insider/outsider/ivory tower dichotomy crap that perpetuates the DML and sometimes the Vertpaleo list (maybe this belongs on Andy’s blog–jacked!), and hopefully not here much more. I’ve worked for a couple of different museums, diff schools, and I’ve met a lot of great people in all walks of paleo (weekend hunters, garage preparators, family-trip planning enthusiasts, class trips, artists, etc. all the way up to dept. chairs, emeritus faculty, etc.) and everyone’s got some niche, be they bastards, boy/girl scouts, or somewhere in between.
    I just try to avoid the bastards, or avoid being one the best i can, and go from there.
    –Casey

  14. Mike Taylor Says:

    Casey says: “I love stories about how stories came together. Which is why the Neck posture paper piqued my interest so much, because it seemed there was a lot more story than just the paper.”

    I am guessing that there is just as much backstory to most papers as there was to this one — the only difference is that usually you don’t get to see any of it.

    When Matt, Darren and I have discussed the nature of those ten or so neck-posture blog posts that followed publication of the paper — whether they’re more like little informal papers or big pers. comm.s or whatever — we realised that we’d been thinking of that series of posts as basically being unofficial Online Supplementary Information. I think that’s a good model for this kind of material: like official OSI, it’s tied to a formal publication, and it expands on why that publication says what it does, but it is not itself citeable, at least under normal circumstances. (Does anyone here know of instances where a formal publication cited a paper’s OSI rather than, or as well as, the paper itself?)

  15. David Marjanović Says:

    If an “insider” is defined as “one who can get a platform talk at SVP”, I am screwed. I cannot get a talk slot at SVP,

    Nobody can get a talk slot at SVP meetings anymore.

    My abstract for this year’s was accepted as a poster. I wrote to Jason Head, asking him if the situation was really that bad, and detailing at length how controversial the subject (the origin(s) of the extant amphibians) was and how much had been published on it in the last few years, and he wrote back “yes, it is that bad” — he had heard my talk last year and enjoyed it, but this year they got 100 more submissions for oral presentations than the new & increased number of slots for them. I’ll have to try a poster. :-(

    (They wouldn’t even give me poster space this year,

    :-o
    *shock*

    Will you be able to attend the meeting at all, then?

    so my previous working hypothesis — that they like you to pay your dues by doing a poster before they let you loose on an audience — goes down in flames.)

    I applied last year for the first time and, as mentioned, immediately got an oral presentation.

    (Does anyone here know of instances where a formal publication cited a paper’s OSI rather than, or as well as, the paper itself?)

    Yes, I’ve cited some of the 11 (!) online appendices of my 2007 paper in addition to the main text, and I’ve cited appendix 16 of this dissertation.

  16. Matt Wedel Says:

    By the way, are Camarasaurus verts actually diagnosable to species, as the caption suggests?

    Sorry to not get to this sooner. I suspect that these verts are identified as Camarasaurus because they were found articulated or associated with definitive Camarasaurus remains. Mike can probably say for sure. According to Wilson and Mohabey (2006), sauropod axes differ among taxa but usually not in phylogenetically useful ways.

    Wilson, J.A., and Mohabey, D.M. 2006. A titanosauriform (Dinosauria: Sauropoda) axis from the Lameta Formation (Upper Cretaceous: Maastrichtian) of Nand, central India. Journal of Vertebrate Paleontology 26:471–479.

  17. Mike Taylor Says:

    But Matt, Vertebrat’s question was whether the atlas-axis is diagnosable to the species level — not whether it’s truly Camarasaurus but whether it’s C. grandis and not, say, C. supremus.

    And, Vertebrat, the answer is that I don’t know. All I’ve done here is reproduce the species that the specimen is catalogued under. I’ve never really looked into the species-level taxonomy of Camarasaurus — in fact, no-one has really addressed this matter using a phylogenetic approach analogous to the way Upchurch et al. (2005) approached the species-level taxonomy of Apatosaurus. As Darren and I mentioned in the Xenoposeidon paper (Taylor and Naish 2007:1555), “morphological differences between specimens suggest that the genus may have been over-lumped”, but that’s as far as I’m prepared to venture without doing a lot more work first.

    By the way, I’m not aware of anyone else out there working on Camarasaurus at the moment — that is, working on it as a taxon of interest in its own right, rather than using its fossils to work on other problems. There’s a nice big project there for someone who wants it.

  18. Matt Wedel Says:

    But Matt, Vertebrat’s question was whether the atlas-axis is diagnosable to the species level — not whether it’s truly Camarasaurus but whether it’s C. grandis and not, say, C. supremus.

    Yeah, I got that. My point was that the ID, at whatever level, is almost certainly attached to some more diagnostic elements, and the axis/atlas complex is just riding along. The species level ID might be a guess, or optimistic reaching, or–most likely–made by Jack McIntosh, who has probably spent more time in that collection than anyone else, and knows more about Camarasaurus species than anyone else.

  19. LeeB Says:

    Didn’t somebody by the name of Ikejiri do work on Camarasaurus and find their dorsal vertebrae to be specifically diagnostic?

    LeeB.

  20. Mike Taylor Says:

    LeeB, I assume you’re referring to the Ikejiri, Tidwell and Trexler chapter in “Thunder Lizards”? I had that book in mind when I wrote “… working on it as a taxon of interest in its own right, rather than using its fossils to work on other problems”: it has four or five Camarasaurus papers, but they’re not about Camarasaurus, they’re about other topics — such as ontogenetic variation — and Cam just happens to be the taxon used in the study (for the good reason that there are so many more specimens of it than of anything else). It’s true that the Ikejiri et al. chapter has some comments on species determination, but it doesn’t choose to take on that particular problem head on.

  21. Casey Says:

    Mike Said—I am guessing that there is just as much backstory to most papers as there was to this one — the only difference is that usually you don’t get to see any of it.—

    I think you’re right Mike, and I like it.

    As for SVP, one can’t argue with the growing number of submissions. That’s great. And Jason is good about responding to questions about rejections; he responded quickly and thoroughly to me too. Because this year’s meeting co-occurs with SVPCA, which likely contributed to the large # of submissions, it should be excellent, and a bummer to miss. But also there may have been a good argument for an added extra day because of the joint nature, large submission rate, and other “firsts” of the meeting.

    Casey


  22. There is some unhappiness over here in the UK at the way SVP has swallowed SVPCA whole — although it is, just, being branded as a joint SVP/SVPCA meeting, I think you’re going to have to look very hard to see any SVPCA influence. To be sure, that’s not SVP’s fault — it’s inevitable when a meeting of 4000 people collides with one of 125 people. SVPCA attendees took a vote on whether to merge the meeting this year with SVP or with EAVP, which is much closer in size to SVPCA — it amazed me, and still does, that the vote went in favour of the SVP merge, but there you go.

    Anyway, none of this changes the fact that I, like most British palaeontologists, am delighted to have the meeting happening on our doorstep — really close, in my case, since the SVP meeting in Bristol will be only thirty miles from my house. It’s literally a once-in-a-lifetime thing, so I won’t be letting a little thing like total exclusion from the scientific program spoil my enjoyment.

  23. LeeB Says:

    Mike,

    I found what I was thinking of.
    There is an abstract by Ikejiri from a 2002AM GSA conference on the internet.
    It is paper No. 187-18 called biostratigraphic and geographic distribution of Camarasaurus (Dinosauria, Sauropoda) from the Morrison formation.

    In it he says that dorsal vertebrae of Camarasaurus are their most diagnostic features.

    LeeB.


  24. […] by Mike Taylor on the issue of crediting qualitative contributions of a scholarly nature titled “Yet more uninformed noodling on the future of scientific publishing and that kind of thing…”. The blogosphere seems to have more of these “rating blog posts/comments” discussions […]


  25. […] in the context of the debates / random thoughts going on at SV-POW! right now (try, here, here and here for […]


  26. […] 20, 2009 Weren’t we just discussing the problem of keeping up with all the good stuff on da intert00bz? The other day Rebecca […]


  27. What is YPM 1905? I cannot understand. Please tell in detail.

  28. Mike Taylor Says:

    insider outline, do you mean that you don’t know what the term “YPM 1905” means (it means specimen #1905 of the Yale Peabody Museum collection), or that you don’t know what kind of bone it is (the atlas and axis are the first two vertebrae of the neck)?


  29. […] lot (see discussion here), and the fusion of the atlas to the axis is not unheard of (see here and here), fusion of the middle or posterior cervicals is rare. Which makes intuitive sense–presumably […]


  30. […] is the value, then? Well, I’ve mentioned Jerry Harris several times before as someone whose reviews are full of detailed, helpful comments that really do improve […]


  31. […] Yet more uninformed noodling on the future of scientific publishing and that kind of thing […]

