Unhappy thoughts on student projects at SVPCA 2015
September 4, 2015
THIS POST IS RETRACTED. The reasons are explained in the next post. I wish I had never posted this, but you can’t undo what is done, especially on the Internet, so I am not deleting it but marking it as retracted. I suggest you don’t bother reading on, but it’s here if you want to.
There were some surprises in the contents of the SVPCA programme this year. Sauropods were woefully under-represented with only two talks (mine on apatosaur neck combat and Daniel Vidal’s on the range of movement of the tail of Spinophorosaurus). In fact non-avian dinosaurs as a whole got short shrift, with two theropod talks, three ornithischian talks and one on dinosaur diversity. This is partly, of course, because so many dinosaur workers among the SVPCA mainstays were absent for one reason or another: Matt Wedel, Paul Upchurch, Paul Barrett, Richard Butler, Roger Benson, Steve Brusatte, David Norman, the list goes on.
But that’s OK. I’ve often found, to my surprise, that the dinosaur talks aren’t always my favourites anyway. (Oddly enough, fish talks can quite often catch my imagination; and pterosaurs are always good for a laugh.)
A more surprising development was the complete absence of any finite element analysis this year — a technique that was crazy trendy a couple of years ago, but seems to have come to the end of its fashion cycle.
Instead, I felt that the talks were strongly dominated by one technique: principal component analysis (PCA). As a technique, I have mixed feelings about it: I don’t go as far as John Conway, who as far as I can tell thinks it’s almost literally meaningless. But I have strong reservations about the plug-and-play way it seems to get used for pretty much everything at the moment, and how very tenuous some of the inferences are that people derive from their morphospace plots. It’s difficult to be specific without criticising individuals, which I’d like to avoid doing. But I do think that when we draw sweeping and heterodox conclusions about an animal’s lifestyle from a PCA of a single facet of a single bone, the validity of that conclusion is, to put it politely, open to question.
In fact an awful lot of the projects presented in this year’s talks seemed to follow the same template. In an idle (and, yes, unnecessarily snide) moment, I sketched an Automatic Masters Project Generator for lazy supervisors. You just throw four dice, then pick your technique name, body-part, period and taxon from these tables:
Table 1: roll 1d6 for a technique
- 2d landmark analysis
- principal component analysis
- geometric morphometrics
- morphospace analysis
- finite element analysis
- ecomorphological diversity analysis
Table 2: roll 1d6 for a body part
- quadrate
- mandible
- sacrum
- pelvis
- ulna
- astragalus
Table 3: roll 1d6 for a period
- Permian
- Mesozoic
- Jurassic
- Late Cretaceous
- Eocene
- Miocene
Table 4: roll 1d6 for a taxon
- lamnid sharks
- sauropterygians
- ornithopods
- corvids
- mustelids
- golden moles
Try it yourself! Morphospace analysis of the ulna in Miocene mustelids! Ready, steady, go! Your Masters degree will be ready as soon as you can talk reasonably coherently about this combination for fifteen minutes and leap to an obvious but weakly supported conclusion based on vague shapes drawn on a PC1-vs.-PC2 plot that captures only 32.6% of the variation!
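For anyone without four dice to hand, the generator can be sketched in a few lines of Python. This is only a minimal sketch: the tables are copied from the lists above, but the function names (`generate_title`, `roll_title`) and the "technique of the body part in period taxon" template are my own assumptions, modelled on the example title.

```python
import random

# Tables copied from the post; a d6 roll of 1-6 maps to list index 0-5.
TECHNIQUES = [
    "2d landmark analysis",
    "principal component analysis",
    "geometric morphometrics",
    "morphospace analysis",
    "finite element analysis",
    "ecomorphological diversity analysis",
]
BODY_PARTS = ["quadrate", "mandible", "sacrum", "pelvis", "ulna", "astragalus"]
PERIODS = ["Permian", "Mesozoic", "Jurassic", "Late Cretaceous", "Eocene", "Miocene"]
TAXA = [
    "lamnid sharks",
    "sauropterygians",
    "ornithopods",
    "corvids",
    "mustelids",
    "golden moles",
]


def generate_title(rolls):
    """Build a title from four d6 rolls: technique, body part, period, taxon."""
    if len(rolls) != 4 or any(r not in range(1, 7) for r in rolls):
        raise ValueError("expected four rolls, each between 1 and 6")
    tables = (TECHNIQUES, BODY_PARTS, PERIODS, TAXA)
    technique, part, period, taxon = (t[r - 1] for t, r in zip(tables, rolls))
    # Assumed template, chosen to match the example title in the post.
    return f"{technique[0].upper()}{technique[1:]} of the {part} in {period} {taxon}"


def roll_title(rng=random):
    """Throw four dice and return the resulting title."""
    return generate_title([rng.randint(1, 6) for _ in range(4)])


print(generate_title([4, 5, 6, 5]))
# Morphospace analysis of the ulna in Miocene mustelids
```

Passing a seeded `random.Random` instance to `roll_title` makes the throw repeatable, which may be useful if a supervisor wants to hand out the same project twice.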
(To be clear: I am not saying that PCA is intrinsically worthless. As I found myself repeatedly arguing in pubs with John and others, it’s evidently a very powerful tool for discovering correlations. For me, it goes wrong when very weak results pop out, but are given a veneer of respectability and objectivity because a computer was involved in the process.)
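The "captures only 32.6% of the variation" figure in the joke above is the share of total variance carried by the first two principal components. Here is a minimal sketch of how such a share can be computed, assuming NumPy is available. The function name `explained_variance_ratio` and the toy data set are hypothetical illustrations, not taken from any of the talks.

```python
import numpy as np


def explained_variance_ratio(X):
    """Return the fraction of total variance carried by each principal component.

    X is an (n_specimens, n_measurements) array. The columns are mean-centred,
    and the squared singular values of the centred matrix are proportional to
    the variance along each component.
    """
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centred, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


# Hypothetical toy data: six "specimens", three "measurements", already
# centred on zero, with most of the spread along the first axis.
X = np.array([
    [2.0, 0.0, 0.0],
    [-2.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, -1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, -1.0],
])

ratios = explained_variance_ratio(X)
pc1_plus_pc2 = ratios[:2].sum()
print(round(pc1_plus_pc2, 3))
# 0.833
```

In this toy case the first two components carry about 83% of the variance, so a PC1-vs.-PC2 plot shows most of what is going on. When the same sum comes out near a third, most of the variation lies along axes that the plot does not show.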
As the week went on, I found myself worrying increasingly about these projects. It’s not just that they are (with a few creditable exceptions) samey to listen to and uninteresting in their results. I worry more that these projects kill the interest of the people who take them on. I may be reading my own biases back into my observations here, but it seemed to me that I detected a distinct lack of enthusiasm in several of the speakers, and my hunch is that for a lot of them this will be their first and last SVPCA. They presumably went into palaeo because they loved some specific extinct taxon; instead, they found themselves spending a year staring at a hundred almost identical photographs of moodily lit tubes of toothpaste. And really, if anything is going to kill the passion of a pterosaur lover stone dead, it’s taking measurements of the distal articular facets of the ulnae of 154 Miocene mustelids.
So I found myself longing for more talks about taxa, and about ideas, rather than techniques. Most obviously, there was very little pure descriptive palaeontology to be seen this year. But also, our own talk aside, very little of what I would think of as exploratory work — thinking about structures, chewing through their implications, considering alternatives. In short: the fun stuff. I would hate palaeontology to be reduced to a process of harvesting data from specimens (looking only at the aspects needed to fill in the matrix), pouring that data into a sausage machine, and turning the handle until something statistically significant comes out.
We have to be able to offer grad-students more than that. We are, and I say this with all due objectivity, in the most exciting science in the world. People go into palaeo because they love it. I wouldn’t like to think they go straight back out of it, as soon as they have their higher degree, hating it. We need to get students looking at and thinking about and discussing actual specimens: proposing ideas, arguing about them, running into reasons why they might be wrong, figuring out why they might be right after all, putting together an argument. Not sitting in front of computers full time running t-tests.
Of course there is a role, and an important one, for numerical methods. But they have to be the means, not the end. We have to have a more interesting goal than finding a statistically significant correlation. Otherwise we’re going to lose people.
September 4, 2015 at 3:02 pm
I’ve definitely seen this kind of talk many times and I think you’re right that many students would be happier doing some more exploratory or descriptive work. From what I’ve discussed with my advisor, part of this comes from the difficulty of getting funding for exploratory or descriptive work. In the US at least, competition for grants in paleontology has gotten to the point where you have to have specific testable hypotheses stated to stand a chance. PCA and other analyses are cheap, relatively easy to accomplish, and provide a specific end point that you can point to in a grant or progress report.
None of those points are any good for science or students, but unfortunately that doesn’t make them any less true.
September 4, 2015 at 3:03 pm
I must be a hard ass. I expect students to come with their own project and to be able to define why. If they come asking for a project I send them away to come back once they have identified what interests them.
September 4, 2015 at 4:36 pm
I worry less about the abundance of limited scope studies using one small subset of biomechanical analysis, which surely could be condensed with other parts into a whole. I worry more about the dismissal of the biomechanical analyses themselves as neither valuable nor relevant. It is easy for one to say that a given system seems or looks or resembles such and so, but to not do work that demonstrates this or to say that the work that goes into properties of systems and their relationships to shape, strain on loading at various angles, etc, is something else. That, too, is data; and in fact it is more data than the suppositions I see consistently published as such.
I am biased: as a lover of biomechanics, I cannot help but see the value of even boring work that goes into deep strain analysis of bone tissue, limbs, vertebrae (something that would be relevant given your talk on apatosaur necking, where looking at relative effects of muscle strain would or would not account for the shapes of vertebrae) and their overall morphology.
I’m reminded more of Frank Lloyd Wright who designed many pretty buildings, but had the worst sense of engineering, and tried to force designs without regard to the costs and properties of the materials involved, leading to some expensive and destructive results. He’s a good artist, but a terrible builder.
—
Art about animals, whether it’s descriptive or illustrative, is and should always be the end product of a subject considering biomechanical properties. The meaning behind systems, rather than just the systems as they are. You focus a lot on comparisons, but without underlying biomechanical relationships you don’t justify the particulars of the relationship. As an example, take the convergent features of osteoderms, skull roof ossification and fusion, rib expansion, and loss of cranial openings in the lineage leading up to turtles.
The short of this all is: It is not enough to consider that it happened, when the relative weight of it happening can be weakened by being plastically easy to reverse, suggesting problems with the strength of weighting or unweighting characters. This is an issue in other parts of phylogenetics as well, and it crops up in a lot of “interesting” descriptive work.
September 4, 2015 at 8:04 pm
Not meaning to derail the discussion, but masters students can now generate a title for their paper. http://www.generatorland.com/usergenerator.aspx?id=11206
September 4, 2015 at 8:31 pm
Jaime writes:
Well, I do agree. As I hope was clear in what I wrote above, I think that PCA and most of the other techniques that we’re seeing so much of do have their place as tools that help us to evaluate our hypotheses more effectively and rigorously. When kids ask me what they should be doing now to become palaeontologists later, the first thing I tell them (after study hard in school) is to take any stats course that’s offered.
The point is that the techniques we’re talking about here are, properly considered, tools that allow us to bring rigour to our thinking. They are not intended to become substitutes for thinking — but that is the scenario that I fear.
I would rather say more or less the converse of this. I think that good science begins with “art” — that is, informed speculation, the back and forth of ideas, their refinement and modification. That stuff is the raw material that the tools (PCA, cladistics, etc.) allow us to work more effectively with. Start with the ideas; then move to the techniques. If all you know is techniques, how are you ever going to have ideas?
September 4, 2015 at 9:26 pm
Finite element analysis seemed to make sense and I could easily imagine ways it could be useful in estimating the capabilities of extinct fauna. Principle component analysis made my eyes glaze over so fast I could scarcely believe it. And my house is at 1st and Euclid.
I agree that science should be more fun than this. Orthogonal transformation into linearly uncorrelated values? (What?) Is there a sand box nearby that I can stick my head into? A cat box will do.
September 4, 2015 at 10:21 pm
*Principal* component analysis
September 4, 2015 at 10:24 pm
Thanks, David; now fixed.
September 6, 2015 at 12:24 pm
IMHO (here, take this salt), it is difficult for advisors to have grad-students do the more “arty” work – the descriptive work, the thought exercises, the exploratory biology – because that kind of work has a very difficult time being published.
As jamesboy2013 above stated, grant competition is increasingly fierce, and studies/projects without specific analyses (and specific “concrete” applicable results) are going to have a hard time being funded, especially considering the funding rate of some of these grants. Publications are the same – it is increasingly difficult to get a paper published that doesn’t have analysis on top of analysis. You HAVE to use a computer to run SOME KIND of stats SOMEWHERE… you’ve gotta have at least one p-value floating in your paper to publish! Descriptive, exploratory biology is a very hard sell (I am brought to mind of M. Witton’s recent paper on the locomotor abilities of basal pterosaurs, and the difficulty he had in getting it published, even with an analysis – I recall a reviewer asking him to cut the rest of the “non-sciency” stuff out of it [paraphrasing there of course]).
I’ve run across this myself, with my first (and to date only) paper (shameless plug here: http://palaeo-electronica.org/content/2015/1231-drimolen-makondo-fauna ). It is a purely descriptive account of specimens from a new Pliocene site that is adjacent to a hominin-bearing cave deposit. We spent half a year shopping it to various journals, each time being turned down due to the lack of analysis – even with the journal that eventually accepted it we were told that there is “scant science” in the paper. Well, at least the specimens are now out and published on. These are the basal building blocks of my (and hopefully others’) analyses – if you want to do PCA while comparing changes in pedal morphology in closely-related taxa with different locomotor styles, we have a hunting-hyaena foot. You wanna look at big cat metatarsals? We can help you with that. Changes in primate facial structure? We’ve some monkey faces. So on, and so forth.
Anyway, I digress. It’s difficult to get work done and out there without these computer analyses. Grants and publications are becoming increasingly hard to get, increasingly narrow in their scope, and thus promote an increasingly monotypic format/response from the scientists.
That’s my two cents.
September 6, 2015 at 2:44 pm
Astronomer here: I can see the same sort of trend in my field. PCA has been used for a decade or so, sometimes with good reason, sometimes not. In a wider sense, the number of papers which use statistical analysis to make some (usually weak) claim about a large dataset has certainly increased.
One reason may be that in astronomy, the size and scope of freely available databases has increased enormously. When there are billions and billions of stars and galaxies with 5 or 10 or 20 measured parameters (a few of which may actually be statistically significant, but who cares about that?), the number of questions that one can ask and answer using those numbers is, well, almost infinite.
Moreover, it is much cheaper, quicker, and safer to tell a student “grab some numbers from these catalogs and run a statistical analysis” than it is to say “apply for time to make some measurements yourself, and hope that the skies are clear that week.” One can easily generate a Ph.D. dissertation in 4 years when there is no risk of failing to win telescope time, or failing to acquire the measurements …. or having to spend the time to learn the techniques by which scientists actually use instruments to make measurements.
Yes, we’re raising a generation of students who simply grab numbers from the Internet, implicitly trusting them all to be correct and meaningful, and generate a series of statistical tests that may be vaguely useful.
To be fair, it may be easier for a newly minted Ph.D. to get a job if she can show expertise using R to run sophisticated statistical analyses of giant datasets (as used in financial markets or automobile engineering or …), than if she can say “I know how to adjust the alignment of the Declination axis of a telescope, and how to compute the proper exposure time for high-redshift galaxies.”
September 6, 2015 at 3:48 pm
Well, StupendousMan, as a fully paid-up open-data advocate, I am all in favour of making large data-sets available for people to work with. I suppose the question about the astronomers you’re referring to here is: are they learning how to ask interesting questions, how to determine which techniques are appropriate for answering those questions, and how to interpret the answers they get (including “no signal”)?
September 6, 2015 at 5:13 pm
(The penultimate paragraph in my previous post was written with tongue-in-cheek, surrounded by angle-brackets surrounding “Curmudgeon”, but apparently WordPress takes them seriously. Whoops).
I thought that my point — more and more young scientists are doing research at a greater remove from the collection of measurements — was similar to the point of view expressed by you and several other commenters. Perhaps I misunderstood; was your discontent centered on the particular type of analysis chosen by the students at the conference? If instead of using PCA, they had written papers based on, oh, least-squares fitting as a method to create classifications, would you have been happier?
One of my concerns is that there will come a day when students (and many of their advisors) will be unable to evaluate the reliability of the numbers in those on-line databases.
Your last question — “how to interpret the answers they get (including “no signal”)” raises a new troubling thought: if any scientific community, astronomy or paleontology, is driven by the pressure to publish ‘interesting’ papers, what will happen to those many, many projects which end up yielding null results? Will advisors steer students toward safer topics and so neglect large fields of study? Will students and advisors decide that they need to “spice up” their results? Will journal editors increasingly reject solid papers with null results in favor of provocative but unsound ones?
Ugh. It’s too easy to fall into depression when contemplating the future. Much better to go outside and play.
September 6, 2015 at 7:45 pm
Well, there are several different factors at play here. One is using raw data from pre-existing sources rather than generating your own. In general, I am in favour of that in many contexts — although there is no substitute for looking at a specimen yourself. But what worries me most is the possibility of methods being applied indiscriminately.
But I’ll say no more on that at the moment, as I have another post coming up soon.
Your penultimate paragraph this time raises another very serious issue: the difficulty (until recently) of getting negative results published results in a tremendous waste of effort. It’s one of the key things we need to change. “Pelvis shape does not correlate with arboreality in lizards” is a perfectly good finding, just as worthy of publication as its converse. Scientists should not be penalised for the results of their work, but rewarded for the quality of what they did to get those results.
September 6, 2015 at 7:46 pm
On a similar note – I’ve noticed that there are fewer and fewer students in VP these days doing collections or field-based research. Few students actually getting into museum collections, poking around, finding neat stuff nobody’s looked at, working it up – let alone collecting/preparing/curating/publishing their own material, from field to print. I can’t help but notice that this may be correlated with the rise of paleobiology, and the endless treatment of fossil specimens as context-less data points. At one particular unnamed but prominent institution with a strong focus on paleobio and an affiliated, equally prominent museum, collections managers have lamented to me that they rarely see students actually entering/exploring/utilizing collections. Students are certainly learning analytical techniques and that’s fine – but paleontologists should be well-rounded, and know how to interpret a geologic map, record a stratigraphic column (or at least be able to interpret one properly), know basic field and lab techniques, know something about taphonomy, as well as basics like proper handling (or curation) of specimens in collections (like, don’t throw away fossils when you’re done with them, or drop them onto hard surfaces), and of course anatomy, phylogeny, and all that other important stuff.
What’s going on here? Surely, crunching numbers is great and learning the proper technique for testing a hypothesis is important but there’s more to paleontology than analytical techniques alone. I didn’t get into vertebrate paleontology to spend my life doing analytical research without touching a real fossil. I imagine others got into this field as well for some sense of adventure and discovery, the thrill of finding something new and getting hopelessly lost and finding your way back again – if I wanted to spend my life crunching numbers I’d at least become an accountant and make a halfway decent living. No offense to software engineers like Mike, of course! Has anyone else noticed this trend? I’ve seen a few opinion pieces on this very subject in J. Paleo.
September 7, 2015 at 9:04 pm
[…] last post (Unhappy thoughts on student projects at SVPCA 2015) was stupid and ill-judged. As a result of very helpful conversations with a senior palaeontologist […]
September 9, 2015 at 8:05 am
All I’ll say is, as a hopefully-quite-soon-to-be grad student, all these comments about descriptive and exploratory work being extremely difficult to get granted or published are worrying…