That paper that says women are better coders than men but are judged on their gender? It doesn’t say that at all

February 20, 2016

As a long-standing proponent of preprints, I am bothered that, of all PeerJ’s preprints, by far the one that has had the most attention is Terrell et al. (2016)’s Gender bias in open source: Pull request acceptance of women versus men. Thanks in part to a misleading abstract, we’ve been getting a flood of misleading headlines.

But in fact, as Kate Jeffrey points out in a comment on the preprint (emphasis added):

The study is nice but the data presentation, interpretation and discussion are very misleading. The introduction primes a clear expectation that women will be discriminated against while the data of course show the opposite. After a very large amount of data trawling, guided by a clear bias, you found a very small effect when the subjects were divided in two (insiders vs outsiders) and then in two again (gendered vs non-gendered). These manipulations (which some might call “p-hacking”) were not statistically compensated for. Furthermore, you present the fall in acceptance for women who are identified by gender, but don’t note that men who were identified also had a lower acceptance rate. In fact, the difference between men and women, which you have visually amplified by starting your y-axis at 60% (an egregious practice) is minuscule. The prominence given to this non-effect in the abstract, and the way this imposes an interpretation on the “gender bias” in your title, is therefore unwarranted.
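For anyone unfamiliar with the statistical compensation Jeffrey mentions: when a dataset is repeatedly subdivided and each slice is tested separately, the standard remedy is a multiple-comparison correction such as Bonferroni, which shrinks the significance threshold in proportion to the number of tests run. A minimal sketch, with p-values invented purely for illustration:

```python
# Bonferroni correction: the p-values are invented, purely to show the mechanics.
p_values = [0.04, 0.01, 0.20, 0.03]   # one test per subgroup split
alpha = 0.05                          # uncorrected significance threshold

# With four tests, each p-value must beat alpha / 4 = 0.0125.
threshold = alpha / len(p_values)
significant = [p for p in p_values if p < threshold]

print(significant)  # only 0.01 survives; 0.04 and 0.03 no longer count
```

Under the uncorrected threshold, three of these four invented p-values would look significant; after correction only one does. That shrinkage is exactly the compensation that repeated subdividing of the data calls for.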

And James Best, in another comment, explains:

Your most statistically significant results seem to be that […] reporting gender has a large negative effect on acceptance for all outsiders, male and female. These two main results should be in the abstract. In your abstract you really should not be making strong claims about this paper showing bias against women because it doesn’t. For the inside group it looks like the bias moderately favours women. For the outside group the biggest effect is the drop for both genders. You should hence be stating that it is difficult to understand the implications for bias in the outside group because it appears the main bias is against people with any gender vs people who are gender neutral.

Here is the key graph from the paper:

[Figure 5 from Terrell et al. (2016).] (The legends within the figure are tiny: on the Y-axes, they both read “acceptance rate”; and along the X-axis, from left to right, they read “Gender-Neutral”, “Gendered” and then again “Gender-Neutral”, “Gendered”.)

So James Best’s analysis is correct: the real finding of the study is a truly bizarre one, that disclosing your gender, whatever that gender is, reduces the chance of your code being accepted. For “insiders” (members of the project team), the effect is slightly stronger for men; for “outsiders” it is rather stronger for women. (Note by the way that all the differences are much smaller than they appear, because the Y-axis runs from 60% to 90%, not 0% to 100%.)
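To put a number on how much the truncated axis amplifies things, here is a quick back-of-the-envelope calculation. The two acceptance rates are invented for illustration, not taken from the paper:

```python
# How a truncated y-axis inflates a small gap between two bars.
# The acceptance rates below are invented, not the paper's figures.

def bar_height_fraction(value, axis_min, axis_max):
    """Fraction of the visible axis range that a bar occupies."""
    return (value - axis_min) / (axis_max - axis_min)

a, b = 0.78, 0.75  # hypothetical acceptance rates, three points apart

# On a full 0-100% axis, the bars differ by about 4% of the taller bar.
full_a = bar_height_fraction(a, 0.0, 1.0)
full_b = bar_height_fraction(b, 0.0, 1.0)
print((full_a - full_b) / full_a)   # ~0.04

# On an axis running 60%-90%, the identical gap is ~17% of the taller
# bar: the same data looks roughly four times more dramatic.
trunc_a = bar_height_fraction(a, 0.60, 0.90)
trunc_b = bar_height_fraction(b, 0.60, 0.90)
print((trunc_a - trunc_b) / trunc_a)   # ~0.17
```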

Why didn’t the authors report this truly fascinating finding in their abstract? It’s difficult to know, but it’s hard not to at least wonder whether they felt that the story they told would get more attention than their actual findings — a feeling that has certainly been confirmed by sensationalist stories like Sexism is rampant among programmers on GitHub, researchers find (Yahoo Finance).

I can’t help but think of Alan Sokal’s conclusion on why his obviously fake paper on the physics of gender studies was accepted by Social Text: “it flattered the editors’ ideological preconceptions”. It saddens me to think that there are people out there who actively want to believe that women are discriminated against, even in areas where the data say they are not. Folks, let’s not invent bad news.

Would this study have been published in its present form?

This is the big question. As noted, I am a big fan of preprints. But I think that the misleading reporting in the gender-bias paper would not make it through peer-review — as the many critical comments on the preprint certainly suggest. Had this paper taken a conventional route to publication, with pre-publication review, then I doubt we would now be seeing the present sequence of misleading headlines in respected venues, and the flood of gleeful “see-I-told-you” tweets.

(And what do those headlines and tweets achieve? One thing I am quite sure they will not do is encourage more women to start coding and contributing to open-source projects. Quite the opposite: any women taking these headlines at face value will surely be discouraged.)

So in this case, I think the fact that the study in its present form appeared on such an official-looking venue as PeerJ Preprints has contributed to the avalanche of unfortunate reporting. I don’t quite know what to do with that observation.

What’s for sure is that no one comes out of this a winner: not GitHub, whose reputation has been unfairly maligned; not the authors, whose reporting has been shown to be misleading; not the media outlets that have leapt uncritically on a sensational story; not the tweeters who have spread alarm and despondency; not PeerJ Preprints, which has unwittingly lent a veneer of authority to this car-crash. And most of all, not the women who will now be discouraged from contributing to open-source projects.

 


12 Responses to “That paper that says women are better coders than men but are judged on their gender? It doesn’t say that at all”

  1. PedroS Says:

    “I think the fact that the study in its present form appeared on such an official-looking venue as PeerJ Preprints has contributed to the avalanche of unfortunate reporting.”

    I am quite dismayed at how the publicity given to this flawed preprint may hurt the perception of preprints in general, and of PeerJ Preprints in particular. I do wonder, though, how much of the “avalanche of unfortunate reporting” is due to the “official-looking venue”. Most of it is (in my view) due to the pervasive herd mentality in science “journalism”, where every site copies every other, critical thinking (or even basic knowledge of the scientific method) is lacking, and catchy titles/overhyped findings are almost required for a study to be noticed. I strongly doubt that a dozen uncorrelated science journalists follow preprint sites daily and just happened to find this preprint at the same time.

  2. Mike Taylor Says:

    I’m sure you’re right, PedroS, that what we’re seeing is “churnalism” — the uncritical reproduction in many news venues of a story first appearing in one. And yet whatever minimal checking these venues have done will likely have started by following the link to the paper. And the fact that it’s on a “proper scholarly web-site” rather than a blog will have lent it some respectability.

    I’ll say again that I don’t know what conclusion I draw from that. I certainly don’t think that PeerJ Preprints should be making judgements about submitted manuscripts before posting them. And they already have a big red “NOT PEER-REVIEWED” banner at the top of the page, which ought to alert people to the nature of what they’re reading.


  3. […] course, the actual study demonstrated absolutely nothing of the kind. It showed that coders were judged more harshly whenever they revealed any gender. In the multiple […]


  4. This paper fits well with the pervasive narrative of “systemic sexism/racism in field X” that has been running rampant through the media over the past few years. I’m not surprised that this study would get picked up and blasted out as “yet another example of X.”

    I’m not a fan of preprints in general, and this study certainly doesn’t do much to help their reputation. I think PeerJ should make it clearer on the site that preprints represent work that is not yet ready for prime time (and thus should not be reported on). Maybe a special preprint section of the site, or something similar, where one has to go out of their way to find it rather than seeing it on the front page.

    All that said, I wouldn’t be surprised if a study like this did get published in one of the sociological journals. The kinds of things that pass for science in those journals are pretty mind-boggling.

  5. Mike Taylor Says:

    Jurassosaurus, I’m not sure what more you think PeerJ should do beyond the present policy of the big red “NOT PEER-REVIEWED” banner at the top of the page.


  6. The banner is a fine idea, and though it may be red, it is far from big. It’s very easy to overlook, especially for journalists who just skim the article (if even that). Bumping up the font by a few points would help (make it on par with, or slightly smaller than, the title print). Maybe even add the “not peer reviewed” bit to the Pre-print button.

  7. Mike Taylor Says:

    I’ve suggested this to the PeerJ guys. We’ll see what happens. https://twitter.com/MikeTaylor/status/701158456791867393

  8. PedroS Says:

    Jurassosaurus said:

    “I think PeerJ should make it more clear on the site that pre-prints represent work that is not yet ready for prime time (and thus, should not be reported on). Maybe a special pre-print section of the site, or something similar where one has to go out of their way to find it rather than see it on the front page.”

    The PrePrints section of PeerJ is a special, preprint-only part of the site. Preprints are not on the front page at all.

    “The banner is a fine idea, and though it may be red, it is far from big. It’s very easy to overlook. ”

    Every single page of the PeerJ Preprints PDF contains a header stating “NOT PEER REVIEWED” in not-small uppercase (i.e. “on par or slightly smaller than the title print”). A “journalist” will only overlook that if they actively want to.

  9. Mike Taylor Says:

    Jurassosaurus suggested: “The banner is a fine idea, and though it may be red, it is far from big […] Bumping up the font by a few points would help.” And I tweeted the idea to PeerJ.

    On Tuesday 23rd February (three days after the tweet), this change was implemented — as you can see on the preprint page that started it all.


  10. Nice! That’s much better. Kudos to PeerJ for listening to the criticism and making the improvements.


  11. […] code but were less likely to have it accepted. The truth is more complicated: here’s one discussion including the key graph. (via Heather […]

