Thursday, October 4, 2012

What about the data you never see?

Recently, I’ve been thinking a lot about fraud in science. Partly because there is a lot of it and a significant amount comes from my home country... but also because I can partly understand what drives people to commit it. I recently wrote about the experiment for the paper that is almost done, and how getting data consistent with previous results would greatly increase the chance of getting this paper into a high-impact journal. And we all need many and/or high-impact papers in order to get grants… The experiment is still not done, so I don’t know yet what the results are, but if you think about it, it is kind of ridiculous that the same amount of work will end up in a journal with a different impact factor depending on the results.

And then I recently read this blog post from NeuroSkeptic, which says that in animal research only about 50% of the results get published in academia, and only about 10% in industry. So 50-90% of the data just disappear into a drawer!

And what about all the times you read that only a certain percentage of animals learned a certain task, or that only a certain population of cells responded in a certain way? What happened to the rest?

Recently, I heard a story from a grad student in a different electrophysiology lab who had problems replicating results from a post-doc who had been in the lab. He thought he was doing the exact same things and couldn’t understand what the problem was, until the PI told him that this post-doc had very strict criteria for including cells. They had to be within a certain range of membrane potential, input resistance and variability of EPSP amplitude, but the weird thing is HE DIDN’T DESCRIBE THIS IN HIS PAPER. So it seemed like every cell he patched had shown the same response, whereas in reality the data in the paper were only from a small subset of cells. And from the grad student’s experiments it seemed like many cells did not show the reported response… In this case, selecting cells makes some sense when it comes to the health of the cell, but the bad thing is that the PI didn’t seem to know how many cells were omitted from the analysis according to these criteria. Perhaps the post-doc had never even shown these data to the PI, because he would only show the ‘good’ data. I don’t know whether the latter is true, but it got me thinking about all the experiments that have been done, but that we never hear about.

This is problematic on two levels. First, it leads to publication bias, and many people have thought about how to fix this, for example by pre-registering studies or by publishing negative results.

But I think there is also an issue at the lab level: what if people just use half of the data in their publication? Is the PI responsible for seeing ALL the data that are produced in the lab? Do we need a different system that also encourages publishing more ‘boring’ research? What are your thoughts?


  1. As someone who works with animal models, I both get what you're saying and want to defend the fact that sometimes you just can't get the experiments to work because the animals go 'off'. For instance, we recently had our animal breeding facilities moved into an all-new center... that was still in the final stages of construction, and poof, suddenly you can't replicate results any more. The animals were too stressed out by the move and the noise, breeding pairs stopped breeding, and results got funky for a while for many people (myself included) because of the stress. And what can you do? Keep testing and throwing aside the funky data until hopefully things normalize themselves. (It's been a few months and things look good again; the animals are all settled back in.)


    1. I totally understand what you mean, since much of my thesis work was animal experiments, and about half of what I did produced data of too poor quality to even write up (as in: too few animals for firm conclusions, or experiments that couldn't be replicated).
      Regarding your experiments that couldn't be replicated due to construction going on: that is actually something that you can publish, like these people did:

  2. I totally agree with you about this, especially about selectivity in what goes into papers... I think it's perfectly okay to look at subsets of cells (or whatever), but any resulting publications have to be specific about the selection criteria and give the appropriate numbers and information.