Do coffee studies make causal inference statisticians die earlier?

Alexander Breskin
Noah Haber
This week, yet another article about the association between coffee and mortality plastered our social media feeds. This trope is so common that we used it as an example in our post on LSE’s Impact Blog, which happened to be released the very same day this study was published. We helped comment on the study and reporting for a post in Health News Review, which focused on how the media misinterpreted this study. Most news media made unjustifiable claims suggesting that drinking more coffee would increase life expectancy. The media side, however, is only half of the story. The other half is what went wrong on the academic side.

To estimate a causal effect, the researchers would have needed to find a way to account for all possible reasons that people who drink more coffee might have higher or lower mortality that aren’t the direct result of coffee. For example, maybe people who drink a lot of coffee do so because they have to wake up early for work. Since people with jobs tend to be healthier than those without, people who drink coffee may be living longer because they are healthy enough to work. However, this study can’t control for everything, so what it finds is an association, but not an association that is useful for people wondering whether they should drink more or less coffee.
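The logic of this example can be made concrete with a toy simulation. In the sketch below (all numbers are hypothetical, chosen only for illustration, and have nothing to do with the actual study), employment drives both coffee drinking and survival, while coffee itself has no effect at all. A naive comparison of survival rates still makes coffee drinkers look healthier:

```python
import random

random.seed(0)

# Hypothetical setup: employment (a confounder) affects both coffee
# drinking and survival; coffee has NO effect on survival here.
n = 100_000
employed = [random.random() < 0.5 for _ in range(n)]

# Employed people are more likely to drink coffee...
coffee = [random.random() < (0.7 if e else 0.3) for e in employed]
# ...and more likely to survive, regardless of whether they drink coffee.
survived = [random.random() < (0.9 if e else 0.6) for e in employed]

def survival_rate(drinks_coffee):
    group = [s for c, s in zip(coffee, survived) if c == drinks_coffee]
    return sum(group) / len(group)

# Coffee drinkers appear to survive more often, purely via the confounder.
print(f"survival, coffee drinkers:     {survival_rate(True):.3f}")
print(f"survival, non-coffee drinkers: {survival_rate(False):.3f}")
```

Even with zero causal effect built into the simulation, coffee drinkers show a survival advantage of roughly ten percentage points, which is exactly the kind of spurious association an unmeasured confounder can produce.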

The study is very careful to use language that does not technically claim that drinking more coffee causes longer life. That makes the authors technically correct, because their study is simply incapable of rigorously estimating a causal effect, and they don’t technically claim that it does. Unfortunately, in the specific case of this study, hiding behind technically correct language is at least mildly disingenuous. Here is why:

1) The authors implied causation in their methodological approach

The analytic strategy provides key clues that it was designed to answer a causal question. Remember above where we talked about controlling for alternative explanations? If you are only interested in association (and there might be some reasons why you might want this, albeit a bit contrived), you don’t need to control for alternative explanations. As soon as you start trying to eliminate or control for alternative explanations, you are, by definition, trying to isolate the one effect of interest. This study tries to control for a lot of variables, and by doing so, tries to rule out alternative explanations for the association found. There is no reason to eliminate “alternatives” unless you are interested in a specific effect.
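To see why controlling for a variable only makes sense for a causal question, here is a self-contained sketch (hypothetical numbers, unrelated to the actual study) where employment confounds a coffee–survival relationship. The crude comparison shows a gap; stratifying by the confounder, which is the simplest form of "controlling" for it, makes the gap vanish:

```python
import random

random.seed(1)

# Hypothetical setup: employment drives both coffee drinking and
# survival; coffee itself has no effect on survival in this simulation.
n = 100_000
data = []
for _ in range(n):
    employed = random.random() < 0.5
    coffee = random.random() < (0.7 if employed else 0.3)
    survived = random.random() < (0.9 if employed else 0.6)
    data.append((employed, coffee, survived))

def survival_rate(drinks_coffee, employed_status=None):
    """Survival rate among coffee drinkers/non-drinkers,
    optionally restricted to one employment stratum."""
    group = [s for e, c, s in data
             if c == drinks_coffee
             and (employed_status is None or e == employed_status)]
    return sum(group) / len(group)

# Crude comparison: confounded, shows a sizeable gap.
print("crude gap:     ", survival_rate(True) - survival_rate(False))
# Stratify by employment ("control" for it): the gap disappears.
print("employed gap:  ", survival_rate(True, True) - survival_rate(False, True))
print("unemployed gap:", survival_rate(True, False) - survival_rate(False, False))
```

The only reason to stratify or adjust like this is to strip away the confounder’s contribution and leave the effect of coffee itself, which is precisely the move that reveals a causal question behind the analysis.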

2) The authors implied causality in their language, even without technically saying so

The authors propose several mechanistic theories for why the association was found, including “reduced inflammation, improved insulin sensitivity, and effects on liver enzyme levels and endothelial function.” Each of those theories implies a causal effect. When interpreting their results, they state that “coffee drinking can be a part of a healthy diet.” Again, that is a conclusion that is only relevant if they were estimating the causal effect of coffee on health, which their study cannot do. How can you say whether coffee is ok to drink if you didn’t tell me anything about the effect of drinking coffee?

3) Alternative purposes of this study are implausible or meaningless

Effect modification by genetics

The stated purpose of the study and its contribution to the literature is the role of genetics in regulating the impact of coffee on mortality. The problem here, again, is that in order to determine the impact of genetics on regulating the effect of coffee on mortality, you first have to have isolated the effect of coffee on mortality. You cannot have “effect modification” without first having an “effect.” That’s a shame, because it is totally plausible that there was some neat genetics science in this study that we aren’t qualified to talk about.

Contribution to a greater literature

In general, we should ignore individual studies and look at the consensus of evidence built up by many studies. However, there are literally hundreds of studies about coffee and mortality, almost all of which commit the exact same errors with regard to causation. One more study that is wrong for the same reason that all the other studies are wrong contributes nearly nothing. The authors may be contributing to the genetics literature, but this study does not add any meaningful evidence to the question of whether or not I should have another coffee.

4) Duh.

Studying whether coffee is linked to mortality is inherently a causal question. To pretend otherwise is like a batter missing a swing, and then claiming they didn’t want to hit the ball anyway. Just by conducting this study, the authors imply a causal effect, but as we already noted, this kind of study is not useful for causal inference. This specific issue is unfortunately common for studies in our media feeds, and was one of the reasons we did the CLAIMS study in the first place. We contend that researchers need to be upfront about the fact that they want to estimate causal effects, and to then consider whether or not it is reasonable to do so for the exposures and outcomes they are considering.

We also cannot stress enough a more general point: the authors of this study and the peer review process made a lot of mistakes, but this study does not represent all of academic research. It is a shame that studies like these make the top headlines time and time again instead of the excellent work done elsewhere.

Can’t we please just accept coffee (and wine and chocolate) for what it is: delicious?

Thoughts and comments welcome