Count the covariates: A proposed simple test for research consumers

Noah Haber
Trying to determine whether a study shows causal effects is difficult and time-consuming. Most of us don’t have that kind of time or training (yes, that includes almost all medical professionals too). I have an idea for a simple test that anyone can apply to any article linking some X to some health Y, and I want to hear your thoughts: count the covariates.

TL;DR: You may be able to get a decent idea of whether or not the study you just saw on social media linking some X to some Y shows a causal relationship by counting the number of covariates needed for the main analysis. The fewer variables controlled for, the more likely the study is to be interpretable as strong causal inference. The more covariates, the more likely it is to be misleading.

A few important caveats: 1) THIS IS CURRENTLY UNTESTED, but we are working on formally testing a pilot of the idea; 2) it will certainly be imperfect, but it might be a good guideline; 3) it probably only works for studies shared on social media; and 4) it is intended for people who don’t have graduate degrees in epidemiology, econometrics, biostatistics, etc., but the more you know, the better.

Why it could work:

The key intuition here is twofold. A study that is “controlling” for a lot of variables 1) is usually trying to isolate a causal effect, regardless of the language used; but 2) can’t.

Let’s see why this might work, using that coffee study from last week as an example.

Controlling for a lot of variables implies estimating causal effects

The logic comes down to what it means to “control” for something. For example, smoking. The reason the authors control for smoking is that smoking messes with their estimate of the effect of coffee on mortality. People who drink more coffee are also more likely to smoke, and smoking is bad for you. One reason, then, that people who drink more coffee might have different life expectancies is that they are more likely to die earlier from smoking. So it makes sense to “control” for smoking then, right?

It does make sense, if you are trying to isolate the effect of drinking coffee on mortality. If you don’t care about that causal effect, and have some other reason to want to know the association, you generally don’t need or want to control for other variables. The more variables you control for, the less plausible it is that you are doing anything other than estimating a causal effect.
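To make the smoking example concrete, here is a toy simulation (all numbers invented for illustration; nothing here comes from the actual coffee study). Smoking drives both coffee drinking and mortality, while coffee itself has no true effect, so the unadjusted association is pure confounding and disappears once smoking is controlled for:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Made-up data-generating process: smoking raises both coffee
# consumption and mortality risk; coffee has NO true effect.
smoking = rng.binomial(1, 0.3, n)
coffee = 1.5 * smoking + rng.normal(0, 1, n)      # cups/day (centered)
mortality = 2.0 * smoking + rng.normal(0, 1, n)   # mortality risk score

# Naive regression of mortality on coffee alone: confounded
naive_slope = np.polyfit(coffee, mortality, 1)[0]

# "Controlled" regression: include smoking as a covariate
X = np.column_stack([coffee, smoking, np.ones(n)])
adj_slope = np.linalg.lstsq(X, mortality, rcond=None)[0][0]

print(f"naive coffee 'effect':    {naive_slope:.3f}")   # well above zero
print(f"adjusted coffee 'effect': {adj_slope:.3f}")     # near the true zero
```

The point of the sketch is the one the paragraph makes: the only reason to bother with the adjusted version is that you care about the causal effect of coffee, not the raw association.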

Controlling for a lot of variables implies inadequate methods to estimate a causal effect

Some research strategies get you great causal effect estimation without having to control for much of anything at all, such as randomized controlled trials, “natural experiments,” and many kinds of observational data analysis methods in the right scenarios. You can’t always do this successfully. Sometimes, you have to control or adjust for alternative explanations.

The problem is when you have to control for a LOT of alternative explanations. That generally means there was no “cleaner” way to do the study that didn’t require controlling for so many variables. It also means there are probably a thousand other variables they didn’t control for, or didn’t even have data on to start with. It only takes one uncontrolled-for factor to ruin the effect estimate, and there are too many to count. There are also some slightly weirder statistical issues when you imperfectly control for something, and that is more likely to happen when you are controlling for a lot of stuff.
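Those “slightly weirder statistical issues” from imperfect control can also be sketched in a few lines. In this toy example (invented numbers again), the exposure has no true effect, but we only observe a noisy version of the confounder, say self-reported rather than actual behavior. Adjusting for the noisy measurement removes only part of the bias:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical setup: one confounder drives both exposure and outcome,
# but we only observe a NOISY measurement of it.
confounder = rng.normal(0, 1, n)
exposure = confounder + rng.normal(0, 1, n)
outcome = 2.0 * confounder + rng.normal(0, 1, n)   # true exposure effect = 0
measured = confounder + rng.normal(0, 1, n)        # confounder + error

def slope_on_exposure(covariates):
    """OLS coefficient on exposure, adjusting for the given covariates."""
    X = np.column_stack([exposure, *covariates, np.ones(n)])
    return np.linalg.lstsq(X, outcome, rcond=None)[0][0]

print(slope_on_exposure([]))            # no control: fully confounded
print(slope_on_exposure([measured]))    # imperfect control: still biased
print(slope_on_exposure([confounder]))  # perfect control: near zero
```

With one confounder the residual bias is visible and bounded; with dozens of imperfectly measured covariates, each contributes its own leftover bias, which is part of why heavily controlled analyses are hard to trust.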

In that coffee study, the authors controlled for the kitchen sink. However, coffee is related to basically everything we do. People from different cultural backgrounds have different coffee drinking habits. People with different kinds of jobs drink coffee differently. Fitness. Geographic region. Genetics. Social attitudes. You name it, and it is related to coffee. That’s not a problem by itself. What IS a problem is that all of those things ALSO impact how long you are going to live. If you have to control for everything, you can’t.

Count the covariates

To review: controlling for a lot of variables implies that you are looking for a causal effect, but ALSO implies that there is more that needed to be controlled for to actually have estimated a causal effect. See the catch-22?

We can also take a look at causal language here as well. Studies are often considered acceptable in scientific circles (i.e. peer review in journals) as long as they use “technically correct” language with regard to causality. We think that is seriously misleading, but that doesn’t stop those studies from hitting our newsfeeds.

The most likely scenario for most people seeing a study that uses strong causal language and controls for very little is that it’s one of those studies that actually can estimate causality, such as most randomized controlled trials. On the other hand, a study that uses weak causal language and controls for very little probably isn’t actually trying to estimate a causal effect, and our proposed rule doesn’t really say much about whether or not these studies are misleading.

We can also look at the language used, where studies may use stronger (effect/impact/cause) or weaker (association/correlation/link) causal language. It’s also worth considering how the authors state their evidence can be used, as that can also imply that their results are causal. The kinds of studies that control for a lot of variables and say so plainly are a strange bunch and unlikely to be seen in your social media news feed. This rule just doesn’t work as well for them, but most people are unlikely to see them anyway, so the rule is still mostly ok.

Important considerations and discussion

Multiple specifications can make this hard to deal with. In the phrase “number of covariates required for the main analysis,” there are two tricky words: required and main. Most studies have several ways of going at the same problem, and it’s difficult to determine which one is the “main” one. It is common for a study to have both a “controlled” and an “uncontrolled” version, which may or may not produce very different numbers. If the numbers don’t change much between those two versions (or, even better, you have the background to know what is required and what is not), controlling for those variables probably wasn’t “required,” so they may not need to count. Notably, the coffee study we keep talking about doesn’t do anything of the kind. All plausible main analyses are heavily controlled, and as such would fail any version and interpretation of this test.

There is probably a paradox that occurs here (credit to Alex Breskin for pointing this out). In the case of multiple studies on the same topic using roughly the same methods, observational studies controlling for more covariates probably do better with regard to causal inference. But because we are not selecting among studies in that way, and we are intending this as a guideline for ALL studies on social media, the opposite may be true.

It is also worth noting that this may end up being mostly indistinguishable from RCT vs. everything else, which is not the intent.

There are also some sets of methods which do require moderate numbers of covariates to work, and occasionally these articles appear in our news feeds. One example from Ellen Moscoe is difference-in-difference studies for causal effects of policies. These typically need controls for time and place, which is at minimum two covariates.
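As a sketch of why a difference-in-difference design can get by with just those two covariates, here is a toy policy example with entirely invented numbers. The treated place and the later time period each shift the outcome on their own, but differencing both out recovers the policy effect:

```python
# Toy difference-in-differences (all numbers invented):
# a true policy effect of 3.0 applies only to the treated place
# after the policy starts; place and time have their own effects.
baseline = 10.0
place_effect, time_effect, policy_effect = 5.0, 2.0, 3.0

control_before = baseline
control_after = baseline + time_effect
treated_before = baseline + place_effect
treated_after = baseline + place_effect + time_effect + policy_effect

# DiD: subtract the control group's change (the shared time trend)
# from the treated group's change. Place differences cancel within
# each group; time differences cancel across groups.
did = (treated_after - treated_before) - (control_after - control_before)
print(did)  # recovers the 3.0 policy effect
```

Of course, this clean cancellation relies on the two groups sharing the same time trend, which is its own assumption to scrutinize.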

We also just don’t know if this idea actually works. But it might, and we can test it.


Any thoughts on why this might fail? Alternative proposed tests? Let us know in the comments or get in touch!

Thoughts and comments welcome