If the principal objective of public health research is to generate knowledge that can be used to improve the health and well-being of human lives, would we be better off in a counterfactual universe in which we increased our focus on robust causal inference? Causality is built into the words used to define quantitative health research, including epidemiology, biostatistics, econometrics, and everything in between. When we try to identify the determinants of disease, we are studying “any factor, whether event, characteristic, or other definable entity, that brings about a change in a health condition or other defined characteristic,”1 (emphasis added by the authors of this protocol). Health practitioners, policy makers, and other consumers of research make decisions based on their understanding of the evidence generated by research scientists. When there is discordance between the perceived and the actual strength and relevance of the causal evidence to their work, decision makers are unable to make optimal, data-driven choices.
Population health research is inherently complex, requiring a strong methodological background to tease out how and what inference can be drawn from published evidence. Even those with the requisite training often disagree over the interpretation of results. For example, there has been a well-publicized debate over a meta-analysis of the health benefits, or lack thereof, of certain fatty acids, which has generated numerous public criticisms and counter-criticisms of the findings.2 More recently, there has been a forceful debate between teams led by Dr. Brian Nosek and Dr. Gary King over whether the poor replicability of experimental psychology trials is meaningful, both statistically and practically. These high-profile debates concern results from randomized trials, which are typically simpler and more causally robust than the more common observational studies prevalent in both traditional and social media, though they may have limited external validity. If statistical methodology experts disagree about, or outright reject, the causal validity of studies entering the public sphere, how can we expect health decision makers at all levels to appropriately weigh and disentangle the complexities of causal evidence? Even physicians, who are trained to practice evidence-based medicine, may not be trained in the science of the statistical evidence itself.3
A variety of institutions, including peer review systems for scientific journals, academic career incentives, and independent organizations (e.g. Cochrane), serve to promote the best scientific evidence by consolidating large bodies of literature and distributing the findings to decision makers. At the same time, there are incentives that may undermine the ability of these institutions to promote the best science. These lie at the heart of the formal and informal institutions of academia, where researchers and organizations are incentivized to maximize both the volume and the public impact of publications in order to further careers, generate funding, and garner prestige. These incentives include the inter-related emphases on publication volume, impact factor as a function of wide citation and readership, finding statistically significant results, and the failure to pursue and publish statistically insignificant findings. Furthermore, the issues outlined above may be intensified by emergent changes in internet-distributed media.
The proliferation of electronic media, both social and traditional, is rapidly changing the way science is consumed and produced by linking evidence more directly to the research-consuming public. Audience size is a strong driving force in internet-distributed media, which in turn incentivizes media organizations to overstate and mistranslate scientific findings and to select studies that are eye-catching rather than causally robust. If the factors that drive popularity among research consumers are only weakly related to the strength of the evidence readers are consuming, there may be an incentive to promote weak evidence. Internet-distributed media and the promotion of open access to scientific journals, while valuable in their own right, also tie research much more directly to the consumer. We could hypothesize a feedback loop in which research production itself is affected by the prospect of popular readership, through social promotion and financial incentives, particularly if academic and extra-academic funding and career advancement are themselves influenced by popular readership. In that case, the same perverse incentive exists at every level of the content production chain, including researchers, academic institutions, scientific journals, and the media.
While there is no robust causal evidence on the degree, if any, to which these incentives actually affect which studies are pursued and published in academia and shared in the media, there is suggestive anecdotal evidence that this may be the case. A recent study found that over 200,000 patients in the UK temporarily ceased taking statins, attributable to press coverage of two studies reporting high adverse event rates for statins; both studies later retracted those statements due to inadequate methodological rigor.4 In 2015 alone, 30 meta-analyses indexed in PubMed contained the word “coffee” in the title, to say nothing of individual studies. It is implausible that the volume of publication on coffee is related only to its scientific and public health value, and not influenced by popular readership. These issues persist despite vigorous discussion of the biases and failures of academic statistical research.5
This study is a first step towards characterizing the strength of causal evidence in the health research produced by academia and disseminated to the general public through the popular media. We first systematically identify widely shared media articles pertaining to single studies with a health outcome, using Facebook and Twitter social media sharing statistics generated from the NewsWhip Insights platform. We then systematically review the scientific journal research articles mentioned in the news stories for strength of causal inference and the appropriateness of the causal language used in both the original scientific study and the media article(s) reporting on that research.
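To make the article-selection step concrete, the following is a minimal sketch of how widely shared media articles could be ranked from an exported table of sharing statistics. The file name, column names, and sample size used here are illustrative assumptions, not the actual NewsWhip Insights export schema or API.

```python
# Minimal sketch of the selection step described above.
# Assumes a hypothetical CSV export with columns "url", "headline",
# "facebook_shares", and "twitter_shares" (illustrative names only).
import csv

def top_shared_articles(path, n=50):
    """Rank media articles by combined Facebook and Twitter share counts."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    for row in rows:
        # Treat missing or empty share counts as zero.
        row["total_shares"] = (int(row.get("facebook_shares") or 0)
                               + int(row.get("twitter_shares") or 0))
    rows.sort(key=lambda r: r["total_shares"], reverse=True)
    return rows[:n]

if __name__ == "__main__":
    for article in top_shared_articles("newswhip_export.csv", n=10):
        print(article["total_shares"], article["headline"], article["url"])
```

A ranked list of this kind would then be restricted to articles pertaining to a single study with a health outcome, as described above, before the systematic review of causal inference and causal language.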