Chat: Trial by Virus

This was an organized chat about how clinical trials are impacting and are impacted by COVID-19. The original chat took place on April 15, 2020. Since then, several major trials have partially or fully published results. The transcript below is revised and edited from the original for clarity.

Andrew (professor, clinical trial statistician, at University of Pittsburgh): Let’s get ready to rummmmmmmble

Noah (postdoc in meta-science, METRICS): We are starting off the day with the space jam theme song. Not bad. Where’s Darren?

Mollie (postdoc in repro/perinatal pharmacoepi): He’s probably still fighting about random confounding on Twitter.

Darren (epidemiologist and statistician, University College Cork), several minutes later: Wut up.

Noah: Here we go! As is tradition, we start off with an unfair, possibly meaningless question: Given all the clinical trial evidence we have seen to date, how well does any combination of hydroxychloroquine, remdesivir, azithromycin, or any other pharmaceutical work for treating COVID-19? The scale goes from 0 (does absolutely nothing) to 100 (instant cure, COVID-19 solved). Everyone must answer.

Andrew: If I have to give a number, 14.

Darren: Is this scale called a “probability”?

Mollie: Yeesh. Is 50 equipoise?

Darren: Yeah, do over.

Sarah (postdoc and clinical ethicist, philosophy, METRICS): Mutiny on question 1.

Mollie: We are all Spartacus.

Sarah: So why are we all so uncomfortable answering?

Andrew: I think that reaction, itself, kind of sets the tone for this: the whole point is that to date the evidence accumulated for or against any therapy is generally low quality and scattershot.

Darren: Based on existing evidence…7

Noah: What part of “possibly meaningless” was unclear :)?

Emily (professor at the George Washington University, epidemiologist): Yes, it’s a meaningless question. I’ll go with a 10.

Andrew: We’re four months into this, with hundreds of thousands of confirmed cases of the disease, and minimal quality evidence to tell whether any of it works. Inasmuch as we can ever know that something “works.”

Darren: But it could be 68 six weeks from now.

Noah: Slightly revised question then: What is your projection for what your answer will be two weeks from now, within what you would call reasonable bounds (e.g. 10-45)?

Emily: 8-10

Darren: 6-8

Sarah: 10-20

Andrew: Two weeks from now? Still 14. Two months from now, maybe we’ll have meaningful movement. Oh, wait, bounds like a “confidence interval” – uh, 11-17.

Mollie: 10-20

Emily: Given the short timeline of two weeks, we’re unlikely to have much additional (high quality) information. The timeline for clinical trial readouts is unfortunately a bit longer than this.

Noah: Well, since no one else is gonna do it, I will: -10 to 40

Mollie: No fair!

Sarah: What am I supposed to make of -10?

Noah: Worse than useless. -10 could happen if no therapies are found to be at all effective for COVID-19, and have at least some negative side effects. So all net negative is totally plausible. <editor’s note: since this chat was recorded on 4/15, early results have been mixed, with some suggesting net harm and some unconfirmed hints of positive results.>

Darren: So not a probability then ok

Noah: Some of us are fine with probability estimates less than 0 <wink>

Andrew: Oh I see it came from the linear probability model.

Mollie: There’s also individual-level good/harm and population-level good/harm.

Noah: Right! But before we dive into that, what even is a trial?

Sarah: A risk that human subjects take on in order to produce a social good.

Andrew: NIH defines a clinical trial as “A research study in which one or more human subjects are prospectively assigned to one or more interventions (which may include placebo or other control) to evaluate the effects of those interventions on health-related biomedical or behavioral outcomes”

Emily: That’s great. And I like to think of a trial in 4 parts — PICO. Population. Intervention. Comparison group. Outcomes.

Darren: On the twitter I defined trials as a tool for changing people’s minds. The key is that the trial must be trustworthy, so we don’t need to rely on trust in people.

Emily: I would argue that ideally we (clinical trialists) don’t start with something that we want people to do. But rather we start with something we aren’t sure if people should do. Of course we start with a reasonable hypothesis/evidence about why it might be a good idea to do something/use this drug. But we aren’t sure. However, I 100% agree with your point that trials must be trustworthy and transparent.

Noah: So is this what a trial is SUPPOSED to do, vs. how people actually do them?

Andrew: Supposed to do: improve our knowledge of whether the benefits of that drug/device/intervention outweigh the harms relative to some alternative course of action. And, in large part, I think that generally is what they do. But, there are some structural issues that can make them ill equipped to respond quickly to something like a pandemic situation. 

Noah: Which brings us to our current moment. Some scene setting: When I checked last night, Cochrane’s registry had 568 studies listed as “interventional studies” related to COVID-19. We’ve had a few trials, at least a few of which (all of questionable quality) have made their way into the mainstream. The biggest and most elephant-in-the-roomiest: the Raoult study.

Darren: Not a trial.

Noah: Damn, these takes are getting served up hot.

Emily: 26 patients got hydroxychloroquine (HCQ) (6 also got azithromycin because their doctor suspected infection). 6 of 26 were excluded from analysis because they were transferred to ICU, died, or stopped treatment. Ultimately, 20 treated patients were compared to 16 patients who refused treatment or were in another facility. The study concludes that subjects treated with HCQ tested negative for SARS-CoV-2 sooner than the 16 people who refused treatment or were at a different facility. No other clinical data was included.

Darren: A group of people were selected through unknown means, and given something. Their outcomes were compared to another group of people, also mysteriously selected. Outcomes in the 2 groups were compared. Then they did 9 other wrong things, and we were all rolling our eyes about the study. Then the President of the United States tweeted about it and here we are.

Noah: So one of the key things: it wasn’t randomized. Why is that so, so important here?

Andrew: Randomization ensures unbiased allocation to treatments, minimizing the risk that “sicker” or “healthier” patients are preferentially assigned to one treatment, which would give the appearance that it is better than the other.

Darren: It’s really the only way to feel confident you are comparing like to like

Andrew: Alongside randomization lives the importance of “concurrent controls” – meaning that the people in one group are also in the same hospitals, season, etc versus the other group – so any observed differences are more likely due to the treatment and less likely to be explained by those other factors. What do you call them again? Starts with a “c”

Darren: Colliders

Noah: Cookies. <editor’s note: the correct answer is confounders>

Mollie: One of the Very Special Pharmacoepidemiology terms is “channeling bias”, which occurs when physicians preferentially assign or avoid treatments based on the prognosis of the patient in front of them.

Darren: Nice! I’m adding that to Noise Mining as go-to phrases
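
Editor’s note: a minimal simulation sketch of the channeling bias Mollie describes, next to the randomized allocation Andrew describes. Everything in it is a made-up assumption for illustration: the function name `apparent_effect`, the severity scale, and the recovery probabilities are hypothetical, and the simulated drug is deliberately given zero true effect. None of the numbers come from any real study.

```python
import random


def apparent_effect(n_patients, randomized, seed=0):
    """Difference in recovery rate (treated minus control) for a drug
    that, by construction, does nothing at all."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_patients):
        # Hypothetical severity score: 0 = very healthy, 1 = very sick.
        severity = rng.random()
        if randomized:
            # Coin flip: allocation ignores how sick the patient is.
            gets_drug = rng.random() < 0.5
        else:
            # Channeling: healthier patients are more likely to get the drug.
            gets_drug = rng.random() < (1.0 - severity)
        # Recovery depends only on severity, never on the drug.
        recovered = rng.random() < (1.0 - 0.8 * severity)
        (treated if gets_drug else control).append(recovered)
    return sum(treated) / len(treated) - sum(control) / len(control)


if __name__ == "__main__":
    print("randomized:", round(apparent_effect(20000, randomized=True), 3))
    print("channeled: ", round(apparent_effect(20000, randomized=False), 3))
```

Under these assumptions the channeled comparison should show a sizeable "benefit" for a useless drug, while the randomized comparison should sit near zero. That is the like-to-like point in miniature.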

Sarah: So Raoult study = worse than useless?

Emily: Yes.

Darren: As it turned out? Very much so it seems.

Mollie: Actively harmful, I’d say.

Sarah: Without randomizing, is there any way they could have at least been useful? Or does it all hang on the failure to randomize?

Darren: This is where trust comes in.

Emily: Yes, and I’ll add one more thing to the list of problems — people who died or went to the ICU were excluded from the final analysis. And this leaves me wondering if HCQ was actually harmful. But we can’t know for sure given the way the information was presented.

Mollie: Also important to note that all the COVID attention to HCQ has made it difficult for patients using it for other conditions (e.g., lupus) to get treatment.

Darren: The PI was advocating HCQ months before. It is a treatment he has long advocated for all kinds of maladies.

Noah: Holding aside issues with the participants in that study themselves, isn’t having some study better than no study?

Andrew: Whether “something is better than nothing” is a big picture question that hangs on some of the other issues here. As noted, it’s created huge demand for the drug based on extremely shoddy evidence, in some cases preventing people who take this drug regularly (with proven benefit for their condition!) from getting their medication.

Emily: No – something (HCQ) isn’t better than nothing – when we don’t know about the safety of the ‘something’ (HCQ).

Andrew: If this study was being treated as “Hey this might work, enroll your patients in trials so we can generate better quality evidence” – it would arguably be positive. But it’s become a polarizing (and politicized) issue where people have swallowed the weak evidence and started openly advocating that all patients with COVID be treated with this therapy. So, that’s how in some cases I’m not sure something is better than nothing in terms of evidence.

Noah: What about for the patients that participated?

Sarah: Yes! Thank you for bringing this up. Health Care Systems reallocating this drug from patients who have been taking it for years in order to use it in COVID trials on the basis of this study was very very concerning.

Andrew: The net effect of the Raoult study (globally) has been taking a drug away from patients who are known to benefit from it so it will be available for patients that may or may not benefit from it.

Emily: They may or may not benefit from it. And they may be harmed from it in fact!

Andrew: And to toss a grenade out there that gets very hot very fast, some very smart physicians on Twitter have basically said “We don’t even have enough of this drug to treat everyone anyway; at least if we randomize while the supply chain is short, we’ll know if it works”

Noah: Alright, clearly more study is not always better. And a bad study can be (and clearly was in this case) worse than nothing. What do we make about the MASSIVE number of new trials that are happening right now?

Darren: It’s good, but they aren’t nearly well coordinated enough

Sarah: I am worried about duplication of efforts and lack of coordination.

Andrew: A reflection of the bizarre structural inefficiencies built into our entire research infrastructure, which then get exacerbated during a pandemic when everyone feels like they have to do something.

Emily: There are definitely organizations (like WHO, the Gates Foundation, COVID therapeutics accelerator, etc.) that are coordinating some of these efforts to make sure they are complementary and informative. But there are many more that are not coordinated. And coordination is key to make sure that studies are harmonized across each part of the PICO. That will allow us to learn faster. To learn better. <editor: harder, stronger>

Sarah: Some institutions have many more proposed trials than relevant participants, which has led to some interesting conversations about allocation which is not about vents for a change.

Andrew: I have heard of a few trials where they are even planning to have the Data and Safety Monitoring Boards – which review interim results from a trial and determine whether the trial should continue or stop based on the accrued data – from separate trials talk to one another, which I believe is quite uncommon.

Noah: I’ve skimmed through the registries, and had a few big takeaways: the vast majority are tiny, single center studies, and extremely few are even measuring the same outcomes.

Andrew: Noah, your point is bang on. Tons of “trials” being started, the vast majority of which are likely to be too small and too slow to be informative.
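
Editor’s note: a rough sketch of why "too small" matters, using a textbook normal approximation for the power of a two-arm comparison of proportions. The event rates (20% vs 15%) and the sample sizes are hypothetical numbers picked for illustration, and `approx_power` is a name invented here. A real trial's sample size calculation would be more careful than this.

```python
from statistics import NormalDist


def approx_power(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions,
    using the normal approximation with unpooled variance."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference in observed proportions.
    se = ((p_control * (1 - p_control) + p_treated * (1 - p_treated))
          / n_per_arm) ** 0.5
    # Chance that the observed difference clears the significance threshold
    # (ignoring the negligible chance of clearing it in the wrong direction).
    return NormalDist().cdf(abs(p_control - p_treated) / se - z_crit)


if __name__ == "__main__":
    for n in (50, 500, 2000):
        print(n, "per arm:", round(approx_power(0.20, 0.15, n), 2))
```

With these assumed rates, a trial with 50 patients per arm would usually miss a real five-point difference, while one with 2000 per arm would usually catch it.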

Darren: Yet I’ve already spotted a systematic review, lolz!

Noah: There are meta-analyses in the works for sure, maybe even published by now?

Darren: Also…a lot of the registered Chinese trials have stopped recruiting, for good reason of course: too few patients to recruit. So they had what, a 2 or 3 month window.

Andrew: Funny, when the pandemic dies out in your part of the world, no more cases means a trial that doesn’t finish…which brings us to a potentially interesting tangent: I have seen some folks on Twitter argue that COVID trials need to be finished in <12 months to mean anything because of the likelihood that we’ll have a vaccine in approximately that time frame.

Noah: Also worth noting that a huge proportion of the trials are from hospitals, not necessarily research centers.

Mollie: It seems like there’s a prevailing sense that we should be able to relax certain strict rules for trials because this is an emergency. But a lot of those rules, like the ethics of who is included/eligible, seem to be perceived as a lot more malleable than others.

Sarah: Relaxation of standards is the wrong move. An emergency is exactly the time for high standards. Sloppy studies can be more harmful than nothing.

Andrew: Building from this, an important question is which components of clinical research machinery that we perceive as “red tape” in normal circumstances can be meaningfully expedited without compromising standards. One good example, I think, would be whether we can have a more streamlined process for broad-spectrum approvals across the nation in a pandemic situation rather than requiring every individual investigator starting a trial to get their own regulatory approval. Or, making the Institutional Review Board process more efficient for multicenter trials.

Mollie: My gut reaction on the speed of review is “yikes.” After all this, I don’t think prioritizing IRB speed, absent an emergency, is necessary, and we shouldn’t expect it.

Darren: Anyone arguing to lower standards doesn’t deserve our trust. That’s why we have standards. Trustworthy studies, so we don’t have to trust people…

Andrew: Most trials that are led by academics take a lot of time. We have to apply for funding (often from the federal government or foundations), wait a couple months for the grant to get reviewed, then if it gets funded, we bring the team together and actually start building the infrastructure to do the thing. It takes months (if not years!) to get a study started during normal working conditions. For trials to start rapidly, a lot of the infrastructure has to be in place – existing networks of sites, databases ready to capture the relevant information, framework to enroll and randomize patients. When a trial has to be done in 3 months to mean something, that can only happen if a lot of the framework already exists or if barriers can be removed to allow things to happen very rapidly due to the pandemic.

Mollie: I think this leads into some important points about: who is included in trials (generally, and specifically for COVID), and who is not included. First: institutions with existing trials infrastructure will get up and running faster, and they’ll treat the patient pool they have (usually). This means wealthier institutions treating wealthier and often healthier patients.

Sarah: YES. There seems to be a mis-match between places who have the institutional support and places that have the patients.

Mollie: For example: last I looked, Massachusetts General Hospital and Brigham and Women’s (private, Harvard-affiliated) in Boston both had COVID trials, but Boston Medical Center (Boston’s public safety-net hospital) did not. Especially considering that this pandemic is hitting poor people, and black and brown communities much, much harder, this is a big deal.

Sarah: So are there instances from the global pandemic trial world where work was done to get the institutional support and the patients together?

Andrew: Fantastic points, Sarah & Mollie. It can be challenging for the so-called “community” hospitals (not affiliated with academic medical centers) to get involved in research studies. This comes back a bit to the structural/incentive problem: faculty in academic centers who commonly lead trials (who are generally kind-hearted people fighting to do good! Not blaming them for this) may not have connections with the community hospitals; or if they do, there are often hurdles to overcome to get them started as a trial site.

Mollie: These institutional structural inequities in trial location could exacerbate what are already alarming differences in outcome for different patient pools. I also want to note that as of right now, pregnant women will be ineligible for trials, including vaccine trials, unless specific exemptions are made. Same for prisoners, possibly same for other vulnerable groups.

Sarah: ^This is ethically concerning. In the world pre-COVID I thought we were making progress on explaining why efforts to avoid including pregnant women in trials actually contributed to harms to pregnant women because it resulted in a harmful lack of information about pregnant women.

Noah: Are we just moving too much, too fast?

Darren: We aren’t prepared to go this fast. And too many would rather be first than be correct.

Sarah: I think there is a really strong impulse that needs to be fought here. Everyone wants to do something. And not everyone can be the ones leading THE trial.

Mollie: Someone brought up the incentives in academia and how they might (ha) be damaging here. Darren?

Darren: What’s that quote? ~ It’s amazing what you can accomplish when nobody cares about who gets the credit.

Andrew: I think this comes back a bit to the overall incentive/structure for much of academic medicine. Ideally, for crisis-level situations that require a large, coordinated response, there would be an existing framework ready to implement one or more large-scale trials more or less “ready to be activated” when called upon. But that doesn’t exist.

Noah: But we didn’t have that set up.

Andrew: No. We all work at separate universities, apply for competitive grants, and as Sarah just pointed out, not everyone can be the ones leading THE big and glamorous trial. So instead of hundreds of people signing up to be part of 5 really big trials, we have 568 small trials.

Emily: We need coordinated plans for how to do trials in a pandemic, and we need these plans in place before the pandemic. WHO has done exactly this with the R&D blueprint established in 2016. It seems (from outside view) this has mostly been applied to the COVID vaccine trials. The COVID-19 Therapeutics Accelerator (funded/coordinated by The Gates Foundation, Wellcome Trust, and Mastercard) is playing a similar role in the therapeutics trials. All of this to say that I’m more optimistic than everyone else that we will get useful information faster than usual (although we’re still talking in terms of months, not weeks).

Darren: In the UK, they had REMAP-CAP up pretty fast.

Andrew: REMAP-CAP could be a whole chat of its own, but it’s important to discuss briefly here for “what can we learn about trials” I think. There is a large network of clinical trials units.

Noah: Yup. Things like REMAP-CAP have been proposed before, but maybe the idea behind it is more important now than ever. Give a go at explaining?

Andrew: It’s a bit complicated, but the idea is the entire trial is embedded in routine clinical care in a highly adaptable way. So all the data collection and treatment decisions are part of the clinical process, including the randomization of treatment decisions to get the benefits of RCTs.

Noah: The super interesting part to me is the adaptation. It’s not a “normal” trial; the randomization probabilities themselves change as more data come in.

Andrew: Right. The randomization chances change to allocate patients to arms that appear to be performing better as the trial goes on. And it’s multifactorial, so it’s not just one decision (i.e. a drug), but lots of decisions all at once, so you get some “bang for your buck” instead of setting up a whole big trial system to answer just one question.
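
Editor’s note: a toy sketch of the "randomization chances change" idea, using Thompson sampling with Beta posteriors. This is a standard textbook technique swapped in for illustration, not REMAP-CAP's actual design. The arm success rates, the function name `run_adaptive_trial`, and the patient-by-patient updating are all assumptions made here. Real adaptive platform trials use more elaborate models and typically update at scheduled interim analyses.

```python
import random


def run_adaptive_trial(true_success_rates, n_patients, seed=0):
    """Allocate patients one at a time, favoring arms that look better so far.
    Returns the number of patients allocated to each arm."""
    rng = random.Random(seed)
    n_arms = len(true_success_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(n_patients):
        # Draw one plausible success rate per arm from its Beta posterior
        # (a uniform Beta(1, 1) prior is assumed for every arm).
        draws = [rng.betavariate(1 + successes[a], 1 + failures[a])
                 for a in range(n_arms)]
        # Assign the patient to the arm with the highest draw. Allocation is
        # still random, but the odds shift toward arms doing well so far.
        arm = draws.index(max(draws))
        # Simulate the outcome from the (hypothetical) true success rate.
        if rng.random() < true_success_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    # Hypothetical arms: one clearly worse, one clearly better.
    print(run_adaptive_trial([0.2, 0.8], n_patients=1000))
```

In a run like this, most patients should end up in the better arm as evidence accumulates, which is the appeal. The trade-off is that the analysis and the stopping rules get more complicated than in a fixed 50/50 design.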

Emily: It’s also not just one kind of treatment; there are several domains of treatment, including antivirals, corticosteroid strategies, immune modulation strategies, and more being added.

Noah: It’s a type of trial that has been experimented with in the last few years, but is a totally different way to do a trial than is normally done. I would never have thought we’d see something like this set up so quickly.

Sarah: Sounds like this has benefits ethically in that it avoids research waste and has these critical thresholds for stopping randomization built in beforehand. It may be difficult to untangle obtaining informed consent for participation in such a trial, given so many possibilities for the treatment of each individual patient.

Noah: We are doing lots of treatment trials, but what do we not know right now that trials won’t be able to help us with?

Darren: Immunity…case projections…masks…social inequities…

Andrew: Trials will be able to help us with “Does hydroxychloroquine / azithromycin / remdesivir do any damn good at all” – just about anything else, trials probably won’t be able to help with. Do masks work in a community setting to reduce transmission? No trial likely to be done in a meaningful time for this pandemic. When can we go back to work? No trial gonna solve that one. How do we prevent this from hurting the most vulnerable populations? Trials aren’t going to help with that.

Sarah: You all covered it: masks, timeline, inequalities. Some of the questions we most want answers to.

Noah: Alright, so the trials world is totally on its head, just like everything else. To close out, here’s a scenario. You have 5 minutes to speak to any group of people you want about trials as they relate to COVID. Could be anyone (general population, docs, patients, elected officials), but you have 5 minutes. Who do you pick, and what do you tell them?

Darren: I’ll take docs, because that’s what I try to do anyway. I would help them (those that need it, certainly not all) understand the limits of their own understanding of trial methods; to not put stock in a trial just because it’s in Lancet or NEJM; and remind them of all the common medical practices that were later shown to not work in rigorous trials.

Sarah: I pick trialists. Can my magical five minutes be that we all get in on zoom and do priority setting across institutions?

Noah: Only if it has a cool space background

Sarah: I worry that without communication across institutions, we will focus on testing one treatment and have many competing under-powered inconclusive studies of that treatment. I worry that without communication across institutions, we will (as discussed earlier) see that the places with lots of sick patients will not be linked with places with lots of study expertise to work together, since these institutions tend not to be the same. 

Andrew: I’ll speak to elected officials; I will plead with them that this scenario is why we should have a standing National Pandemic Trials Unit that includes a large number of sites who can “opt in” to join the study with an expedited review process and framework of a database that can be activated quickly for a new study. A core of experienced trial investigators will be prepared to lead these trials, pulling in relevant content experts if necessary for a new outbreak. In non-pandemic times, the unit can be used to run trials of, well, other stuff that’s of national health interest.

Darren: Yeah, let’s do that!

Sarah: I donate my 5 minutes to Andrew.

Andrew: Sarah has a good point, of course, which is really what inspires my comment. Of course this would be better if we had a few dozen people hopping into bigger, better coordinated trials rather than 568 (mostly single center) trials

Mollie: Good thing no one is talking about withdrawing funding from WHO, then

Noah: Nope. Nobody would do something so stupid in the middle of a pandemic.

Mollie: I’m trying to decide between the general population and elected officials

Noah: So are the elected officials

Mollie: So I’ll take the teeming masses. And I guess I’d try to talk about how, often, your doctor is good at picking patients who will respond to treatments, but isn’t good at (and isn’t trained in) deciding whether treatments themselves work. And how trials are really the only way we can answer that last question.

Andrew: Since everyone is (justifiably) worried about the economic impacts of a prolonged shutdown for social distancing purposes, you can argue that the ROI on these things would be there if doing effective trials quicker = less need for broad mitigation strategies if we can discover effective therapies faster

Noah: Bring it home, Emily

Emily: I’ll also opt for elected officials. And if I have to get specific, I’ll go for Trump and his advisors. When elected officials advocate for unproven interventions, people get hurt. And we’re seeing that on a major scale here during the COVID-19 pandemic. For example, there was a run on Tylenol/paracetamol after a French health minister declared ibuprofen might be unsafe, and there have been marked increases in chloroquine/hydroxychloroquine prescriptions since officials advocated for this. People have become sick and even died from treatments advocated for without evidence from our top officials. All of this is to say that it’s extremely important for public figures to understand that if we’re doing a clinical trial – it’s because we truly don’t know whether the treatment will work or not. It’s never a good idea to bet on the outcome of a trial, and it’s certainly a bad idea to make public health recommendations before the data comes in.

Noah: And that’s a wrap! Thanks for sticking around, see y’all next time!

Mollie Wood is a postdoctoral researcher in the Department of Epidemiology at the Harvard School of Public Health. She specializes in reproductive and perinatal pharmacoepidemiology, with methods interests in measurement error, mediation, and family-based study designs. Tweets @Anecdatally

Sarah Wieten is the Clinical Ethics Fellow at Stanford Health Care and a postdoctoral researcher at the Stanford Center for Biomedical Ethics.  She specializes in interdisciplinary projects at the intersection of epistemology and ethics in health care. Tweets @SarahWieten.

Emily Smith is an epidemiologist and an Assistant Professor of Global Health at The George Washington University, Milken Institute School of Public Health in Washington D.C. Her research focuses on clinical trials aimed at generating evidence for global public health practice and policy. Tweets @emily_ers

Andrew Althouse is an Assistant Professor at the University of Pittsburgh School of Medicine.  He works principally as a statistician on randomized controlled trials in medicine and health services research.  Tweets @ADAlthousePhD

Darren Dahly is the Principal Statistician of the HRB Clinical Research Facility Cork, and a Senior Lecturer in Research Methods at the University College Cork School of Public Health. He works as an applied statistician, contributing to both epidemiological studies and clinical trials, and is interested in better understanding how we can improve the nature of statistical collaboration across the health sciences. Tweets @statsepi.

Noah Haber is a postdoctoral researcher at the Meta-Research Innovation Center at Stanford University (METRICS), specializing in meta-science from study generation to social media, causal inference econometrics, and applied statistical work in HIV/AIDS in South Africa. He is the lead author of CLAIMS and XvY and blogs here. Tweets @NoahHaber.
