Chat: Should we placebo control in late phase clinical trials?

Welcome to the first of a series of chat posts, in which we get a handful of experts together to chat about important issues in the world of health science and statistics. We’ll be hosting these on a regular basis for the foreseeable future. First up:

Should we placebo control in late phase clinical trials?

<Editor’s note: The transcript below is lightly edited from the original for clarity. Additional edits noted at bottom of page>

Noah Haber (postdoc in HIV/AIDS, causal inference econometrics, meta-science): Let’s kick off with an unfair, possibly meaningless question: In an ideal world, in a frictionless plane in a vacuum, on a scale of 0 (never placebo) to 100 (always placebo), should late phase clinical trials control the treatment of interest against a placebo?
Everyone must give a number!

Mollie Wood (postdoc in repro/perinatal pharmacoepi): Can we assume participants are spheres?

Noah: Spheres of infinitely small size

Andrew Althouse (professor at University of Pittsburgh, clinical trial statistician): I’ll toss 80 out there.

Boback Ziaeian (professor at UCLA in the Division of Cardiology): 90

Emily R Smith (researcher at the Gates Foundation and HSPH in global health, MNCH, and nutrition): 90 (Assuming a placebo is ethical / appropriate)

Mollie: also, 80

Noah: Looks like I get to play devil’s advocate today!

Andrew: Fascinating. Round number bias. I’ll revise my answer to 77.3

I think “assuming a placebo is ethical” is a given.

Mollie: I have a quick possible-digression about this, prompted from a twitter thread the other day.
Are we too quick to say “setting ethics aside” or similar?

Noah: Probably, but for now, let’s set ethics aside 🙂

Mollie: OK, but I’m gonna harp on this later

Noah: Noted, harping will occur. We need to do some defining before we go much further. Who wants to give a definition of what “late phase clinical trial” actually means?

Emily: I equate late phase trials to Phase III / IV clinical trials. The Phase III/IV delineation is commonly used in the drug development and regulatory space. It indicates something about the size of the study and the purpose of the study. These later phase trials are larger (e.g. 300 to 3,000 people perhaps) and are meant to show efficacy and monitor potential adverse outcomes.

Boback: For prescription or device therapy it’s typically a phase III trial meant to evaluate efficacy or “non-inferiority.”

Emily: To put it simply, it is a trial to demonstrate whether a new ‘treatment’ is as good as or better than the existing standard of care.

Andrew: A layperson’s definition, maybe, is “the trial that would be strong enough evidence to make people start using the drug if positive.” I’ll go with “Phase III / pivotal trial that would grant drug approval” for pharmaceuticals in development / not yet FDA approved.

Mollie: Maybe also good to note that the trial could be for a new indication for an existing approved drug or device, right?

Emily: Good point

Andrew: Agree, Mollie.

Noah: What’s the usual logic behind placebo controls?

Boback: Control arm by default is always usual care +/- placebo or sham control.

Emily: For now, this isn’t a debate about whether to use a placebo or other control. This assumes the standard of care is currently nothing or something without evidence.

Why do we use placebos? There are so many good reasons!

Noah: Gimme a list!

Boback: Participants and study administrators are blinded to treatment arms.

Andrew: Generally, avoidance/minimization of bias (in assessors and participants). So participants report their symptoms honestly without knowledge of their treatment assignment. And likewise, assessors treat the patients the same / assess the outcome the same way. Rather than letting their knowledge of what the patient is getting influence them in some way

Boback: Allow for equal Hawthorne effects in both arms. The Hawthorne effect is the realization that by studying people in a research setting their behavior may naturally shift. Perhaps they become more adherent and avoid toxic habits like smoking etc. with study participation.

Emily: And, we want to avoid the ‘placebo effect’. We are all susceptible to feeling better when we are given something. For example, your toddler feels better when you kiss their ouchie.

Andrew: Boback, don’t use such big words. This is a chat for the people

Mollie: And disease course changes over time- we want to know if it changes more for active treatment than control. Or less! sometimes treatment halts progression

Noah: Imma start off my devil’s advocating strong by going after Andrew Althouse on the “bias” part. Or at least get more specific. Bias against what?

Mollie:  Here’s one: if you’re a scientist and you’ve started a drug trial, you probably believe your drug works

Emily: Assessor or participant bias is a major concern when the outcomes of interest rely heavily on self-perception. For example, is your pain better or worse? Can you concentrate at school more easily? Have your child’s motor or language skills improved? I personally feel less concerned about assessor bias when the outcome of interest is static/objective/easy to measure. For example, mortality is a clear study endpoint. It’s harder to imagine bias creeping into this assessment.

Boback: The consent process for a study typically requires full disclosure of what the study is designed for. “We are evaluating fish oil to see if you don’t develop coronary artery disease. If you consent, we will randomize you to treatment or placebo for the next 5 years.” If you consent someone and say “sorry, you didn’t get the drug, we are going to just give you nothing and check in on you every 6 months,” the participant may say forget you, I’ll buy over-the-counter fish oil myself. Or they may feel so depressed that their stress hormones go up and they eat more Cheetos, which increases their coronary artery disease risk.

Noah: Boom, let’s use Boback’s example. Do love me some Cheetos.

So, let’s give a scenario. I am treating a patient (which should NEVER HAPPEN) for coronary artery disease. I have heard of such a thing as a “placebo effect.” Now I decide what treatment to give you. Drug A or ______. Where _____ is almost never placebo.

Andrew: You can play doctor in this chat, Noah. I am pretty pro-placebo-controlled-trial, but I see where Noah’s going with this.

Noah: My job is, in general, to pick the thing that is most likely to make my patient better. Say the clinical trial was randomized, placebo controlled, and double blinded. That doesn’t look AT ALL like what my clinical decision is like. Because the patient’s response in the real world INCLUDES all of those things we just got rid of. Correct?

Emily: That’s true – clinical trials look very different than clinical practice. And that’s the point! We didn’t eliminate the real world in a clinical trial. We made sure the ‘real world’ things happening aren’t the causes of your good/bad health.

Boback: The trial isn’t meant to mimic reality, it’s meant to neutralize confounding factors and estimate a direct treatment effect.

Andrew: Right. Your point is, the real world is either “We’re going to start you on this drug” or “Have a good year. We’ll see you next year”

Mollie: Hopefully, we eliminated the clinician saying “patient A is really sick, he’d better get active treatment. Patient B is doing great, let’s give him the placebo.” Wait, the drug is killing people, what happened?

Andrew: So, Boback is opening the “efficacy or effectiveness” door

Emily: An efficacy clinical trial is meant to know if a treatment works or not (in the ideal setting). In contrast, an effectiveness trial is meant to know if the treatment will work in the real world context.

Andrew: Difference between “try to figure out if this drug is biologically active” and “will adding this drug to clinical practice be a net benefit / net harm”

Boback: Physicians and hopefully patients want to know treatment effects of therapies. “What happens to this patient in front of me if I treat them with X?”

Emily: So here we’re talking about ‘late phase’ clinical trials – efficacy trials. Where we are trying to learn if “the drug is biologically active”

Andrew: In a placebo-controlled trial, we avoid the bias introduced by participant/assessor knowledge of what the patient is getting, and get a good estimate of the “true biological effect” of the drug. But Noah’s point is that people acting differentially based on knowledge of whether they’re getting a drug or not is going to be part of what happens in the real world
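The efficacy-versus-effectiveness distinction above can be sketched with a toy simulation. All effect sizes here are made up purely for illustration: we assume an observed improvement decomposes into a biological effect (drug only), a placebo response (anyone who believes they may have been treated), and noise. A drug-vs-placebo comparison isolates the biological effect; a drug-vs-nothing comparison bundles in the placebo response.

```python
import random

random.seed(1)

# Hypothetical numbers for illustration only.
BIO_EFFECT = 3.0        # assumed biological effect of the drug
PLACEBO_RESPONSE = 2.0  # assumed effect of believing you got something
N = 50_000

def improvement(got_drug, believes_treated):
    """Observed improvement = biology + belief + noise (toy model)."""
    return (BIO_EFFECT * got_drug
            + PLACEBO_RESPONSE * believes_treated
            + random.gauss(0, 4))

def mean(xs):
    return sum(xs) / len(xs)

drug    = [improvement(1, 1) for _ in range(N)]  # drug arm (believes treated)
placebo = [improvement(0, 1) for _ in range(N)]  # placebo arm (believes treated)
nothing = [improvement(0, 0) for _ in range(N)]  # no-treatment arm

# Placebo-controlled contrast recovers ~ the biological effect;
# "drug vs nothing" recovers biology + placebo response combined.
print(f"drug vs placebo: {mean(drug) - mean(placebo):.2f}")
print(f"drug vs nothing: {mean(drug) - mean(nothing):.2f}")
```

Under these assumptions the first contrast lands near 3 and the second near 5: both are "real" differences, but only the placebo-controlled one tells you whether the molecule itself does anything.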

Noah: Exactly. And all we care about in the end is what happens to our patients

Andrew: The question, then, is for a late-phase trial where the “fate” of a new drug hangs in the balance, which estimate do we care about more

Emily: But we can’t know if it’s a good idea (e.g. efficacious, safe) to proceed to the ‘real world’ until we have some evidence!

Andrew: Right, that’s why I have trouble fully embracing Noah’s idea

Noah: Isn’t that evidence enough? If it doesn’t work because of some effect of knowing what treatment you are on, won’t that happen to all patients too?

Emily: Noah, it’s not that it doesn’t work because you know you’re on treatment. It’s that you might ‘feel better’ thinking you got a treatment

Mollie: Is this still a frictionless plane?

Noah: Friction mode restored.

Mollie: Well, treatments have costs. Trying treatment A means you didn’t try treatment B. And approving treatments that work because of positive thinking means resources are going to those treatments that would be better spent on others

Emily: Mollie brings up a new point here – we have to allocate resources. Doctors have to make choices. Health systems have to provide supplies. How do we make those choices?

Mollie: I don’t want to go into a whole cost-effectiveness Thing here, I just wanted to point out that introducing a drug into the formulary that genuinely does nothing is not a neutral act– it means that patients who might have benefited from a different drug instead go untreated, and finite resources are wasted.

Noah: Knowing you got a treatment is free. If it makes a big improvement on outcomes, aren’t we losing something if we don’t include knowing you got treatment in the cost-effectiveness analysis?

Emily: Everyone in a placebo-controlled trial thinks they might have gotten the treatment!

Boback: We want to estimate the treatment effect of the intervention without the belief in the intervention. Most trials probably bias towards the null based on the intention-to-treat principle: we quantify treatment effects based on the group a person was randomized to, not whether they adhered to treatment. In almost all trials, patients are not fully adherent or drop out at some rate, which makes the estimated treatment effect not exactly what we hoped to measure. Beneficial effects, if present, are underestimated.

The whole concern with all the randomized trials of stenting for angina was that the control arms were never blinded to the procedure until the ORBITA trial was published last year. Prior to ORBITA, trials compared invasive angiograms to medical therapy and claimed stenting improved symptoms more. ORBITA, however, did angiograms on everyone, and patients did not know for 30 days whether they had received a stent. When they were unblinded, no significant difference was noted in exercise time or other primary endpoints.

Same with sham arthroscopy of the knee. Patients naturally get better post-procedure, and the treatment effects were confounded by the act of receiving a procedure itself.

Andrew: I think we need Noah to go into more detail about the direction that he believes this works. Because we’re all pretty clearly grounded in the idea that using a placebo is meant to wash out the “knowledge you got a treatment makes you feel better” effect.

Noah: Ok, let me lay out the case. The central idea is that we SHOULDN’T wash out the “knowledge you got a treatment makes you feel better” effect. Because that effect is part of (maybe a HUGE part of) the total effect the clinician and patient face when they decide to treat or not to treat. So not including it is its own kind of bias, in the context of the treatment decision.

Emily: This can be captured AFTER we know whether there is a meaningful biological effect of the treatment itself. For example: if you ‘feel better’ after consuming arsenic / rat poison — should we give it to everyone? NO! It’s dangerous!

Noah: I don’t think I want to be in that trial

Emily: Me either, and this is why we need to carefully prove there is a ‘real’ biological effect of a treatment by using a placebo/control

Andrew: Using Boback’s example: if sham-controlled trials show that knee arthroscopy’s benefit is entirely explained by “knowledge that you got a procedure,” then why bother doing any knee arthroscopies? Just do sham procedures and save everyone the trouble. Send them to the OR, have everyone stand around looking serious, give them a local anesthetic, stand there awhile, tell them the procedure went well and we’re good. The same thing could be applied to drug trials – if the drug can’t outperform a sugar pill as placebo (even if part of the benefit is “belief that they’re getting a new drug makes them feel better”), why bother with the expensive drug? Just give them Placebonium.

Emily: Well said

Boback: So Noah are you saying you want to include the “placebo effect” as a benefit for a prescribed treatment for a patient?

Noah: I’ll say yes, or at least there’s an argument to be made for it.

Emily’s point about separation is important. If we can know both, separately, we obviously should. Practically, though, do we ever have the time and money to run these giant, expensive trials both ways (with a placebo control and with a “do nothing” control)?

Boback: Well, then it’s probably just worth quantifying the placebo effect. Compare placebo to no treatment and to active treatment.

Noah: The “placebo effect” changes for every treatment. If we can only do it one way, shouldn’t we do it the way most relevant for the clinical decision?

Boback: There are plenty of reasons why randomized trials are expensive and time consuming to perform. I do not think the placebo issue is the main problem. There’s more to be said for using our statistical understanding better and building a pragmatic trial infrastructure.

Emily: Maybe we can talk about other reasons you wouldn’t want to use a placebo? Andrew and Mollie said 80%. What’s happening in the other 20%?

Noah: I am the 20%.

Andrew: Re: when a placebo control wouldn’t be needed: I think basically any Phase III (drug approval) study should be placebo controlled (i.e. either there is no accepted therapy for the condition so we’re truly talking therapy vs. nothing, or if there’s an accepted therapy, the patient is blinded so they don’t know if they’re getting standard-of-care or new-experimental-drug, which may or may not require a “placebo” to achieve said blinding depending on what SOC is). I’m a *little* more ambivalent for studies of drugs that are already approved or in use.

Boback: The other point is that placebos are probably not necessary for “hard endpoints.” For all-cause mortality, it’s hard to imagine that getting placebo vs. not getting it would influence your risk of dying.

Andrew: But I guess that depends if we’re including real-world, head-2-head CER trials of existing things as “late stage trials”

Emily: I tend to agree Boback. My only caveat — in my line of work — there aren’t vital registration programs and we rely on research staff to find ‘missing’ patients. So some people still worry about staff/assessor bias. For example – this child got treatment, so maybe they didn’t come to clinic because they are on vacation. Whereas, this child got no treatment, so maybe they didn’t come to clinic because they’ve passed away – I will go visit the household to find out.

Mollie: Yeah, I think my 80% comes from my bias as a postmarketing surveillance researcher- none of the treatments I deal with are pre-approval

Noah: What’s different about postmarketing surveillance?

Mollie: So I work almost entirely on drug safety in pregnancy, and most of the time, the relevant clinical question is “should this woman planning to get pregnant discontinue her methadone or switch to buprenorphine?” or similar.

This is maybe a little too detailed, but there have been SO MANY studies of antidepressant safety in pregnancy, some showing harm, some not, that I am almost ready to say you could ethically do a placebo controlled trial to get the right answer, but man, clinicians do not seem to agree.

Emily: Mollie this is a great point. It can be quite controversial as to whether or not there is equipoise (genuine uncertainty about which treatment is better) for a placebo.

Mollie: Equipoise is hard! You have to be truly unsure about the possible benefit or risk of the drug.

Mollie: I think in the postmarketing space, equipoise for placebo control is almost never really there.

Andrew: Mollie’s last comment brings up an interesting example, which makes me wonder if we’re just talking about “placebo” or more broadly about “blinding.” I’ve seen some h2h CER trials of 2 active drugs that didn’t look like one another, where both arms had to take their active drug plus a placebo that LOOKED like the other one to achieve full blinding. So is our issue just about using “placebo controls” where the choice is “something versus nothing,” or is it a broader debate about making sure the participant/assessor isn’t sure who’s getting what, even in the setting of 2 active agents? i.e., if one active drug is dosed once daily and the other twice daily, each arm had to take a “placebo” on the schedule of the other drug so they wouldn’t know which drug they were on

Boback: Blinding is very frequently broken in trials. If you are getting a cholesterol-lowering drug, it’s hard not to realize your lipid values are dropping on follow-up testing.

Emily: If this is the case, then I move my original estimate to say that I think that trials should be blinded 95%+ of time!

Noah: Wait! I go the other way!

Emily: You do?!

Noah: If blinding/placebo is going to be broken anyway, why are we doing it in the first place?

Andrew: It sounds like the world Noah describes is that these trials shouldn’t be blinded.  

Boback: I’ve always wished trials routinely reported, at the end, what proportion of participants believed they received active treatment.

Mollie: It would be nice if it were routine

Andrew: But that brings up something Boback said at the very beginning. If you’re in a high-mortality space (cancer) and the patient isn’t blinded, they’re probably walking out of the trial immediately.

Emily: Yes, agree with Andrew and Boback – sometimes including a placebo means that you can’t recruit a representative sample.

Boback: But it’s always interesting to see all the side effects in placebo arms.

Andrew: In theory, patients shouldn’t be enrolling in RCTs to get access to experimental treatments, but they’re still probably not hanging around that trial once they’re assigned & told they’re in the placebo arm.

Boback: If someone is going through the trouble of running an RCT, there’s probably a need or a large market for what they are proposing.

Emily: My #1 practical reason for including blinding/placebo/control is that it’s the hallmark of high quality evidence. And in order for evidence to make it through regulatory / policy processes – it needs to be high quality! And why are we generating evidence if not to change policy and practice?! Thus, I vote placebo for president.

Mollie: I’d prioritize randomizing and blinding over placebo. Unless it’s a totally new drug to treat a disease with no current treatment

Emily: Agree with Mollie. If we’re in the real world – then a placebo is likely not appropriate in many cases, for ethical reasons. (A placebo isn’t ethical when there is an existing treatment or practice that is either recommended by governing/regulatory bodies or is commonly practiced by physicians.)

Andrew: Right, a “placebo” is kind of inextricable from blinding. If a placebo isn’t needed for “blinding” then fine – no placebo. But even in some trials with 2 active agents the placebo is needed to preserve the blind. (the earlier example of one drug given 1x daily vs another given 2x daily)

Mollie: Yeah, no argument there

Andrew: So a placebo’s main function is a means to preserve blinding

Noah: And to my point earlier, my argument is really more generally against blinding, by way of placebo controls

Emily: To preserve blinding AND to account for placebo effect. (I think they are two separate points?)

Andrew: Right, Noah just thinks that blinding is problematic because it doesn’t = “real world treatment effect”

Noah: Yup, and in the end, the “why” doesn’t matter so much as what you get in the end

Mollie: Noah, are you happy with a pre/post measurement on just the treated group?

Andrew: (shrieks in horror)

KILL IT WITH FIRE

Mollie: Nuke it from orbit. It’s the only way to be sure.

Mollie: But seriously, that’s the effect a doc sees when they treat a patient, right? In the real world.

Mollie: …historical controls?

You’re scaring me, man.

Andrew: I reluctantly admit that I’m giving more thought to historical controls as a viable option in some situations, though the statistician in me still hates it

Noah: HA

Emily: I am also thinking a lot about historical controls!

Noah: This is probably why I’m not a real doctor, which we can all be thankful for. But to clarify, my version is to control against “don’t treat at all,” which, going back to the earlier point, would be tough to recruit with as an option. I mean real, current trial controls, where half the people just don’t get treated. But sometimes historical controls can be useful. . .

Mollie: Nooooooo, guys

Emily: In the context of adaptive trial design … it starts to make some sense.

Andrew: Specifically, very-high-mortality where there is no known effective treatment option, with novel device/drug as the only real option available. I might be able to stomach comparison against historical controls. This is a bit off topic, perhaps can be picked up in a future chat

Noah: Let’s switch topics a bit, and talk about the research science meta-verse

Mollie: My favorite meta-verse!

Noah: Then let’s get super meta. Imagine all of our past research had been done with placebo / blinded controls vs. none of it was. What might we know more or less of now? Would we understand more about biological mechanisms?

Andrew: if we just replaced all placebo controlled trials that have been done with Noah Haber style trials?

Noah: Exactly. What would we gain/lose. Except lose, because Noah style trials are perfect <editor’s note: I regret tacitly agreeing that this should be called “Noah Haber style”>

Emily: There would be even more molecules approved for treatment of depression. (This is an area of research known to be especially sensitive to the placebo effect).

Mollie: I assume we’d be using Zicam for cancer treatment

Noah: But only if believing in zicam (to do anything at all) had an actual clinical effect, right?

Andrew: I mean, we’d certainly know less about true biological effects

Noah: True. So, quick recap: the main argument against placebo controls is super tied in with the idea that we also shouldn’t blind, because we are sufficiently far from real world conditions (which include placebo effects) that our measured effects aren’t realistic. HOWEVER:

Andrew: With placebo control (and blinding), we can be *reasonably* certain that the effect observed in the trial is actually an effect of the drug being tested and not simply the effect of feeling like you got something better. Without placebo control (i.e. the trial is “drug versus nothing”), in theory you may get a better estimate of the “real world” effect, since that’s what the “real world” will be (drug or… whatever else), but you run the risk of patient dropout and other behaviors influencing their outcomes.

Mollie: I see two major ethical concerns. First, approving treatments that don’t “work” (beyond the placebo effect) takes away opportunities for patients to be treated with drugs that DO work. Second, resources are finite and we shouldn’t be spending limited funds on ineffective treatments. Removing placebo controls risks violating both of these. (Third, I do not think we need to help pharma companies any more than we already do.)

Boback: The only proper way to measure treatment effects is to keep patients and clinicians in the setting of an RCT blinded to the intervention to avoid introducing confounding factors. RCTs are meant to measure treatment effects. Randomization is our best tool for breaking confounding and for most endpoints, blinding is required to preserve patient and clinician behavior. It also allows for objective measures of adverse events/side effects.

Noah: Great, ok, last thing! In an ideal world, in a frictionless plane in a vacuum, on a scale of 0 (never placebo) to 100 (always placebo), should late phase clinical trials control the treatment of interest against a placebo?

Boback: 90

Noah: 45

Andrew: I’ll stick with my 80. Majority of the time. But there may be some settings where I could be convinced that a non-placebo-controlled design is appropriate

Mollie: In this frictionless plane, are there other treatments available?

Noah: Yes, but also frictionless.

Mollie: Then I’m sticking with my 80

Emily: (I’m still unsure if placebo means control here) But I’m increasing to 96% assuming we’re talking placebo/control!

Noah: Ha. I’ll call that a day! Thanks y’all!

Emily: Thanks, friends!

Mollie: Thanks everyone!

Andrew: Thanks everyone. this was fun, good to kick things around with other smart people.

Boback: Thanks for including a manatee.

Boback Ziaeian is an Assistant Professor at UCLA and the VA Greater Los Angeles in the Division of Cardiology. As an outcomes/health services researcher and cardiologist his primary interest is improving the receipt of high-value care and reducing disparities for cardiovascular patients. Tweets @boback.

Andrew Althouse is an Assistant Professor at the University of Pittsburgh School of Medicine.  He works principally as a statistician on randomized controlled trials in medicine and health services research.  Tweets @ADAlthousePhD

Mollie Wood is a postdoctoral researcher in the Department of Epidemiology at the Harvard School of Public Health. She specializes in reproductive and perinatal pharmacoepidemiology, with methods interests in measurement error, mediation, and family-based study designs.  Tweets @anecdatally

Emily R. Smith is a Program Officer at the Bill & Melinda Gates Foundation and a Research Associate at the Harvard School of Public Health. Her research focuses on the design and conduct of clinical trials to improve maternal and child health globally. Tweets @emily_ers

Noah Haber is a postdoc at UNC, specializing in meta-science from study generation to social media, causal inference econometrics, and applied statistical work in HIV/AIDS in South Africa. He is the lead author of CLAIMS and XvY, blogs here at MetaCausal.com, and tweets @NoahHaber.

Edit notes: Made an edit to change “late stage clinical trial” to “late phase clinical trial” to clarify that this was not specific to late-stage cancer trials.

Thoughts and comments welcome