Contra SSC on Aduhelm
Even though I’m arguing against Scott’s recent post, I want to preface this by saying I don’t really disagree with his broad conclusion all that much, which is that if you could only increase or decrease the strictness of the FDA, broadly defined, you should probably decrease the strictness. However, I think the story is complicated and if you want the best reforms, you need a better sense of where the FDA has gone right and wrong in the past. Also, I completely agree that regarding COVID-19, the FDA has been far too slow on approving testing, vaccines, etc.
TL;DR: Aduhelm sucks, biomarkers often suck, approving/paying based on biomarkers creates bad incentives, confirmatory trials often don’t take place, and I’m skeptical that tiered healthcare will remain tiered in the long run.
Aduhelm almost certainly doesn’t work
Scott makes the argument that the much-criticized Aduhelm approval is not quite as bad as people say and the “FDA is too strict” argument, summarized aptly in many of Alex Tabarrok’s posts on the FDA, is still strong. I disagree strongly with the first point and have some quibbles with the second point.
Though he acknowledges that Aduhelm is likely a bad drug, I worry he understates this, and I want to give a rundown of the reasons the Aduhelm approval is a bad idea. Note: I’m not exactly sure what Scott’s belief on Aduhelm working precisely is, since he says “it’s pretty unclear whether it actually treats Alzheimer’s… bad drug that won’t work”. So I might be arguing against a strawman. Apologies if that’s the case.
Timeline of Aduhelm approval
The two Phase III studies that Biogen undertook to get Aduhelm approved had conflicting results. Initially they were stopped early because of futility, and then Biogen reanalyzed the data and found some statistically significant but relatively modest clinical benefit in one trial and no statistically significant benefit in the second trial. Already, this should set off alarm bells for p-hacking…
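To make the p-hacking worry concrete, here is a minimal sketch of how the chance of a spurious "significant" result grows with the number of analyses. It assumes the analyses are independent and that the drug truly has no effect. Neither assumption is established for the Aduhelm trials, and the function name is just illustrative.

```python
def prob_at_least_one_false_positive(num_analyses: int, alpha: float = 0.05) -> float:
    """Chance that at least one of `num_analyses` independent tests comes out
    'significant' at level `alpha` when the true effect is zero."""
    if num_analyses < 1:
        raise ValueError("need at least one analysis")
    return 1.0 - (1.0 - alpha) ** num_analyses


# Hypothetical counts of analyses, for illustration only.
for k in (1, 2, 5, 10):
    print(f"{k} analyses: {prob_at_least_one_false_positive(k):.1%}")
```

The more ways the same data get sliced after the fact, the less a single p < 0.05 tells you.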
Biogen approached the FDA with these results, and the FDA Advisory Committee (AdCom), which is composed of respected subject-matter experts who don’t work for the FDA and which issues nonbinding decisions, rejected the drug. At that time, Biogen and the FDA told the AdCom that they were still seeking a regular approval, which effectively means you need two positive Phase III trials that show a clinical (not just a biomarker) benefit.
Under some murky circumstances involving very close collaboration with the FDA reviewer of the drug, Biogen decided to go for the Accelerated Approval pathway instead, without telling the Advisory Committee beforehand or giving it a chance to vote on that issue.
Accelerated Approval effectively means that you just need to demonstrate a benefit in a surrogate endpoint, which means some biomarker that predicts clinical benefit, after which the drug is approved, and then confirm a clinical benefit with follow-up studies. If your confirmatory studies don’t find a benefit, your drug’s approval is revoked, and if you take too long to run a confirmatory trial, approval is also (theoretically, though not always in practice) revoked.
So the argument Biogen is making with Aduhelm, which the FDA presumably agreed with, is: “reduction of amyloid plaques is a good surrogate endpoint.” That is, they’re saying “reducing amyloid plaques will probably cause a clinical benefit”.
The problem with that argument is that there is strong evidence against that claim. Not just weak evidence for it, but in fact, strong evidence against that claim. How?
Aduhelm is merely the latest in a long line of amyloid-directed therapies that have flatly failed in clinical studies. From this helpful Reddit comment: semagacestat, bapineuzumab, solanezumab, gantenerumab, crenezumab, verubecestat, lanabecestat, atabecestat, umibecestat, & elenbecestat.
Several of these failed drugs successfully reduced amyloid—but did not slow cognitive decline. So in addition to the high skepticism we should have for a drug that fails 2 Phase III trials, we should recall that amyloid-directed therapies have done nothing but fail for something like 20+ years, and be extra skeptical of Aduhelm.
In conclusion, Aduhelm almost certainly won’t work. I’m still working out how much $$ I’ll put on this and how the bet would resolve, but if you give me a week or two, I’ll offer 1:5 odds to the first 3 commenters on this drug not showing a clinical benefit if confirmatory trials ever get run.
To me, the biggest downside of the Aduhelm approval is the absolutely perverse set of incentives it creates. It will resurrect the amyloid hypothesis for other drug companies, as this story demonstrates. It will drain attention from more promising drug targets, like tau tangles or aging itself, that have not been conclusively rejected in multiple Phase III trials. The slow-moving ship of scientific consensus, which had been painfully turning away from amyloid, is now being pushed back towards it by an approval that will waste enormous amounts of $$ and researcher talent.
AIDS and the FDA
A rhetorically important part of Scott’s argument against a stricter FDA is the idea that the loosening of FDA standards for AIDS drugs was, on net, a great idea. This story is murkier than Scott (or the Atlantic article) portrays. I don’t blame him for telling the standard story on this, because the only places I’ve found the full story are Daniel Carpenter’s “Reputation and Power”[1] and this post from Treatment Action Group, an advocacy group of AIDS activists who pushed for stricter drug approval standards. Other sources gloss over the ambiguous results.
My previous post on Reputation and Power has a longer account, but I’ll just summarize. Basically, back when AIDS was a death sentence and there were no effective treatments for it, AIDS activists successfully lobbied the FDA to publicly loosen drug approval standards in several ways. Even before this lobbying occurred, however, the FDA (specifically a rising star there, Ellen Cooper) had modified the trial procedures for AZT, the first somewhat effective drug for AIDS, as follows:
They allowed 1 confirmatory trial, instead of requiring 2
They moved quite quickly on review—they approved the IND in a week.
The trial was halted early due to clear efficacy, and as soon as that took place, and before FDA approval officially finished, the drug was released under Compassionate Use to 4000+ AIDS patients.
More pressure from AIDS activist groups like ACT-UP and their political supporters followed that approval, and the FDA eventually allowed for surrogate approval of some future AIDS drugs based on improvements in CD4+ cell counts through the Accelerated Approval pathway, which it formalized in 1992. This pathway would only require surrogate endpoints (biomarkers) to get approval, but required eventual confirmatory studies.
This pathway was used to approve multi-drug treatment of AIDS (AZT + X-drug), and both ddC and d4T were approved under it.
ddI had been approved through this method earlier and a confirmatory study had shown a modest clinical benefit. All was going well.
However, ddC came along. It got approved for combination therapy with other HIV drugs because of improvements in a surrogate marker. Roche, its sponsor, never carried out the promised confirmatory trial, and later independent confirmatory studies showed the combination was no better clinically than monotherapy with either agent alone.
A group of HIV activists got upset by this turn of events, formed a group called Treatment Action Group (TAG), and tried to pressure the FDA and drug companies into following through on their commitments. Their argument was that aggressive Expanded Access programs and even approval based on surrogate endpoints were fine, as long as the promised confirmatory studies, which, to remind the reader, measure what we actually care about, got done in a timely manner by the drug sponsor.
What was their reasoning?
They realized that surrogate endpoints meant very little unless there was also a clinical benefit. To make this concrete: the CETP inhibitors increase HDL cholesterol (“good” cholesterol) and decrease LDL cholesterol. They have great biomarker benefits! But Phase III studies that looked at clinical outcomes found they either increased mortality or had no benefit. It doesn’t matter if all the biomarkers are going in the right direction; you need to show a benefit in meaningful outcomes! Niacin is another example of this, as Attia and Dayspring cover.
TAG’s open letter to the FDA and drug companies does a good job explaining their reasoning.
The result of that letter was that drug companies agreed to conduct more traditional trials, which generated much clearer clinical data—does adding this drug actually improve survival or just improve CD4 counts? Importantly, these clear trials coexisted with relatively generous Expanded Access programs, so that people ineligible for the trials could receive the drug outside of the trial as well.
I’m not sure what the current consensus on CD4+ cell counts as a surrogate endpoint for HIV drug trials is, but it is worth noting that having multiple trials that measure both surrogate and clinical endpoints should actually make us more willing to use that surrogate endpoint in the future. Why? Because at that point we’ve confirmed that the specific surrogate endpoint actually works. So if it turns out we use CD4+ cell counts as a biomarker for approval nowadays, I wouldn’t be surprised, because we’ve got multiple trials showing it correlates quite well with a clinical endpoint.
How much faster is accelerated approval?
A recurring argument of the “we need a stricter FDA” crowd, BTW, is that Accelerated Approval (AA), in practice, doesn’t speed up approvals by that much, at least in cancer. In the discussed article, the point estimate for cancer drugs is 11 or 19 months saved out of ~7 years, depending on the surrogate endpoint used. This speed-up comes with prolonged clinical uncertainty, since after approval we still need to wait for confirmatory trials to see if a drug really works. I haven’t seen estimates of how much longer it takes to figure out if drugs approved through AA actually work, but a few years seems like a reasonable guess.
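As a quick back-of-the-envelope check on those figures, the sketch below expresses the months saved as a share of the timeline. It assumes the ~7 years is the full baseline development time and treats it as exactly 84 months. The function name is illustrative.

```python
MONTHS_PER_YEAR = 12


def fraction_of_timeline_saved(months_saved: float, baseline_years: float = 7.0) -> float:
    """Share of the baseline timeline that the speed-up represents."""
    return months_saved / (baseline_years * MONTHS_PER_YEAR)


# The two point estimates quoted above, one per surrogate endpoint.
for months in (11, 19):
    share = fraction_of_timeline_saved(months)
    print(f"{months} months saved is about {share:.0%} of a 7-year timeline")
```

Under that assumption the speed-up works out to roughly 13% and 23% of the timeline, before counting the extra years of waiting for confirmatory results.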
In addition, in practice, confirmatory trials are often done very slowly or not at all, and the FDA is quite slow to take drugs off the market once they’ve approved them, even if the company never carries out its confirmatory commitments. Malignant is a book-length argument on this topic, focusing specifically on cancer drugs, and contains lots of interesting arguments besides.
If, in practice, Accelerated Approval means not just quicker access to drugs but also slower access to definite answers on clinical questions and some chance of no definite answer, the cost-benefit gets murkier.
Agree on COVID-19
The best part of Scott’s argument is this:
Here’s another good example: coronavirus vaccines. The FDA still has not fully approved any coronavirus vaccine. The only reason you’re allowed to get vaccinated at all is because of a fast-track provisional approval somewhat like the one used for aducanumab. Coronavirus vaccines have probably also averted a few hundred thousand deaths.
So without wanting to say this level of success is “the norm” in the sense that every single fast-tracked drug achieves it, it’s not exactly vanishingly rare. It’s just something that happens sometimes and doesn’t happen sometimes. So how often do you have to save hundreds of thousands of lives before it’s worth the risk of occasionally also permitting a dud medication that “offers false hope”? How is this even a question?
This is a very good all-purpose argument and I mostly agree. Even conservative assumptions on a faster COVID-19 vaccine approval yield many lives saved and likely a quicker economic recovery that’s worth many bad drug approvals.
Still, COVID-19 vaccines were approved based on clinical endpoints (prevention of severe/symptomatic infection), not biomarkers, and this could be done much faster without using uncertain biomarkers—through human challenge trials, for instance. If we had used challenge trials, I don’t think using biomarkers would have sped things up by much: if we had to wait an extra two weeks for volunteers to get sick or not, as opposed to immediately measuring their antibody levels and going off that, that doesn’t mean much if manufacturing is still scaling up a few weeks later.
Accelerated Approval creates bad incentives
I want to add a wrinkle to the cost side of the equation that I think is underappreciated: the bad incentives that the availability of Accelerated Approval creates. The bad incentive is that instead of drug companies only getting to make $$ after designing a drug with a clinical benefit, they now get to make $$ after making a drug with just a biomarker benefit.
To be fair, this incentive is really the fault of the bundling of FDA approval with payment, and so Scott’s unbundling proposal (quite similar to the Niskanen Center proposal by Briggeman and Gulfo) would go some ways to fix it. Cost-benefit analyses that properly price in the uncertainty of drugs approved through accelerated approval would also help.
Still, to the degree that making drugs with just biomarker benefits is easier than making drugs with clinical benefits, and we pay for most approved drugs, this incentivizes the development of drugs that are useless. I don’t know how to quantify this, but over the long run this seems like a big deal, and hard to capture in a cost-benefit analysis that doesn’t account for long-run changes in Pharma R&D strategy.
Nobody likes tiered healthcare
I think Scott underrates how controversial more visible “tiers” in healthcare will be. Here’s a case study: The UK NICE is a marvel of unemotional technocracy—it calculates the cost-benefit of different drugs and the UK healthcare system sets limits on how much $$ they’re willing to pay per QALY. Patients can then pay for less cost-effective treatment on their own if they want.
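The cost-per-QALY rule described above can be sketched roughly as follows. This is a simplified illustration, not NICE's actual methodology: the drug names, costs, QALY figures, and the threshold value are all hypothetical numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Treatment:
    name: str
    cost: float   # total cost per patient (hypothetical currency units)
    qalys: float  # expected quality-adjusted life years per patient


def cost_per_qaly_gained(new: Treatment, standard: Treatment) -> float:
    """Incremental cost divided by incremental QALYs versus standard care."""
    extra_qalys = new.qalys - standard.qalys
    if extra_qalys <= 0:
        # No added benefit: treated as never cost-effective in this sketch.
        return float("inf")
    return (new.cost - standard.cost) / extra_qalys


def payer_covers(new: Treatment, standard: Treatment, threshold_per_qaly: float) -> bool:
    """True if the new treatment falls at or under the payer's threshold."""
    return cost_per_qaly_gained(new, standard) <= threshold_per_qaly


# All figures below are made up for illustration.
standard_care = Treatment("standard care", cost=10_000, qalys=1.0)
pricey_drug = Treatment("pricey drug", cost=60_000, qalys=1.25)
print(cost_per_qaly_gained(pricey_drug, standard_care))
print(payer_covers(pricey_drug, standard_care, threshold_per_qaly=30_000))
```

A drug like the hypothetical one here adds a quarter of a QALY for 50,000 extra, so it lands far above the illustrative threshold and the public payer declines it, leaving patients free to buy it themselves.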
This has led them to reject a bunch of super expensive and low-value cancer drugs, which is great news for the UK taxpayer. But in 2009 and again in 2011, the NHS effectively loosened those cost-effectiveness guidelines. I’m not sure why that happened but a reasonable inference is that it may not be politically sustainable to have these kinds of headlines constantly.
So I’m somewhat skeptical that a more explicitly tiered US healthcare system is politically feasible. If we accept that as a possibility, the cost-benefit for a looser FDA + payment reform gets murkier still.
So I think Scott underrates how strong the pressure will be for all payers to cover even the most questionable drug.
Other FDA reform ideas
Some FDA reform ideas that I’m interested in: Instead of loosening the pipeline[2] of drugs on the drug approval end, open it up on the front end, the IND (Investigational New Drug) end.
Make it easier to get drugs into clinical trials, but keep the standards high for fully approved (or in Scott’s payer reform world, Tier 4 and 5) FDA drugs. I have some speculative arguments in favor of this that I’ll expand on in a later post, but they boil down to:
Animal studies for safety/efficacy have an imperfect relationship with human efficacy and may filter out drugs that are efficacious/safe in humans. Sarah Constantin puts it elegantly here:
and Gwern has written about this in more general terms.
Physicians randomly trying out drugs on patients is a pretty good way of discovering new uses for drugs, which this paper tries to explain in decision theory terms. How can we have more of that old-school serendipity? Of course, we’d want that in combination with incentives for good RCTs so that the medical evidence doesn’t stay at the level of “here we present a case series of 145 patients…”
[1] Daniel Carpenter and Tyler Cowen had a great conversation on the FDA and COVID-19 a few weeks ago. IMO Tyler came out the victor in their debate over public confidence vs quicker vaccine access, but take a listen.
[2] I owe this analogy to a career scientist who wants to remain anonymous.
A correction on point 1 in your timeline:
Biogen's analysis, which showed a positive result in the first trial (EMERGE) in all of its primary and secondary cognitive endpoints, and a negative result in the second (ENGAGE), was done on the initial, pre-designated endpoints. It was only a "reanalysis" in the sense that it (rightly) used all of the data collected during the trial (through March 2019), as opposed to the futility analysis, which used data collected only up until December 2018, when that futility analysis began. Note that the futility analysis itself did not conclude that both trials were unlikely to reach positive results, merely that the second trial was (and as a result, both trials were terminated, since a positive result from both trials was the requirement for the standard FDA approval path).
Separate from this, Biogen *also* conducted a post-hoc analysis on a subgroup (participants enrolled after the fourth version of the trial protocol) which effectively received a higher dosage of the drug. In that post-hoc analysis, the reduction in the pace of cognitive decline was 30% (95% CI [1%, 60%]) in EMERGE and 27% (95% CI [-3%, 57%]) in ENGAGE. There are good reasons to be skeptical of doing this post-hoc analysis in the first place, but 1) it wasn't the analysis which led Biogen to claim a positive result in EMERGE, and 2) I would argue (to the extent we suspend our skepticism of its rationale) it's a more promising result than you've portrayed it as: the post-hoc analyses from both trials yielded very similar conclusions, both skirting the edge of statistical significance, and both with effect sizes of roughly a 28.5% reduction in the pace of cognitive decline, which I'd consider more than a modest clinical benefit (that would buy roughly 40% more time for someone).
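The step from a 28.5% slower pace to about 40% more time can be checked with a short sketch. It assumes decline is linear, so that a given amount of decline takes 1 / (1 - r) times as long when the pace is reduced by r. That modelling assumption and the function name are mine, not something stated in the trial results.

```python
def extra_time_from_slowing(reduction_in_pace: float) -> float:
    """Fractional extra time to reach the same level of decline, assuming
    linear decline slowed by `reduction_in_pace` (e.g. 0.285 for 28.5%)."""
    if not 0.0 <= reduction_in_pace < 1.0:
        raise ValueError("reduction_in_pace must be in [0, 1)")
    return 1.0 / (1.0 - reduction_in_pace) - 1.0


emerge, engage = 0.30, 0.27      # post-hoc point estimates quoted above
average = (emerge + engage) / 2  # the 28.5% figure

for label, r in (("EMERGE", emerge), ("ENGAGE", engage), ("average", average)):
    print(f"{label}: {r:.1%} slower pace -> {extra_time_from_slowing(r):.0%} more time")
```

The wide confidence intervals mean the same calculation run on the interval bounds gives anything from essentially no extra time to far more than 40%.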