Tuesday, February 5, 2013

The World's Worst Coin Trick?


Ben Goldacre – whose Bad Pharma went on sale today – is fond of using a coin-toss-cheating analogy to describe the problem of "hidden" trials in pharmaceutical clinical research. He uses it in this TED talk:
[Image caption: If it's a coin-toss conspiracy, it's the worst one in the history of conspiracies.]
If I flipped a coin a hundred times, but then withheld the results from you from half of those tosses, I could make it look as if I had a coin that always came up heads. But that wouldn't mean that I had a two-headed coin; that would mean that I was a chancer, and you were an idiot for letting me get away with it. But this is exactly what we blindly tolerate in the whole of evidence-based medicine. 
and in this recent op-ed column in the New York Times:
If I toss a coin, but hide the result every time it comes up tails, it looks as if I always throw heads. You wouldn't tolerate that if we were choosing who should go first in a game of pocket billiards, but in medicine, it’s accepted as the norm. 
I can understand why he likes using this metaphor. It's a striking and concrete illustration of his claim that pharmaceutical companies are suppressing data from clinical trials in an effort to make ineffective drugs appear effective. It also dovetails elegantly, from a rhetorical standpoint, with his frequently repeated claim that "half of all trials go unpublished" (the reader is left to make the connection, but presumably it's all the tail-flip trials, with negative results, that aren't published).
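The mechanics of the trick are easy to demonstrate. Here's a minimal sketch in Python (purely illustrative – the fair coin stands in for trials of an ineffective drug, and nothing here comes from Goldacre's own materials):

import random

random.seed(42)  # fixed seed so the example is reproducible

# Flip a fair coin 100 times...
flips = [random.choice(["heads", "tails"]) for _ in range(100)]

# ...then "publish" only the heads, quietly dropping every tails.
published = [f for f in flips if f == "heads"]

print(f"All 100 flips:  {flips.count('heads')} heads, {flips.count('tails')} tails")
print(f"Published only: {len(published)} heads, 0 tails")

The published record shows nothing but heads, even though the coin is perfectly fair. That's the scam as Goldacre tells it.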

Like many great metaphors, however, this coin-scam metaphor has the distinct weakness of being completely disconnected from reality.

If we can cheat and hide bad results, why do we have so many public failures? Pharmaceutical headlines in the past year were dominated by a series of high-profile clinical trial failures. Even drugs that showed great promise in phase 2 failed in phase 3 and were discontinued. Less than 20% of drugs that enter human testing ever make it to market ... and by some accounts it may be less than 10%. Pfizer had a great run of approvals to end 2012, with 4 new drugs approved by the FDA (including Xalkori, the exciting targeted therapy for lung cancer). And yet during that same period, the company discontinued 8 compounds.

Now, this wasn't always the case. Mandatory public registration of all pharma trials didn't begin in the US until 2005, and mandatory public results reporting came later than that. Before then, companies certainly had more leeway to keep results to themselves, with one important exception: the FDA still had the data. If you ran 4 phase 3 trials on a drug, and only 2 of them were positive, you might be able to publish only those 2, but when it came time to bring the drug to market, the regulators who reviewed your NDA would be looking at the totality of evidence – all 4 trials. And in all likelihood you were going to be rejected.

That was definitely not an ideal situation, but even then it wasn't half as dire as Goldacre's Coin Toss would lead you to believe. Cases of ineffective drugs reaching the US market are extremely rare: if anything, the FDA has historically been criticized for being too risk-averse and preventing drugs with only modest efficacy from being approved.

Things are even better now. There are no hidden trials, the degree of rigor (in terms of randomization, blinding, and analysis) has ratcheted up consistently over the last two decades, lots more safety data gets collected along the way, and phase 4 trials are actually being executed and reported in a timely manner. In fact, it is safe to say that medical research has never been as thorough and rigorous as it is today.

That doesn't mean we can’t get better. We can. But the main reason we can is that we got on the path to getting better 20 years ago, and continue to make improvements.

Buying into Goldacre's analogy requires you to completely ignore a massive flood of public evidence to the contrary. That may work for the average TED audience, but it shouldn't be acceptable at the level of rational public discussion.

Of course, Goldacre knows that negative trials are publicized all the time. His point is about publication bias. However, when he makes his point so broadly as to mislead those who are not directly involved in the R&D process, he has clearly stepped out of the realm of thoughtful and valid criticism.

I got my pre-ordered copy of Bad Pharma this morning, and look forward to reading it. I will post some additional thoughts on the book as I get through it. In the meantime, those looking for more can find a good skeptical review of some of Goldacre's data on the Dianthus Medical blog here and here.

[Image: Bad Pharma's Bad Coin courtesy of Flickr user timparkinson.]

Friday, January 25, 2013

Less than Jaw-Dropping: Half of Sites Are Below Average


Last week, the Tufts Center for the Study of Drug Development unleashed the latest in their occasional series of dire pronouncements about the state of pharmaceutical clinical trials.

One particular factoid from the CSDD "study" caught my attention:
[Image caption: Shocking performance stat: 57% of these racers won't medal!]
* 11% of sites in a given trial typically fail to enroll a single patient, 37% under-enroll, 39% meet their enrollment targets, and 13% exceed their targets.
Many industry reporters uncritically recycled those numbers. Pharmalot noted:
Now, the bad news – 48 percent of the trial sites miss enrollment targets and study timelines often slip, causing extensions that are nearly double the original duration in order to meeting enrollment levels for all therapeutic areas.
(Fierce Biotech and Pharma Times also picked up the same themes and quotes from the Tufts PR.) That 48 percent, incidentally, is just the sum of the first two numbers above: the 11% of sites that enrolled no one plus the 37% that under-enrolled.

There are two serious problems with the data as reported.

One: no one – neither CSDD nor the journalists who loyally recycle its press releases – seems to remember this CSDD release from less than two years ago. It made the even direr claim that
According to Tufts CSDD, two-thirds of investigative sites fail to meet the patient enrollment requirements for a given clinical trial.
If you believe both Tufts numbers, then it would appear that the number of under-performing sites has dropped by almost 20 percentage points in just 21 months – from 67% in April 2011 to 48% in January 2013. For an industry as hidebound and slow-moving as drug development, this ought to be hailed as a startling and amazing improvement!

Maybe at the end of the day, 48% isn't a great number, but surely this would appear to indicate we're on the right track, right? Why would no one mention this?

Which leads me to problem two: I suspect that no one is connecting the 2 data points because no one is sure what it is we're even supposed to be measuring here.

In a clinical trial, a site's "enrollment target" is not an objectively-defined number. Different sponsors will have different ways of setting targets – in fact, the method for setting targets may vary from team to team within a single pharma company.

The simplest way to set a target is to divide the total number of expected patients by the number of sites. If you have 50 sites and want to enroll 500 patients, then voilà ... everyone's got a "target" of 10 patients! But then as soon as some sites start exceeding their target, others will, by definition, fall short. That's not necessarily a sign of underperformance – in fact, if a trial finishes enrollment dramatically ahead of schedule, there will almost certainly be a large number of "under target" sites.
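A toy simulation makes the point concrete (Python, with all numbers invented for illustration): give each of 50 sites an identical 10-patient target, let actual enrollment vary from site to site around that same average, and count the "failures" even when the trial as a whole lands right around its 500-patient goal.

import random

random.seed(1)  # fixed seed so the example is reproducible

n_sites, target = 50, 10  # 500 patients total, split evenly

# Each site enrolls around the 10-patient average, with ordinary
# site-to-site variation (rough normal noise, floored at zero).
enrolled = [max(0, round(random.gauss(target, 4))) for _ in range(n_sites)]

total = sum(enrolled)
under = sum(1 for e in enrolled if e < target)
print(f"Total enrolled: {total} of {n_sites * target}")
print(f"Sites under target: {under} of {n_sites} ({100 * under // n_sites}%)")

Run it a few times: the overall total hovers near the goal, yet something like 40-50% of sites come up "under target" every single time. By that yardstick, a perfectly healthy trial is full of failing sites.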

Some sponsors and CROs get tricky about setting individual targets for each site. How do they set those? The short answer is: pretty arbitrarily. Targets are only partially based upon data from previous, similar (but not identical) trials, but are also shifted up or down by the (real or perceived) commercial urgency of the trial. They can also be influenced by a variety of subjective beliefs about the study protocol and an individual study manager's guesses about how the sites will perform.

If a trial ends with 0% of sites meeting their targets, the next trial in that indication will have a lower, more achievable target. The same will happen in the other direction: too-easy targets will be ratcheted up. The benchmark will jump around quite a bit over time.

As a result, "Percentage of trial sites meeting enrollment target" is, to put it bluntly, completely worthless as an aggregate performance metric. Not only will it change greatly based upon which set of sponsors and studies you happen to look at, but even data from the same sponsors will wobble heavily over time.
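One more throwaway sketch (again, every number here is hypothetical) shows that feedback loop in action: hold actual site performance perfectly constant, let each trial's result ratchet the next trial's target up or down, and watch the headline metric bounce.

import random

random.seed(7)  # fixed seed so the example is reproducible

true_mean = 10.0  # actual average enrollment per site – held constant
target = 12.0     # the first trial's (too-ambitious) per-site target

for trial in range(1, 9):
    sites = [max(0, round(random.gauss(true_mean, 4))) for _ in range(50)]
    pct_met = 100 * sum(s >= target for s in sites) / len(sites)
    print(f"Trial {trial}: target {target:4.1f} -> {pct_met:3.0f}% of sites met it")
    # Ratchet: ease the target after a "bad" trial, stiffen it after an "easy" one.
    target *= 0.9 if pct_met < 50 else 1.1

The sites never get better or worse, but the "percentage of sites meeting target" lurches around with every adjustment. The metric is measuring the target-setting process, not the sites.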

Why does this matter?

There is a consensus that clinical development is much too slow -- we need to be striving to shorten clinical trial timelines and get drugs to market sooner. If we are going to make any headway in this effort, we need to accurately assess the forces that help or hinder the pace of development, and we absolutely must rigorously benchmark and test our work. The adoption of, and attention paid to, unhelpful metrics will only confuse and delay our effort to improve the quality and speed of drug development.

[Photo of "underperforming" swimmers courtesy Boston Public Library on Flickr.]

Tuesday, January 15, 2013

Holding Your Breath Also Might Work

Here's a fitting postscript to yesterday's article about wishful-thinking-based enrollment strategies: we received a note from a research site this morning. The site had opted out of my company's comprehensive recruitment campaign, telling the sponsor they preferred to recruit patients their own way.

Here's the latest update from the coordinator:
I've found one person and have called a couple of times, but no return calls.  I will be sending this potential patient a letter this week.  I'm keeping my fingers crossed in finding someone soon!
They don't want to participate in a broad internet/broadcast/advocacy group program, but it's OK -- they have their fingers crossed!