
Wednesday, February 22, 2017

Establishing efficacy - without humans?

The decade following passage of FDAAA has been one of easing standards for drug approvals in the US, most notably with the advent of “breakthrough” designation created by FDASIA in 2012 and the 21st Century Cures Act in 2016.

Although, as of this writing, there is no nominee for FDA Commissioner, it appears to be safe to say that the current administration intends to accelerate the pace of deregulation, mostly through further lowering of approval requirements. In fact, some of the leading contenders for the position are on record as supporting a return to pre-Kefauver-Harris days, when drug efficacy was not even considered for approval.
[Image caption: Build a better mouse model, and pharma will beat a path to your door – no laws needed.]

In this context, it is at least refreshing to read a proposal to increase efficacy standards. This comes from two bioethicists at McGill University, who make the somewhat-startling case for a higher degree of efficacy evaluation before a drug begins any testing in humans.
We contend that a lack of emphasis on evidence for the efficacy of drug candidates is all too common in decisions about whether an experimental medicine can be tested in humans. We call for infrastructure, resources and better methods to rigorously evaluate the clinical promise of new interventions before testing them on humans for the first time.
The authors propose some sort of centralized clearinghouse to evaluate efficacy more rigorously. It is unclear what they envision this new multispecialty review body's standards for green-lighting a drug to enter human testing would be. Instead, they propose three questions:
  • What is the likelihood that the drug will prove clinically useful?
  • Assume the drug works in humans. What is the likelihood of observing the preclinical results?
  • Assume the drug does not work in humans. What is the likelihood of observing the preclinical results?
These seem like reasonable questions, I suppose – and are likely questions that are already being asked of preclinical data. They certainly do not rise to the level of providing a clear standard for regulatory approval, though perhaps it’s a reasonable place to start.
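Taken together, the second and third questions define a likelihood ratio: how much more probable the preclinical results are under the hypothesis that the drug works than under the hypothesis that it doesn't. As a toy illustration (all numbers invented for the example), Bayes' rule shows how a review body could combine that ratio with a prior estimate of clinical promise:

```python
def posterior_prob_effective(prior, p_data_if_works, p_data_if_not):
    """Update the prior probability that a drug works in humans,
    given the observed preclinical results (Bayes' rule)."""
    numerator = prior * p_data_if_works
    denominator = numerator + (1 - prior) * p_data_if_not
    return numerator / denominator

# Hypothetical numbers: a 10% prior that the drug is clinically useful,
# and preclinical results twice as likely if the drug truly works
# (likelihood ratio = 0.6 / 0.3 = 2).
post = posterior_prob_effective(prior=0.10, p_data_if_works=0.6, p_data_if_not=0.3)
print(round(post, 3))  # 0.182
```

Even a likelihood ratio of 2 barely moves a 10% prior, which hints at why the authors' questions, while reasonable, fall short of a usable approval standard without agreed-upon thresholds.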

The most obvious counterargument here is one that the authors curiously don’t pick up on at all: if we had the ability to accurately (or even semiaccurately) predict efficacy preclinically, pharma sponsors would already be doing it. The comment notes: “More-thorough assessments of clinical potential before trials begin could lower failure rates and drug-development costs.” And it’s hard not to agree: every pharmaceutical company would love to have even an incrementally-better sense of whether their early pipeline drugs will be shown to work as hoped.

The authors note
Commercial interests cannot be trusted to ensure that human trials are launched only when the case for clinical potential is robust. We believe that many FIH studies are launched on the basis of flimsy, underscrutinized evidence.
However, they do not produce any evidence that industry is in any way deliberately underperforming their preclinical work, merely that preclinical efficacy is often difficult to reproduce and is poorly correlated with drug performance in humans.

Pharmaceutical companies have many times more candidate compounds than they can possibly afford to put into clinical trials. Figuring out how to lower failure rates – or at least the total cost of failure - is a prominent industry obsession, and efficacy remains the largest source of late-stage trial failure. This quest to “fail faster” has resulted in larger and more expensive phase 2 trials, and even to increased efficacy testing in some phase 1 trials. And we do this not because of regulatory pressure, but because of hopes that these efforts will save overall costs. So it seems beyond probable that companies would immediately invest more in preclinical efficacy testing, if such testing could be shown to have any real predictive power. But generally speaking, it does not.

As a general rule, we don’t need regulations that are firmly aligned with market incentives, we need regulations if and when we think those incentives might run counter to the general good. In this case, there are already incredibly strong market incentives to improve preclinical assessments. Where companies have attempted to do something with limited success, it would seem quixotic to think that regulatory fiat will accomplish more.

(One further point. The authors try to link the need for preclinical efficacy testing to the 2016 Bial tragedy. This seems incredibly tenuous: the authors speculate that perhaps trial participants would not have been harmed and killed if Bial had been required to produce more evidence of BIA 10-2474's clinical efficacy before embarking on their phase 1 trials. But that would have been entirely coincidental in this case: even if the drug had had more evidence of therapeutic promise, the tragedy still would have happened, because it had nothing at all to do with the drug's efficacy.

This is to some extent a minor nitpick, since the argument in favor of earlier efficacy testing does not depend on a link to Bial. However, I bring it up because a) the authors dedicate the first four paragraphs of their comment to the link, and b) there appears to be a minor trend of using the death and injuries of that trial to justify an array of otherwise-unrelated initiatives. This seems like a trend we should discourage.)

[Update 2/23: I posted this last night, not realizing that only a few hours earlier, John LaMattina had published on this same article. His take is similar to mine, in that he is suspicious of the idea that pharmaceutical companies would knowingly push ineffective drugs up their pipeline.]

Kimmelman, J., & Federico, C. (2017). Consider drug efficacy before first-in-human trials. Nature, 542(7639), 25-27. DOI: 10.1038/542025a

Monday, November 21, 2016

The first paid research subject in written history?

On this date 349 years ago, Samuel Pepys relates in his famous diary a remarkable story about an upcoming medical experiment. As far as I can tell, this is the first written description of a paid research subject.

According to his account, the man (whom he describes as “a little frantic”) was to be paid to undergo a blood transfusion from a sheep. It was hypothesized that the blood of this calm and docile animal would help to calm the man.

Some interesting things to note about this experiment:
  • Equipoise. There is explicit disagreement about what effect the experimental treatment will have: according to Pepys, "some think it may have a good effect upon him as a frantic man by cooling his blood, others that it will not have any effect at all".
  • Results published. An account of the experiment was published just two weeks later in the journal Philosophical Transactions.
  • Medical Privacy. In this subsequent write-up, the research subject is identified as Arthur Coga, a former Cambridge divinity student. According to at least one account, being publicly identified had a bad effect on Coga, as people who had heard of him allegedly succeeded in getting him to spend his stipend on drink (though no sources are provided to confirm this story).
  • Patient Reported Outcome. Coga was apparently chosen because, although mentally ill, he was still considered educated enough to give an accurate description of the treatment effect. 
Depending on your perspective, this may also be a very early account of the placebo effect, or a classic case of ignoring the patient’s experience. Because even though his report was positive, the clinicians remained skeptical. From the journal article:
The Man after this operation, as well as in it, found himself very well, and hath given in his own Narrative under his own hand, enlarging more upon the benefit, he thinks, he hath received by it, than we think fit to own as yet.
…and in fact, a subsequent diary entry from Pepys mentions meeting Coga, with similarly mixed impressions: “he finds himself much better since, and as a new man, but he is cracked a little in his head”.

The amount Coga was paid for his participation? Twenty shillings – at the time, that was exactly one Guinea.

[Image credit: Wellcome Images]

Thursday, December 19, 2013

Patient Recruitment: Taking the Low Road

The Wall Street Journal has an interesting article on the use of “Big Data” to identify and solicit potential clinical trial participants. The premise is that large consumer data aggregators like Experian can target patients with certain diseases through correlations with non-health behavior. Examples given include “a preference for jazz” being associated with arthritis and “shopping online for clothes” being an indicator of obesity.
We've seen this story before.

In this way, allegedly, clinical trial patient recruitment companies can more narrowly target their solicitations* for patients to enroll in clinical trials.

In the spirit of full disclosure, I should mention that I was interviewed by the reporter of this article, although I am not quoted. My comments generally ran along three lines, none of which really fit in with the main storyline of the article:

  1. I am highly skeptical that these analyses are actually effective at locating patients
  2. These methods aren't really new – they’re the same tactics that direct marketers have been using for years
  3. Most importantly, the clinical trials community can – and should – be moving towards open and collaborative patient engagement. Relying on tactics like consumer data snooping and telemarketing is an enormous step backwards.

The first point is this: certainly some diseases have correlates in the real world, but these correlates tend to be pretty weak, and are therefore unreliable predictors of disease. Maybe it’s true that those struggling with obesity tend to buy more clothes online (I don’t know if it’s true or not – honestly it sounds a bit more like an association built on easy stereotypes than on hard data). But many obese people will not shop online (they will want to be sure the clothes actually fit), and vast numbers of people with low or average BMIs will shop for clothes online.  So the consumer data will tend to have very low predictive value. The claims that liking jazz and owning cats are predictive of having arthritis are even more tenuous. These correlates are going to be several times weaker than basic demographic information like age and gender. And for more complex conditions, these associations fall apart.

Marketers claim to solve this by factoring a complex web of associations through a magical black box – the WSJ article mentions that they “applied a computed algorithm” to flag patients. Having seen behind the curtain on a few of these magic algorithms, I can confidently say that they are underwhelming in their sophistication. Hand-wavy references to Big Data and Algorithms are just the tools used to impress pharma clients. (The downside to that, of course, is that you can’t help but come across as big brotherish – see this coverage from Forbes for a taste of what happens when people accept these claims uncritically.)

But the effectiveness of these data slice-n-dicing activities is perhaps beside the point. They are really just a thin cover for old-fashioned boiler room tactics: direct mail and telemarketing. When I got my first introduction to direct marketing in the 90’s, it was the exact same program – get lead lists from big companies like Experian, then aggressively mail and call until you get a response.

The limited effectiveness and old-school aggressiveness of these programs is nicely illustrated in the article by one person’s experience:
Larna Godsey, of Wichita, Kan., says she received a dozen phone calls about a diabetes drug study over the past year from a company that didn't identify itself. Ms. Godsey, 63, doesn't suffer from the disease, but she has researched it on the Internet and donated to diabetes-related causes. "I don't know if it's just a coincidence or if they're somehow getting my information," says Ms. Godsey, who filed a complaint with the FTC this year.
The article notes that one recruitment company, Acurian, has been the subject of over 500 FTC complaints regarding its tactics. It’s clear that Big Data is just the latest buzzword lipstick on the telemarketing pig. And that’s the real shame of it.

We have arrived at an unprecedented opportunity for patients, researchers, and private industry to come together and discuss, as equals, research priorities and goals. Online patient communities like Inspire and PatientsLikeMe have created new mechanisms to share clinical trial opportunities and even create new studies. Dedicated disease advocates have jumped right into the world of clinical research, with groups like the Cystic Fibrosis Foundation and Michael J. Fox Foundation no longer content with raising research funds, but actively leading the design and operations of new studies.

Some – not yet enough – pharmaceutical companies have embraced the opportunity to work more openly and honestly with patient groups. The scandal of stories like this is not the Wizard of Oz histrionics of secret computer algorithms, but that we as an industry continue to take the low road and resort to questionable boiler room tactics.

It’s past time for the entire patient recruitment industry to drop the sleaze and move into the 21st century. I would hope that patient groups and researchers will come together as well to vigorously oppose these kinds of tactics when they encounter them.

(*According to the article, Acurian "has said that calls related to medical studies aren't advertisements as defined by law," so we can agree to call them "solicitations".)

Thursday, May 30, 2013

Clinical Trial Enrollment, ASCO 2013 Edition

Even by the already-painfully-embarrassingly-low standards of clinical trial enrollment in general, patient enrollment in cancer clinical trials is slow. Horribly slow. In many cancer trials, a site randomizing one patient every three or four months isn't doing badly at all – in fact, it's par for the course. The most commonly-cited number is that only 3% of cancer patients participate in a trial – and although exact details of how that number is measured are remarkably difficult to pin down, it certainly can't be too far from reality.

Ultimately, the cost of slow enrollment is borne almost entirely by patients; their payment takes the form of fewer new therapies and less evidence to support their treatment decisions.

So when a couple dozen thousand of the world's top oncologists fly into Chicago to meet, you'd figure that improving accrual would be high on everyone’s agenda. You can't run your trial without patients, after all.

But every year, the annual ASCO meeting underdelivers in new ideas for getting more patients into trials. I suppose this is a consequence of ASCO's members-only focus: getting the oncologists themselves to address patient accrual is a bit like asking NASCAR drivers to tackle the problems of aerodynamics, engine design, and fuel chemistry.

Nonetheless, every year, a few brave souls do try. Here is a quick rundown of accrual-related abstracts at this year’s meeting, conveniently sorted into 3 logical categories:

1. As Lord Kelvin may or may not have said, “If you cannot measure it, you cannot improve it.”

Probably the most sensible of this year's crop, because rather than trying to make something out of nothing, the authors measure exactly how pervasive the nothing is. Specifically, they attempt to obtain fairly basic patient accrual data for the last three years' worth of clinical trials in kidney cancer. Out of 108 trials identified, they managed to get – via search and direct inquiries with the trial sponsors – basic accrual data for only 43 (40%).

That certainly qualifies as “terrible”, though the authors content themselves with “poor”.

Interestingly, exactly zero of the 32 industry-sponsored trials responded to the authors' initial survey. This fits with my impression that pharma companies continue to think of accrual data as proprietary, though what sort of business advantage it gives them is unclear. Any one company will have only run a small fraction of these studies, greatly limiting their ability to draw anything resembling a valid conclusion.

CALGB investigators look at 110 trials over the past 10 years to see if they can identify any predictive markers of successful enrollment. Unfortunately, the trials themselves are pretty heterogeneous (accrual periods ranged from 6 months to 8.8 years), so finding a consistent marker for successful trials would seem unlikely.

And, in fact, none of the usual suspects (e.g., startup time, disease prevalence) appears to have been significant. The exception was provision of medication by the study, which was positively associated with successful enrollment.

The major limitation of this study, apart from the variability of the trials measured, is its definition of “successful”, which is based simply on whether the trial reached its total planned enrollment. Under this definition, a slow-enrolling trial that drags on for years before finally reaching its goal counts as successful, whereas if that same trial had been stopped early it counts as unsuccessful. While that sometimes may be the case, it's easy to imagine situations where allowing a slow trial to drag on is a painful waste of resources – especially if results are delayed enough to bring their relevance into question.

Even worse, though, is that a trial’s enrollment goal is itself a prediction. The trial steering committee determines how many sites, and what resources, will be needed to hit the number needed for analysis. So in the end, this study is attempting to identify predictors of successful predictions, and there is no reason to believe that the initial enrollment predictions were made with any consistent methodology.

2. If you don't know, maybe ask somebody?

With these two abstracts we celebrate and continue the time-honored tradition of alchemy, whereby we transmute base opinion into golden data. The magic number appears to be 100: if you've got 3 digits' worth of doctors telling you how they feel, that must be worth something.

In the first abstract, a working group is formed to identify and vote on the major barriers to accrual in oncology trials. Then – and this is where the magic happens – that same group is asked to identify and vote on possible ways to overcome those barriers.

In the second, a diverse assortment of community oncologists were given an online survey to provide feedback on the design of a phase 3 trial in light of recent new data. The abstract doesn't specify who was initially sent the survey, so we cannot tell response rate, or compare survey responders to the general population (I'll take a wild guess and go with “massive response bias”).

Market research is sometimes useful. But what cancer clinical trials do not need right now are more surveys and working groups. The “strategies” listed in the first abstract are part of the same cluster of ideas that have been on the table for years now, with no appreciable increase in trial accrual.

3. The obligatory “What the What?” abstract

The force with which my head hit my desk after reading this abstract made me concerned that it had left permanent scarring.

If this had been re-titled “Poor Measurement of Accrual Factors Leads to Inaccurate Accrual Reporting”, would it still have been accepted for this year’s meeting? That's certainly a more accurate title.

Let’s review: a trial intends to enroll both white and minority patients. Whites enroll much faster, leading to a period where only minority patients are recruited. Then, according to the authors, “an almost 4-fold increase in minority accrual raises question of accrual disparity.” So, sites will only recruit minority patients when they have no choice?

But wait: the number of sites wasn't the same during the two periods, and start-up times were staggered. Adjusting for actual site time, the average minority accrual rate was 0.60 patients/site/month in the first part and 0.56 in the second. So the apparent 4-fold increase was entirely an artifact of bad math.
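The correction the abstract's authors missed is just a rate normalization: divide patients enrolled by the site-months actually available, rather than comparing raw counts across periods of different length and size. A minimal sketch (patient and site-month counts are invented here, chosen only to reproduce the 0.60 and 0.56 figures above):

```python
def accrual_rate(patients, site_months):
    """Patients enrolled per site per month, adjusted for how long
    each site was actually open (total site-months of exposure)."""
    return patients / site_months

# Hypothetical illustration: period 2 enrolled nearly 4x as many minority
# patients in total, but also had roughly 4x the site-time available.
period1 = accrual_rate(patients=12, site_months=20)  # 0.6 patients/site/month
period2 = accrual_rate(patients=45, site_months=80)  # 0.5625 patients/site/month
print(period1, round(period2, 2))  # 0.6 0.56
```

Once site-time is in the denominator, the "4-fold increase" disappears entirely.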

This would be horribly embarrassing were it not for the fact that bad math seems to be endemic in clinical trial enrollment. Failing to adjust for start-up time and number of sites is so routine that simply doing the adjustment correctly is, apparently, grounds for a presentation.

The bottom line

What we need now is to rigorously (and prospectively) compare and measure accrual interventions. We have lots of candidate ideas, and there is no need for more retrospective studies, working groups, or opinion polls to speculate on which ones will work best.  Where possible, accrual interventions should themselves be randomized to minimize confounding variables which prevent accurate assessment. Data needs to be uniformly and completely collected. In other words, the standards that we already use for clinical trials need to be applied to the enrollment measures we use to engage patients to participate in those trials.

This is not an optional consideration. It is an ethical obligation we have to cancer patients: we need to assure that we are doing all we can to maximize the rate at which we generate new evidence and test new therapies.

[Image credit: Logarithmic turtle accrual rates courtesy of Flickr user joleson.]

Wednesday, May 15, 2013

Placebos: Banned in Helsinki?

One of the unintended consequences of my (admittedly, somewhat impulsive) decision to name this blog is that I get a fair bit of traffic from Google: people searching for placebo-related information.

Some recent searches have been about the proposed new revisions to the Declaration of Helsinki, and how the new draft version will prohibit or restrict the use of placebo controls in clinical trials. This was a bit puzzling, given that the publicly-released draft revisions [PDF] didn't appear to substantially change the DoH's placebo section.

Much of the confusion appears to be caused by a couple sources. First, the popular Pharmalot blog (whose approach to critical analysis I've noted before as being ... well ... occasionally unenthusiastic) covered it thus:
The draft, which was released earlier this week, is designed to update a version that was adopted in 2008 and many of the changes focus on the use of placebos. For instance, placebos are only permitted when no proven intervention exists; patients will not be subject to any risk or there must be ‘compelling and sound methodological reasons’ for using a placebo or less effective treatment.
This isn't a good summary of the changes, since the “for instance” items are for the most part slight re-wordings from the 2008 version, which itself didn't change much from the version adopted in 2000.

To see what I mean, take a look at the change-tracked version of the placebo section:
The benefits, risks, burdens and effectiveness of a new intervention must be tested against those of the best current proven intervention(s), except in the following circumstances: 
The use of placebo, or no treatment intervention is acceptable in studies where no current proven intervention exists; or 
Where for compelling and scientifically sound methodological reasons the use of any intervention less effective than the best proven one, placebo or no treatment is necessary to determine the efficacy or safety of an intervention 
and the patients who receive any intervention less effective than the best proven one, placebo or no treatment will not be subject to any additional risks of serious or irreversible harm as a result of not receiving the best proven intervention 
Extreme care must be taken to avoid abuse of this option.
Really, there is only one significant change to this section: the strengthening of the existing reference to “best proven intervention” in the first sentence. It was already there, but has now been added to sentences 3 and 4. This is a reference to the use of active (non-placebo) comparators that are not the “best proven” intervention.

So, ironically, the biggest change to the placebo section is not about placebos at all.

This is a bit unfortunate, because to me it subtracts from the overall clarity of the section, since it's no longer exclusively about placebo despite still being titled “Use of Placebo”. The DoH has been consistently criticized during previous rounds of revision for becoming progressively less organized and coherently structured, and it certainly reads like a rambling list of semi-related thoughts – a classic “document by committee”. This lack of structure and clarity certainly hurts the DoH's effectiveness in shaping the world's approach to ethical clinical research.

Even worse, the revisions continue to leave unresolved the very real divisions that exist in ethical beliefs about placebo use in trials. The really dramatic revision to the placebo section happened over a decade ago, with the 2000 revision. Those changes, which introduced much of the strict wording in the current version, were extremely controversial, and resulted in the issuance of an extraordinary “Note of Clarification” that effectively softened the new and inflexible language. The 2008 version absorbed the wording from the Note of Clarification, and the resulting document is now vague enough that it is interpreted quite differently in different countries. (For more on the revision history and controversy, see this comprehensive review.)

The 2013 revision could have been an opportunity to try again to build a consensus around placebo use. At the very least, it could have acknowledged and clarified the division of beliefs on the topic. Instead, it sticks to its ambiguous phrasing which will continue to support multiple conflicting interpretations. This does not serve the ends of assuring the ethical conduct of clinical trials.

Ezekiel Emanuel has been a long-time critic of the DoH's lack of clarity and structure. Earlier this month, he published a compact but forceful review of the ways in which the Declaration has become weakened by its long series of revisions:
Over the years problems with, and objections to, the document have accumulated. I propose that there are nine distinct problems with the current version of the Declaration of Helsinki: it has an incoherent structure; it confuses medical care and research; it addresses the wrong audience; it makes extraneous ethical provisions; it includes contradictions; it contains unnecessary repetitions; it uses multiple and poor phrasings; it includes excessive details; and it makes unjustified, unethical recommendations.
Importantly, Emanuel also includes a proposed revision and restructuring of the DoH. In his version, much of the current wording around placebo use is retained, but it is absorbed into the larger concept of “Scientific Validity”, which adds important context to the decision about how to decide on a comparator arm in general.

Here is Emanuel’s suggested revision:
Scientific Validity:  Research in biomedical and other sciences involving human participants must conform to generally accepted scientific principles, be based on a thorough knowledge of the scientific literature, other relevant sources of information, and suitable laboratory, and as necessary, animal experimentation.  Research must be conducted in a manner that will produce reliable and valid data.  To produce meaningful and valid data new interventions should be tested against the best current proven intervention. Sometimes it will be appropriate to test new interventions against placebo, or no treatment, when there is no current proven intervention or, where for compelling and scientifically sound methodological reasons the use of placebo is necessary to determine the efficacy and/or safety of an intervention and the patients who receive placebo, or no treatment, will not be subject to excessive risk or serious irreversible harm.  This option should not be abused.
Here, the scientific rationale for the use of placebo is placed in the greater context of selecting a control arm, which is itself subservient to the ethical imperative to only conduct studies that are scientifically valid. One can quibble with the wording (I still have issues with the use of “best proven” interventions, which I think is much too undefined here, as it is in the DoH, and glosses over some significant problems), but structurally this is a lot stronger, and provides firmer grounding for ethical decision making.

Emanuel, E. (2013). Reconsidering the Declaration of Helsinki. The Lancet, 381(9877), 1532-1533. DOI: 10.1016/S0140-6736(13)60970-8

[Image: Extra-strength chill pill, modified by the author, based on an original image by Flickr user mirjoran.]

Tuesday, February 5, 2013

The World's Worst Coin Trick?

Ben Goldacre – whose Bad Pharma went on sale today – is fond of using a coin-toss-cheating analogy to describe the problem of "hidden" trials in pharmaceutical clinical research. He uses it in this TED talk:
[Image caption: If it's a coin-toss conspiracy, it's the worst one in the history of conspiracies.]
If I flipped a coin a hundred times, but then withheld the results from you from half of those tosses, I could make it look as if I had a coin that always came up heads. But that wouldn't mean that I had a two-headed coin; that would mean that I was a chancer, and you were an idiot for letting me get away with it. But this is exactly what we blindly tolerate in the whole of evidence-based medicine. 
and in this recent op-ed column in the New York Times:
If I toss a coin, but hide the result every time it comes up tails, it looks as if I always throw heads. You wouldn't tolerate that if we were choosing who should go first in a game of pocket billiards, but in medicine, it’s accepted as the norm. 
I can understand why he likes using this metaphor. It's a striking and concrete illustration of his claim that pharmaceutical companies are suppressing data from clinical trials in an effort to make ineffective drugs appear effective. It also dovetails elegantly, from a rhetorical standpoint, with his frequently-repeated claim that "half of all trials go unpublished" (the reader is left to make the connection, but presumably it's all the tail-flip trials, with negative results, that aren't published).
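The metaphor itself is easy to make concrete. A short simulation of the coin-scam (fair coin, every tails result suppressed before "publication"):

```python
import random

def biased_report(n_tosses, seed=42):
    """Toss a fair coin n times, but 'publish' only the heads --
    Goldacre's metaphor for selective reporting of positive trials."""
    rng = random.Random(seed)
    tosses = [rng.random() < 0.5 for _ in range(n_tosses)]
    published = [t for t in tosses if t]  # suppress every tails result
    true_rate = sum(tosses) / n_tosses
    reported_rate = sum(published) / len(published)
    return true_rate, reported_rate

true_rate, reported_rate = biased_report(100)
print(true_rate)      # somewhere close to 0.5
print(reported_rate)  # exactly 1.0 -- the coin "always" comes up heads
```

By construction the published record shows a 100% heads rate no matter what the coin actually did, which is precisely the rhetorical force of the analogy.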

Like many great metaphors, however, this coin-scam metaphor has the distinct weakness of being completely disconnected from reality.

If we can cheat and hide bad results, why do we have so many public failures? Pharmaceutical headlines in the past year were dominated by a series of high-profile clinical trial failures. Even drugs that showed great promise in phase 2 failed in phase 3 and were discontinued. Less than 20% of drugs that enter human testing ever make it to market ... and by some accounts it may be less than 10%. Pfizer had a great run of approvals to end 2012, with 4 new drugs approved by the FDA (including Xalkori, the exciting targeted therapy for lung cancer). And yet during that same period, the company discontinued 8 compounds.

Now, this wasn't always the case. Mandatory public registration of all pharma trials didn't begin in the US until 2005, and mandatory public results reporting came later than that. Before then, companies certainly had more leeway to keep results to themselves, with one important exception: the FDA still had the data. If you ran 4 phase 3 trials on a drug, and only 2 of them were positive, you might be able to only publish those 2, but when it came time to bring the drug to market, the regulators who reviewed your NDA report would be looking at the totality of evidence – all 4 trials. And in all likelihood you were going to be rejected.

That was definitely not an ideal situation, but even then it wasn't half as dire as Goldacre's Coin Toss would lead you to believe. The cases of ineffective drugs reaching the US market are extremely rare: if anything, FDA has historically been criticized for being too risk-averse and preventing drugs with only modest efficacy from being approved.

Things are even better now. There are no hidden trials, the degree of rigor (in terms of randomization, blinding, and analysis) has ratcheted up consistently over the last two decades, lots more safety data gets collected along the way, and phase 4 trials are actually being executed and reported in a timely manner. In fact, it is safe to say that medical research has never been as thorough and rigorous as it is today.

That doesn't mean we can’t get better. We can. But the main reason we can is that we got on the path to getting better 20 years ago, and continue to make improvements.

Buying into Goldacre's analogy requires you to completely ignore a massive flood of public evidence to the contrary. That may work for the average TED audience, but it shouldn't be acceptable at the level of rational public discussion.

Of course, Goldacre knows that negative trials are publicized all the time. His point is about publication bias. However, when he makes his point so broadly as to mislead those who are not directly involved in the R&D process, he has clearly stepped out of the realm of thoughtful and valid criticism.

I got my pre-ordered copy of Bad Pharma this morning, and look forward to reading it. I will post some additional thoughts on the book as I get through it. In the meantime, those looking for more can find a good skeptical review of some of Goldacre's data on the Dianthus Medical blog here and here.

[Image: Bad Pharma's Bad Coin courtesy of Flickr user timparkinson.]

Tuesday, July 31, 2012

Clouding the Debate on Clinical Trials: Pediatric Edition

I would like to propose a rule for clinical trial benchmarks. This rule may appear so blindingly obvious that I run the risk of seeming simple-minded and naïve for even bringing it up.

The rule is this: if you’re going to introduce a benchmark for clinical trial design or conduct, explain its value.

That’s it.  Just a paragraph explaining the rationale of why you’ve chosen to measure what you’re measuring.  Extra credit if you compare it to other benchmarks you could have used, or consider the limitations of your new metric.

I would feel bad for bringing this up, were it not for two recent articles in major publications that completely fail to live up to this standard. I’ll cover one today and one tomorrow.

The first is a recent article in Pediatrics, Pediatric Versus Adult Drug Trials for Conditions With High Pediatric Disease Burden, which has received a fair bit of attention in the industry -- mostly due to Reuters uncritically recycling the authors’ press release.

It’s worth noting that the claim made in the release title, "Drug safety and efficacy in children is rarely addressed in drug trials for major diseases", is not at all supported by any data in the study itself. However, I suppose I can live with misleading PR.  What is frustrating is the inadequacy of the measures the authors use in the actual study, and the complete lack of discussion about them.

To benchmark where pediatric drug research should be, they use the proportion of total "burden of disease" borne by children.   Using WHO estimates, they look at the ratio of burden (measured, essentially, in years of total disability) between children and adults.  This burden is further divided into high-income countries and low/middle-income countries.

This has some surface plausibility, but presents a host of issues.  Simply looking at the relative prevalence of a condition does not really give us any insights into what we need to study about treatment.  For example: number 2 on the list for middle/low income diseases is diarrheal illness, where WHO lists the burden of disease as 90% pediatric.  There is no question that diarrheal diseases take a terrible toll on children in developing countries.  We absolutely need to focus resources on improving prevention and treatment: what we do not particularly need is more clinical trials.  As the very first bullet on the WHO fact sheet points out, diarrheal diseases are preventable and treatable.  Prevention is mostly about improving the quality of water and food supplies – this is vitally important stuff, but it has nothing to do with pharmaceutical R&D.

In the US, the NIH’s National Institute of Child Health and Human Development (NICHD) has a rigorous process for identifying and prioritizing needs for pediatric drug development, as mandated by the BPCA.  It is worth noting that only 2 of the top 5 diseases in the Pediatrics article make the cut among the 41 highest-priority areas in the NICHD’s list for 2011.

(I don’t think the numbers as calculated by the authors are even convincing on their own terms:  3 of the 5 "high burden" diseases in wealthy countries – bipolar disorder, depression, and schizophrenia – are extremely rare in very young children, and only make this list because of their increasing incidence in adolescence.  If our objective is to focus on how these drugs may work differently in developing children, then why wouldn’t we put greater emphasis on the youngest cohorts?)

Of course, just because a new benchmark is at odds with other benchmarks doesn’t necessarily mean that it’s wrong.  But it does mean that the benchmark requires some rigorous vetting before it’s used.  The authors make no attempt at explaining why we should use their metric, except to say it’s "apt". The only support provided is a pair of footnotes – one of those, ironically, is to this article from 1999 that contains a direct warning against their approach:
Our data demonstrate how policy makers could be misled by using a single measure of the burden of disease, because the ranking of diseases according to their burden varies with the different measures used.
If we’re going to make any progress in solving the problems in drug development – and I think we have a number of problems that need solving – we have got to start raising our standards for our own metrics.

Are we not putting enough resources into pediatric research, or have we over-incentivized risky experimentation on a vulnerable population? This is a critically important question in desperate need of more data and thoughtful analysis. Unfortunately, this study adds more noise than insight to the debate.

Tomorrow, I’ll cover the allegations about too many trials being too small. [Update: "tomorrow" took a little longer than expected. Follow-up post is here.]

[Note: the Pediatrics article also uses another metric, "Percentage of Trials that Are Pediatric", as a proxy for the amount of research effort being done.  For space reasons, I’m not going to go into that one, but it’s every bit as unhelpful as the pediatric burden metric.]

Bourgeois FT, Murthy S, Pinto C, Olson KL, Ioannidis JP, & Mandl KD (2012). Pediatric Versus Adult Drug Trials for Conditions With High Pediatric Disease Burden. Pediatrics. PMID: 22826574

Tuesday, July 24, 2012

How Not to Report Clinical Trial Data: a Clear Example

I know it’s not even August yet, but I think we can close the nominations for "Worst Trial Metric of the Year".  The hands-down winner is Pharmalot, for the thoughtless publication of this article reviewing "Deaths During Clinical Trials" per year in India.  We’ll call it the Pharmalot Death Count, or PDC, and it’s easy to explain – it's just the total number of patients who died while enrolled in any clinical trial, regardless of cause, reported as though it were an actual meaningful number.

(To make this even more execrable, Pharmalot actually calls this "Deaths attributed to clinical trials" in his opening sentence, although the actual data has exactly nothing to do with the attribution of the death.)

In fairness, Pharmalot is really only sharing the honors with a group of sensationalistic journalists in India who have jumped on these numbers.  But it has a much wider readership within the research community, and could have at least attempted to critically assess the data before repeating it (along with criticism from "experts").

The number of things wrong with this metric is a bit overwhelming.  I’m not even sure where to start.  Some of the obvious issues here:

1. No separation of trial-related versus non-trial-related.  Some effort is made to explain that there may be difficulty in determining whether a particular death was related to the study drug or not.  However, that obscures the fact that the PDC lumps together all deaths, whether they took an experimental medication or not. That means the PDC includes:
  • Patients in control arms receiving standard of care and/or placebo, who died during the course of their trial.
  • Patients whose deaths were entirely unrelated to their illness (eg, automobile accident victims)
2. No base rates.  When a raw death total is presented, a number of obvious questions should come to mind:  how many patients were in the trials?  How many deaths were there among patients with similar diseases who were not in trials?  The PDC doesn’t care about that kind of context.

3. No sensitivity to trial design.  Many late-stage cancer clinical trials use Overall Survival (OS) as their primary endpoint – patients are literally in the trial until they die.  This isn’t considered unethical; it’s considered the gold standard of evidence in oncology.  If we ran shorter, less thorough trials, we could greatly reduce the PDC – would that be good for anyone?
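To see why base rates matter, consider a toy calculation. Every number below is hypothetical, invented purely for illustration (none of them come from the Pharmalot article):

```python
# Hypothetical inputs, for illustration only.
trial_deaths = 438        # raw all-cause deaths reported across all trials
trial_patients = 150_000  # assumed total number of patients enrolled
background_rate = 0.004   # assumed annual death rate in a comparable non-trial population

trial_rate = trial_deaths / trial_patients
expected_deaths = background_rate * trial_patients

print(f"Observed deaths in trials: {trial_deaths}")
print(f"Observed death rate: {trial_rate:.2%}")
print(f"Deaths expected at the background rate: {expected_deaths:.0f}")
```

With these (invented) inputs, the trials saw fewer deaths than the background rate would predict. The point is not the specific numbers – it's that a raw count like "438 deaths" tells you nothing at all until it is set against an expected number.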

Case Study: Zelboraf
FDA: "Highly effective, more personalized therapy"
PDC: "199 deaths attributed to Zelboraf trial!"
There is a fair body of evidence that participants in clinical trials fare about the same as (or possibly a bit better than) similar patients receiving standard of care therapy.  However, much of that evidence was accumulated in western countries: it is a fair question to ask if patients in India and other countries receive a similar benefit.  The PDC, however, adds nothing to our ability to answer that question.

So, for publicizing a metric that has zero utility, and using it to cast aspersions on the ethics of researchers, we congratulate Pharmalot and the PDC.

Friday, July 6, 2012

A placebo control is not a placebo effect

Following up on yesterday's post regarding a study of placebo-related information, it seems worthwhile to pause and expand on the difference between placebo controls and placebo effects.

The very first sentence of the study paper reflects a common, and rather muddled, belief about placebo-controlled trials:
Placebo groups are used in trials to control for placebo effects, i.e. those changes in a person's health status that result from the meaning and hope the person attributes to a procedure or event in a health care setting.
The best I can say about the above sentence is that in some (not all) trials, this accounts for some (not all) of the rationale for including a placebo group in the study design. 

There is no evidence that “meaning and hope” have any impact on HbA1C levels in patients with diabetes. The placebo effect only goes so far, and certainly doesn’t have much sway over most lab tests.  And yet we still conduct placebo-controlled trials in diabetes, and rightly so. 

To clarify, it may be helpful to break this into two parts:
  1. Most trials need a “No Treatment” arm. 
  2. Most “No Treatment” arms should be double-blind, which requires use of a placebo.
Let’s take these in order.

We need a “No Treatment” arm:
  • Where the natural progression of the disease is variable (e.g., many psychological disorders, such as depression, have ups and downs that are unrelated to treatment).  This is important if we want to measure the proportion of responders – for example, what percentage of diabetes patients got their HbA1C levels below 6.5% on a particular regimen.  We know that some patients will hit that target even without additional intervention, but we won’t know how many unless we include a control group.
  • Where the disease is self-limiting.  Given time, many conditions – the flu, allergies, etc. – tend to go away on their own.  Therefore, even an ineffective medication will look like it’s doing something if we simply test it on its own.  We need a control group to measure whether the investigational medication is actually speeding up the time to cure.
  • When we are testing the combination of an investigational medication with one or more existing therapies. We have a general sense of how well metformin will work in T2D patients, but the effect will vary from trial to trial.  So if I want to see how well my experimental therapy works when added to metformin, I’ll need a metformin-plus-placebo control arm to be able to measure the additional benefit, if any.

All of the above are especially important when the trial is selecting a group of patients with greater disease severity than average.  The process of “enriching” a trial by excluding patients with mild disease has the benefit of requiring many fewer enrolled patients to demonstrate a clinical effect.  However, it also will have a stronger tendency to exhibit “regression to the mean” for a number of patients, who will exhibit a greater than average improvement during the course of the trial.  A control group accurately measures this regression and helps us measure the true effect size.
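Regression to the mean in an "enriched" trial is easy to see in a quick simulation (a toy sketch with made-up severity scores, not data from any real trial): patients enrolled because of a high baseline measurement will, on average, score lower at follow-up even with no treatment at all.

```python
import random

random.seed(42)

NOISE_SD = 1.0  # visit-to-visit measurement noise

def measure(true_severity):
    """One noisy measurement of a patient's stable underlying severity."""
    return true_severity + random.gauss(0, NOISE_SD)

# A population with true severity scores centered at 7.
patients = [random.gauss(7.0, 1.0) for _ in range(10_000)]

# 'Enrich' the trial: enroll only patients whose screening measurement is high.
screened = [(p, measure(p)) for p in patients]
enrolled = [(p, m) for p, m in screened if m > 8.5]

baseline = sum(m for _, m in enrolled) / len(enrolled)
# Re-measure the same patients, with no treatment of any kind.
followup = sum(measure(p) for p, _ in enrolled) / len(enrolled)

print(f"Enrolled: {len(enrolled)} of {len(patients)}")
print(f"Mean score at baseline:  {baseline:.2f}")
print(f"Mean score at follow-up: {followup:.2f}")  # lower, despite zero treatment effect
```

The follow-up mean drops simply because patients whose baseline was inflated by measurement noise drift back toward their true severity. Only a control group lets us separate this artifact from a genuine treatment effect.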

So, why include a placebo?  Why not just have a control group of patients receiving no additional treatment?  There are compelling reasons:
  • To minimize bias in investigator assessments.  We most often think about placebo arms in relation to patient expectations, but often they are even more valuable in improving the accuracy of physician assessments.  Like all humans, physician investigators interpret evidence in light of their beliefs, and there is substantial evidence that unblinded assessments exaggerate treatment effects – we need the placebo to help maintain investigator blinding.
  • To improve patient compliance in the control arm.  If a patient is clearly not receiving an active treatment, it is often very difficult to keep him or her interested and engaged with the trial, especially if the trial requires frequent clinic visits and non-standard procedures (such as blood draws).  Retention in no-treatment trials can be much lower than in placebo-controlled trials, and if it drops low enough, the validity of any results can be thrown into question.
  • To accurately gauge adverse events.  Any problem(s) encountered are much more likely to be taken seriously – by both the patient and the investigator – if there is genuine uncertainty about whether the patient is on active treatment.  This leads to much more accurate and reliable reporting of adverse events.
In other words, even if the placebo effect didn’t exist, it would still be necessary and proper to conduct placebo-controlled trials.  The failure to separate “placebo control” from “placebo effect” yields some very muddled thinking (which was the ultimate point of my post yesterday).

Thursday, July 5, 2012

The Placebo Effect (No Placebo Necessary)

4 out of 5 non-doctors recommend starting with "regular strength", and titrating up from there...
The modern clinical trial’s Informed Consent Form (ICF) is a daunting document.  It is packed with a mind-numbing litany of procedures, potential risks, possible adverse events, and substantial additional information – in general, if someone, somewhere, might find a fact relevant, then it gets into the form.  A run-of-the-mill ICF in a phase 2 or 3 pharma trial can easily run over 10 pages of densely worded text.  You might argue (and in fact, a number of people have, persuasively) that this sort of information overload reduces, rather than enhances, patient understanding of clinical trials.

So it is a bit of a surprise to read a paper arguing that patient information needs to be expanded because it does not contain enough information.  And it is even more surprising to read what’s allegedly missing: more information about the potential effects of placebo.

Actually, “surprising” doesn’t really begin to cover it.  Reading through the paper is a borderline surreal experience.  The authors’ conclusions from “quantitative analysis”* of 45 Patient Information Leaflets for UK trials include such findings as
  • The investigational medication is mentioned more often than the placebo
  • The written purpose of the trial “rarely referred to the placebo”
  • “The possibility of continuing on the placebo treatment after the trial was never raised explicitly”
(You may need to give that last one a minute to sink in.)

Rather than seeing these as rather obvious conclusions, the authors recast them as ethical problems to be overcome.  From the article:
Information leaflets provide participants with a permanent written record about a clinical trial and its procedures and thus make an important contribution to the process of informing participants about placebos.
And from the PR materials furnished along with publication:
We believe the health changes associated with placebos should be better represented in the literature given to patients before they take part in a clinical trial.
There are two points that I think are important here – points that are sometimes missed, and very often badly blurred, even within the research community:

1.    The placebo effect is not caused by placebos.  There is nothing special about a “placebo” treatment that induces a unique effect.  The placebo effect can be induced by a lot of things, including active medications.  When we start talking about placebos as causal agents, we are engaging in fuzzy reasoning – placebo effects will not only be seen in the placebo arm, but will be evenly distributed among all trial participants.

2.    Changes in the placebo arm cannot be assumed to be caused by the placebo effect.  There are many reasons why we may observe health changes within a placebo group, and most of them have nothing to do with the “psychological and neurological mechanisms” of the placebo effect.  Giving trial participants information about the placebo effect may in fact be providing them with an entirely inaccurate description of what is going on.

Bishop FL, Adams AEM, Kaptchuk TJ, Lewith GT (2012). Informed Consent and Placebo Effects: A Content Analysis of Information Leaflets to Identify What Clinical Trial Participants Are Told about Placebos. PLoS ONE. DOI: 10.1371/journal.pone.0039661

(* Not related to the point at hand, but I would applaud efforts to establish some lower boundaries to what we are permitted to call "quantitative analysis".  Putting counts from 45 brochures into an Excel spreadsheet should fall well below any reasonable threshold.)

Thursday, March 24, 2011

People Who Disagree with Me Tend to End Up Being Investigated by the Federal Government

I don’t think this qualifies yet as a trend, but two disturbing announcements came right back to back last week:

First: As you’ve probably heard, KV Pharmaceutical caused quite a stir when they announced the pricing for their old-yet-new drug Makena. In response, Senators Sherrod Brown (D-OH) and Amy Klobuchar (D-MN) sent a letter to the FTC demanding they “initiate a formal investigation into any potential anticompetitive conduct” by KV. In explaining his call for the investigation, Brown notes:

Since KV Pharmaceuticals announced the intended price hike, I called on KV Pharmaceuticals to immediately reconsider their decision, but to this date the company continues to defend this astronomical price increase.

Second: One week after an FDA Advisory Committee voted 13 to 4 to recommend approving Novartis’s COPD drug indacaterol, Public Citizen wrote a letter to the US Office for Human Research Protections requesting that Novartis be investigated for conducting the very trials that supplied the evidence for that vote. The reason? Despite the fact that the FDA requested the trials be placebo controlled, Public Citizen feels that Novartis should not have allowed patients to be on placebo. The letter shows no apparent consideration for the idea that a large number of thoughtful, well-informed people considered the design of these trials and came to the conclusion that they were ethical (not only the FDA, but also the independent Institutional Review Boards and Ethics Committees that oversaw each trial). Instead, Public Citizen blithely “look[s] forward to OHRP’s thorough and careful investigation of our allegations.”

The upshot of these two announcements seems to be: “we don’t like what you’re doing, and since we can’t get you to stop, we’ll try to initiate a federal investigation.” Even if neither of these efforts succeeds, they will still cause the companies involved to spend a significant amount of time and money defending themselves. In fact, maybe that’s the point: neither effort seems like a serious claim that actual laws were broken, but rather just an attempt at intimidation.

Monday, March 21, 2011

From Russia with (3 to 20 times more) Love

“Russia’s Clinical Trials are a Thriving Business”, trumpeted the news release that came to my inbox the other day. Inside was a rather startling – and ever-so-slightly odd – claim:
NPR Marketplace Health Desk Reporter Gregory Warner uncovers the truths about clinical trials in Russia; namely, the ability for biopharmaceutical companies to enroll patients 3 to 20 times faster than in the more established regions of North America and Western Europe.
Of course, as you might expect, the NPR reporter does not “uncover” that – rather, the 3 to 20 times faster “truth” is simply a verbatim statement from the CEO of ClinStar, a CRO specializing in running trials in Russia and Eastern Europe. There is no explanation of the 3-to-20 number, or why there is such a wide confidence interval (if that’s what that is).

The full NPR story goes on to hint that the business of Russian clinical trials may be a bit on the ethically cloudy side by associating it with past practices of lavishing gifts and attention on leading physicians (no direct tie is made – the reporter however not so subtly notes the fact that one person who used to work in Russia as a drug rep now works in clinical trials). I think the implication here is that Russia gets results by any means necessary, and the pharma industry is excitedly queuing up to get its trials done faster.

However, this speed factor is coupled with the extremely modest claim that the clinical trial business in Russia is “growing at 15% a year.” While this is certainly not a bad rate of growth, it’s hardly explosive. It’s in fact comparable to the revenue growth of the overall CRO market for the few years preceding the current downturn, estimated at 12.2%, and dwarfed by the estimated 34% annual growth of the industry in India.

From my perspective, the industry seems very hesitant to put too many eggs in Eastern Europe’s basket just yet. We need faster trials, certainly, but we need reliable and clean data even more. Recent troubling research experience with Russia – most notably the dimebon fiasco, where overwhelmingly positive data from Russian phase 2 trials turned out to be completely irreproducible in larger western trials – has left the industry wary about the region. And wink-and-nod publicity about incredible speed gains will probably ultimately hurt wider acceptance of Eastern European trials more than it will help.

Wednesday, March 16, 2011

Realistic Optimism in Clinical Trials

The concept of “unrealistic optimism” among clinical trial participants has gotten a fair bit of press lately, mostly due to a small study published in IRB: Ethics and Human Research. (I should stress the smallness of the study: it was a survey given to 72 blood cancer patients. This is worth noting in light of the slightly-bizarre Medscape headline that optimism “plagues” clinical trials.)

I was therefore happy to see this article reporting out of the Society of Surgical Oncology. In looking at breast cancer outcomes between surgical oncologists and general surgeons, the authors appear to have found that most of the beneficial outcomes among patients treated by surgical oncologists can be ascribed to clinical trial participation. Some major findings:
  • 56% of patients treated by a surgical oncologist participated in a trial, versus only 7% of those treated by a general surgeon
  • Clinical trial patients had significantly longer median follow-up than non-participants (44.6 months vs. 38.5 months)
  • Most importantly, clinical trial patients had significantly improved overall survival at 5 years than non-participants (31% vs. 26%)

Of course, the study reported on in the IRB article did not compare non-trial participants’ attitudes, so these aren’t necessarily contradictory results. However, I suspect that the message that clinical trial participation leads to better follow-up, which in turn leads to improved outcomes, will not get the same eye-catching headline in Medscape. Which is a shame, since we already have enough negative press about clinical trials out there.

Tuesday, March 1, 2011

What is the Optimal Rate of Clinical Trial Participation?

The authors of EDICT's white paper, in their executive summary, take a bleak view of the current state of clinical trial accrual:

Of critical concern is the fact that despite numerous years of discussion and the implementation of new federal and state policies, very few Americans actually take part in clinical trials, especially those at greatest risk for disease. Of the estimated 80,000 clinical trials that are conducted every year in the U.S., only 2.3 million Americans take part in these research studies -- or less than one percent of the entire U.S. population.
The paper goes on to discuss the underrepresentation of minority populations in clinical trials, and does not return to this point. And while it's certainly not central to the paper's thesis (in fact, in some ways it works against it), it is a perception that appears to be a common one among those involved in clinical research.

When we say that "only" 2.3 million Americans take part in clinical research, we rely directly on an assumption that more than 2.3 million Americans should take part.

This leads immediately to the question: how many more?

If we are trying to increase participation rates, the magnitude of the desired improvement is one of the first and most central facts we need. Do we want a 10% increase, or a 10-fold increase? The steps required to achieve these will be radically different, so it would seem important to know.

It should also be pointed out: in some very real sense, the ideal rate of clinical trial participation, at least for pre-marketing trials, is 0%. Participating in these trials by definition means being potentially exposed to a treatment that the FDA believes has insufficient evidence of safety and/or efficacy. In an ideal world, we would not expose any patient to that risk. Even in today's non-ideal world, we have already decided not to expose any patients to medications that have not produced some preliminary evidence of safety and efficacy in animals. That is, we have already established one threshold below which we believe human involvement is unacceptably risky -- in a better world, with more information, we would raise that threshold much higher than the current criteria for IND approval.

This is not just a hypothetical concern. Where we set our threshold for acceptable risk should drive much of our thinking about how much we want to encourage (or discourage) people from shouldering that risk. Landmine detection, for example, is a noble but risky profession: we may agree that it is acceptable for rational adults to choose to enter into that field, and we may certainly applaud their heroism. However, that does not mean that we will unanimously agree on how many adults should be urged to join their ranks, nor does it mean that we will not strive and hope for the day that no human is exposed to that risk.

So, we're not talking about the ideal rate of participation, we're talking about the optimal rate. How many people should get involved, weighing a) the risks involved in being exposed to investigational treatment against b) the potential benefit to the participant and/or mankind? For how many will the expected potential benefit outweigh the expected total cost? I have not seen any systematic attempt to answer this question.

The first thing that should be obvious here is that the optimal rate of participation should vary based upon the severity of the disease and the available, approved medications to treat it. In nonserious conditions (eg, keratosis pilaris), and/or conditions with a very good recovery rate (eg, veisalgia), we should expect participation rates to be low, and in some cases close to zero in the absence of major potential benefit. Conversely, we should desire higher participation rates in fatal conditions with few if any legitimate treatment alternatives (eg, late-stage metastatic cancers). In fact, if we surveyed actual participation rates by disease severity and prognosis, I think we would find that this relationship generally holds true already.
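One crude way to formalize that trade-off (my own toy framing, with invented numbers -- not a claim about any real therapy or real probabilities) is as an expected-value comparison:

```python
def participation_worthwhile(p_benefit, benefit, p_harm, harm, knowledge_value=0.0):
    """Toy decision rule: participate when the expected benefit (to the patient,
    plus whatever value we assign to the knowledge gained) outweighs the expected harm."""
    return p_benefit * benefit + knowledge_value > p_harm * harm

# Hypothetical late-stage cancer with few alternatives: large potential upside.
print(participation_worthwhile(p_benefit=0.2, benefit=10.0, p_harm=0.3, harm=2.0))   # True

# Hypothetical mild, self-limiting condition: little to gain, similar risks.
print(participation_worthwhile(p_benefit=0.2, benefit=0.5, p_harm=0.1, harm=2.0))    # False
```

The inputs here are made up, but the structure captures the point of the preceding paragraph: as disease severity rises and treatment alternatives dwindle, the benefit term grows and participation becomes worthwhile for more people.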

I should qualify the above by noting that it really doesn't apply to a number of clinical trial designs, most notably observational trials and phase 1 studies in healthy volunteers. Of course, most of the discussion around clinical trial participation does not apply to these types of trials, either, as they are mostly focused on access to novel treatments.