Tuesday, June 4, 2013

Can FDA's New Transparency Survive Avandia?

PDUFA V commitments signal a new tolerance for open debate in the face of uncertainty.

I can admit to a rather powerful lack of enthusiasm when reading about interpersonal squabbles. It’s even worse in the scientific world: when I read about debates getting mired in personal attacks I tend to simply stop reading and move on to something else.

However, the really interesting part of this week’s meeting of an FDA joint Advisory Committee to discuss the controversial diabetes drug Avandia – at least in the sense of likely long-term impact – is not the scientific question under discussion, but the surfacing and handling of the raging interpersonal battle going on right now inside the Division of Cardiovascular and Renal Products. So I'll have to swallow my distaste and follow along with the drama.
[Image caption: Two words that make us mistrust Duke: "Anil Potti" ... "Christian Laettner"]

Not that the scientific question at hand – does Avandia pose significant heart risks? – isn't interesting. It is. But if there's one thing that everyone seems to agree on, it's that we don't have good data on the topic. Despite the re-adjudication of RECORD, no one trusts its design (and, ironically, the one trial designed to rigorously answer the question was halted after intense pressure, despite an AdComm recommendation that it continue). And no one seems particularly enthused about changing the current status of Avandia: in all likelihood it will remain on the market under heavy restrictions. Rather than changing the future of diabetes treatment, I suspect the committee will be content to let us slog along the same mucky trail.

The really interesting question – one that will potentially impact CDER for years to come – is how it can function with frothing, open dissent among its staffers. As has been widely reported, FDA reviewer Tom Marciniak has written a rather wild and vitriolic assessment of the RECORD trial, excoriating most everyone involved. In a particularly stunning passage, Marciniak appears to claim that the entire output of anyone working at Duke University cannot be trusted because of the fraud committed by Duke cancer researcher Anil Potti:
I would have thought that the two words “Anil Potti” are sufficient for convincing anyone that Duke University is a poor choice for a contractor whose task it is to confirm the integrity of scientific research. 
(One wonders how far Marciniak is willing to take his guilt-by-association theme. Are the words "Cheng Yi Liang" sufficient to convince us that all FDA employees, including Marciniak, are poor choices for deciding matters relating to publicly-traded companies? Should I not comment on government activities because I'm a resident of Illinois (my two words: "Rod Blagojevich")?)

Rather than censoring or reprimanding Marciniak, his supervisors have taken the extraordinary step of letting him air his criticisms publicly, and have then publicly criticized his methods and approach in turn.

I have been unable to think of a similar situation at any regulatory agency. The tolerance for dissent being displayed by FDA is, I believe, completely unprecedented.

And that’s the cliffhanger for me: can the FDA’s commitment to transparency extend so far as to accommodate public disagreements about its own approval decisions? Can it do so even when the disagreements take an extremely nasty and inappropriate tone?

  • Rather than treating open debate as a good thing, will journalists jump on the drama and portray agency leadership as weak and indecisive?
  • Will the usual suspects in Congress be able to exploit this disagreement for their own political gain? How many House subcommittees will be summoning Janet Woodcock in the coming weeks?

I think what Bob Temple and Norman Stockbridge are doing is a tremendous experiment in open government. If they can pull it off, it could force other agencies to radically rethink how they go about crafting and implementing regulations. However, I also worry that the approach is simply not politically viable, and that the agency will ultimately be seriously hurt by attacks from the media and legislators.

Where is this coming from?

As part of its recent PDUFA V commitment, the FDA put out a fascinating draft document, Structured Approach to Benefit-Risk Assessment in Drug Regulatory Decision-Making. It didn't get a lot of attention when first published back in February (few FDA documents do). However, it lays out a rather bold vision for how the FDA can acknowledge the existence of uncertainty in its evaluation of new drugs. Its proposed structure even envisions an open and honest accounting of divergent interpretations of data:
[Image caption: When they're frothing at the mouth, even Atticus doesn't let them publish a review.]
A framework for benefit-risk decision-making that summarizes the relevant facts, uncertainties, and key areas of judgment, and clearly explains how these factors influence a regulatory decision, can greatly inform and clarify the regulatory discussion. Such a framework can provide transparency regarding the basis of conflicting recommendations made by different parties using the same information.
(Emphasis mine.)

Of course, the structured framework here is designed to reflect rational disagreement. Marciniak’s scattershot insults are in many ways a terrible first case for trying out a new level of transparency.

The draft framework notes that safety issues like the ones surrounding Avandia are among the major areas of uncertainty in the regulatory process. Contrast this vision of coolly and systematically addressing uncertainties with the sad reality of Marciniak's attack:
In contrast to the prospective and highly planned studies of effectiveness, safety findings emerge from a wide range of sources, including spontaneous adverse event reports, epidemiology studies, meta-analyses of controlled trials, or in some cases from randomized, controlled trials. However, even controlled trials, where the evidence of an effect is generally most persuasive, can sometimes provide contradictory and inconsistent findings on safety as the analyses are in many cases not planned and often reflect multiple testing. A systematic approach that specifies the sources of evidence, the strength of each piece of evidence, and draws conclusions that explain how the uncertainty weighed on the decision, can lead to more explicit communication of regulatory decisions. We anticipate that this work will continue beyond FY 2013.
I hope that work will continue beyond 2013. Thoughtful, open discussions of real uncertainties are one of the most worthwhile goals FDA can aspire to, even if it means having to learn how to do so without letting the Marciniaks of the world scuttle the whole endeavor.

[Update June 6: Further bolstering the idea that the AdComm is just as much about FDA's ability to transparently manage differences of expert opinion in the face of uncertain data, CDER Director Janet Woodcock posted this note on the FDA's blog. She's pretty explicit about the bigger picture:
There have been, and continue to be, differences of opinion and scientific disputes, which is not uncommon within the agency, stemming from varied conclusions about the existing data, not only with Avandia, but with other FDA-regulated products. 
At FDA, we actively encourage and welcome robust scientific debate on the complex matters we deal with — as such a transparent approach ensures the scientific input we need, enriches the discussions, and enhances our decision-making.
I agree, and hope she can pull it off.]

Thursday, May 30, 2013

Clinical Trial Enrollment, ASCO 2013 Edition

Even by the already-painfully-embarrassingly-low standards of clinical trial enrollment in general, patient enrollment in cancer clinical trials is slow. Horribly slow. In many cancer trials, randomizing one patient every three or four months isn't bad at all – in fact, it's par for the course. The most commonly-cited number is that only 3% of cancer patients participate in a trial – and although exact details of how that number is measured are remarkably difficult to pin down, it certainly can't be too far from reality.

Ultimately, the cost of slow enrollment is borne almost entirely by patients; their payment takes the form of fewer new therapies and less evidence to support their treatment decisions.

So when a couple dozen thousand of the world's top oncologists fly into Chicago to meet, you'd figure that improving accrual would be high on everyone’s agenda. You can't run your trial without patients, after all.

But every year, the annual ASCO meeting underdelivers on new ideas for getting more patients into trials. I suppose this is a consequence of ASCO's members-only focus: getting the oncologists themselves to address patient accrual is a bit like asking NASCAR drivers to tackle the problems of aerodynamics, engine design, and fuel chemistry.

Nonetheless, every year, a few brave souls do try. Here is a quick rundown of accrual-related abstracts at this year’s meeting, conveniently sorted into 3 logical categories:

1. As Lord Kelvin may or may not have said, “If you cannot measure it, you cannot improve it.”


Probably the most sensible of this year's crop: rather than trying to make something out of nothing, the authors measure exactly how pervasive the nothing is. Specifically, they attempt to obtain fairly basic patient accrual data for the last three years' worth of clinical trials in kidney cancer. Out of 108 trials identified, they managed to get – via search and direct inquiries with the trial sponsors – basic accrual data for only 43 (40%).

That certainly qualifies as “terrible”, though the authors content themselves with “poor”.

Interestingly, sponsors responded to the authors' initial survey for exactly zero of the 32 industry-sponsored trials. This fits with my impression that pharma companies continue to think of accrual data as proprietary, though what sort of business advantage it gives them is unclear. Any one company will have run only a small fraction of these studies, greatly limiting its ability to draw anything resembling a valid conclusion.


CALGB investigators look at 110 trials over the past 10 years to see if they can identify any predictive markers of successful enrollment. Unfortunately, the trials themselves are pretty heterogeneous (accrual periods ranged from 6 months to 8.8 years), so finding a consistent marker for successful trials would seem unlikely.

And, in fact, almost none of the usual suspects (e.g., startup time, disease prevalence) appears to have been significant. The lone exception was provision of medication by the study, which was positively associated with successful enrollment.

The major limitation of this study, apart from the variability of the trials measured, is its definition of "successful", which is simply whether the trial eventually reached its planned number of enrolled patients. By that definition, a slow-enrolling trial that drags on for years before finally reaching its goal counts as successful, whereas the same trial stopped early counts as unsuccessful. That may sometimes be a fair judgment, but it's easy to imagine situations where allowing a slow trial to drag on is a painful waste of resources – especially if results are delayed enough to bring their relevance into question.

Even worse, though, is that a trial’s enrollment goal is itself a prediction. The trial steering committee determines how many sites, and what resources, will be needed to hit the number needed for analysis. So in the end, this study is attempting to identify predictors of successful predictions, and there is no reason to believe that the initial enrollment predictions were made with any consistent methodology.

2. If you don't know, maybe ask somebody?



With these two abstracts we celebrate and continue the time-honored tradition of alchemy, whereby we transmute base opinion into golden data. The magic number appears to be 100: if you've got 3 digits' worth of doctors telling you how they feel, that must be worth something.

In the first abstract, a working group is formed to identify and vote on the major barriers to accrual in oncology trials. Then – and this is where the magic happens – that same group is asked to identify and vote on possible ways to overcome those barriers.

In the second, a diverse assortment of community oncologists was given an online survey to provide feedback on the design of a phase 3 trial in light of recent new data. The abstract doesn't specify who was initially sent the survey, so we can neither tell the response rate nor compare survey responders to the general population (I'll take a wild guess and go with "massive response bias").

Market research is sometimes useful. But what cancer clinical trials do not need right now are more surveys and working groups. The "strategies" listed in the first abstract are part of the same cluster of ideas that have been on the table for years now, with no appreciable increase in trial accrual.

3. The obligatory “What the What?” abstract



The force with which my head hit my desk after reading this abstract made me concerned that it had left permanent scarring.

If this had been re-titled “Poor Measurement of Accrual Factors Leads to Inaccurate Accrual Reporting”, would it still have been accepted for this year’s meeting? That's certainly a more accurate title.

Let’s review: a trial intends to enroll both white and minority patients. Whites enroll much faster, leading to a period where only minority patients are recruited. Then, according to the authors, “an almost 4-fold increase in minority accrual raises question of accrual disparity.” So, sites will only recruit minority patients when they have no choice?

But wait: the number of sites wasn't the same during the two periods, and start-up times were staggered. Adjusting for actual site time, the average minority accrual rate was 0.60 patients/site/month in the first part and 0.56 in the second. So the apparent 4-fold increase was entirely an artifact of bad math.
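To see how the artifact arises, here is a minimal sketch of the adjustment. The enrollment counts, site numbers, and durations below are hypothetical – chosen only so they land near the rates quoted above, since the abstract's raw figures aren't reproduced here – but the principle (divide by accumulated site-months, not calendar months) is the whole correction:

```python
# Hypothetical illustration of normalizing accrual by site-time.
# None of these counts come from the actual abstract.

def accrual_rate(patients, site_months):
    """Accrual rate in patients per site per month."""
    return patients / site_months

# Period 1: say, 10 sites open for 5 months = 50 site-months.
p1 = accrual_rate(patients=30, site_months=10 * 5)

# Period 2: say, 40 sites open for 2 months = 80 site-months.
p2 = accrual_rate(patients=45, site_months=40 * 2)

print(f"{p1:.2f} patients/site/month")  # 0.60
print(f"{p2:.2f} patients/site/month")  # 0.56

# The naive comparison divides by calendar months instead:
# 30/5 = 6.0 vs 45/2 = 22.5 patients/month -- an apparent ~4-fold
# "increase" that vanishes once site count and start-up time are
# factored in.
```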

This would be horribly embarrassing were it not for the fact that bad math seems to be endemic in clinical trial enrollment. Failing to adjust for start-up time and number of sites is so routine that an analysis built on that failure can still be grounds for a presentation.

The bottom line


What we need now is to rigorously (and prospectively) compare and measure accrual interventions. We have lots of candidate ideas, and there is no need for more retrospective studies, working groups, or opinion polls to speculate on which ones will work best. Where possible, accrual interventions should themselves be randomized, as sketched below, to minimize the confounding variables that prevent accurate assessment. Data needs to be uniformly and completely collected. In other words, the standards that we already use for clinical trials need to be applied to the enrollment measures we use to engage patients to participate in those trials.
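To make that concrete, here is a minimal sketch of what randomizing sites (rather than patients) to an accrual intervention could look like. The site names, the 50/50 split, and the fixed seed are all my own assumptions for illustration – no specific proposal here comes from the abstracts discussed above:

```python
import random

def randomize_sites(site_ids, seed=2013):
    """Randomly split sites 50/50 into intervention and control arms."""
    rng = random.Random(seed)   # fixed seed so the allocation is reproducible
    shuffled = site_ids[:]      # copy, so the caller's list isn't mutated
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}

# Hypothetical site list; after the accrual period, compare
# patients/site/month between the two arms.
arms = randomize_sites([f"site-{i:02d}" for i in range(1, 21)])
print(arms["intervention"])
print(arms["control"])
```

Comparing per-site-month accrual between the two arms – with the same start-up-time adjustments discussed above – would then give an estimate of the intervention's effect that isn't confounded by which sites happened to volunteer for it.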

This is not an optional consideration. It is an ethical obligation we have to cancer patients: we need to ensure that we are doing all we can to maximize the rate at which we generate new evidence and test new therapies.

[Image credit: Logarithmic turtle accrual rates courtesy of Flickr user joleson.]

Wednesday, May 15, 2013

Placebos: Banned in Helsinki?


One of the unintended consequences of my (admittedly, somewhat impulsive) decision to give this blog its name is that I get a fair bit of traffic from Google: people searching for placebo-related information.

Some recent searches have been about the proposed new revisions to the Declaration of Helsinki, and how the new draft version will prohibit or restrict the use of placebo controls in clinical trials. This was a bit puzzling, given that the publicly-released draft revisions [PDF] didn't appear to substantially change the DoH's placebo section.

Much of the confusion appears to be caused by a couple of sources. First, the popular Pharmalot blog (whose approach to critical analysis I've noted before as being ... well ... occasionally unenthusiastic) covered it thus:
The draft, which was released earlier this week, is designed to update a version that was adopted in 2008 and many of the changes focus on the use of placebos. For instance, placebos are only permitted when no proven intervention exists; patients will not be subject to any risk or there must be ‘compelling and sound methodological reasons’ for using a placebo or less effective treatment.
This isn't a good summary of the changes, since the “for instance” items are for the most part slight re-wordings from the 2008 version, which itself didn't change much from the version adopted in 2000.

To see what I mean, take a look at the change-tracked version of the placebo section:
The benefits, risks, burdens and effectiveness of a new intervention must be tested against those of the best current proven intervention(s), except in the following circumstances: 
The use of placebo, or no treatment intervention is acceptable in studies where no current proven intervention exists; or 
Where for compelling and scientifically sound methodological reasons the use of any intervention less effective than the best proven one, placebo or no treatment is necessary to determine the efficacy or safety of an intervention 
and the patients who receive any intervention less effective than the best proven one, placebo or no treatment will not be subject to any additional risks of serious or irreversible harm as a result of not receiving the best proven intervention 
Extreme care must be taken to avoid abuse of this option.
Really, there is only one significant change to this section: the strengthening of the reference to "best proven intervention". It was already in the first sentence, and has now been added to sentences 3 and 4 as well. This is a reference to the use of active (non-placebo) comparators that are not the "best proven" intervention.

So, ironically, the biggest change to the placebo section is not about placebos at all.

This is a bit unfortunate, because to me it detracts from the overall clarity of the section, which is no longer exclusively about placebo despite still being titled "Use of Placebo". The DoH has been consistently criticized during previous rounds of revision for becoming progressively less organized and coherently structured, and it certainly reads like a rambling list of semi-related thoughts – a classic "document by committee". This lack of structure and clarity certainly hurts the DoH's effectiveness in shaping the world's approach to ethical clinical research.

Even worse, the revisions continue to leave unresolved the very real divisions that exist in ethical beliefs about placebo use in trials. The really dramatic revision to the placebo section happened over a decade ago, with the 2000 revision. Those changes, which introduced much of the strict wording in the current version, were extremely controversial, and resulted in the issuance of an extraordinary “Note of Clarification” that effectively softened the new and inflexible language. The 2008 version absorbed the wording from the Note of Clarification, and the resulting document is now vague enough that it is interpreted quite differently in different countries. (For more on the revision history and controversy, see this comprehensive review.)

The 2013 revision could have been an opportunity to try again to build a consensus around placebo use. At the very least, it could have acknowledged and clarified the division of beliefs on the topic. Instead, it sticks to its ambiguous phrasing, which will continue to support multiple conflicting interpretations. This does not serve the ends of assuring the ethical conduct of clinical trials.

Ezekiel Emanuel has been a long-time critic of the DoH's lack of clarity and structure. Earlier this month, he published a compact but forceful review of the ways in which the Declaration has become weakened by its long series of revisions:
Over the years problems with, and objections to, the document have accumulated. I propose that there are nine distinct problems with the current version of the Declaration of Helsinki: it has an incoherent structure; it confuses medical care and research; it addresses the wrong audience; it makes extraneous ethical provisions; it includes contradictions; it contains unnecessary repetitions; it uses multiple and poor phrasings; it includes excessive details; and it makes unjustified, unethical recommendations.
Importantly, Emanuel also includes a proposed revision and restructuring of the DoH. In his version, much of the current wording around placebo use is retained, but it is absorbed into the larger concept of "Scientific Validity", which adds important context to the decision about how to select a comparator arm in general.

Here is Emanuel's suggested revision:
Scientific Validity:  Research in biomedical and other sciences involving human participants must conform to generally accepted scientific principles, be based on a thorough knowledge of the scientific literature, other relevant sources of information, and suitable laboratory, and as necessary, animal experimentation.  Research must be conducted in a manner that will produce reliable and valid data.  To produce meaningful and valid data new interventions should be tested against the best current proven intervention. Sometimes it will be appropriate to test new interventions against placebo, or no treatment, when there is no current proven intervention or, where for compelling and scientifically sound methodological reasons the use of placebo is necessary to determine the efficacy and/or safety of an intervention and the patients who receive placebo, or no treatment, will not be subject to excessive risk or serious irreversible harm.  This option should not be abused.
Here, the scientific rationale for the use of placebo is placed in the greater context of selecting a control arm, which is itself subservient to the ethical imperative to only conduct studies that are scientifically valid. One can quibble with the wording (I still have issues with the use of “best proven” interventions, which I think is much too undefined here, as it is in the DoH, and glosses over some significant problems), but structurally this is a lot stronger, and provides firmer grounding for ethical decision making.

Emanuel, E. (2013). Reconsidering the Declaration of Helsinki. The Lancet, 381(9877), 1532–1533. DOI: 10.1016/S0140-6736(13)60970-8

[Image: Extra-strength chill pill, modified by the author, based on an original image by Flickr user mirjoran.]