Tuesday, July 24, 2012

How Not to Report Clinical Trial Data: a Clear Example

I know it’s not even August yet, but I think we can close the nominations for "Worst Trial Metric of the Year". The hands-down winner is Pharmalot, for the thoughtless publication of this article reviewing "Deaths During Clinical Trials" per year in India. We’ll call it the Pharmalot Death Count, or PDC, and it’s easy to explain – it's just the total number of patients who died while enrolled in any clinical trial, regardless of cause, reported as though it were an actually meaningful number.

(To make this even more execrable, Pharmalot actually calls this "Deaths attributed to clinical trials" in his opening sentence, even though the actual data have exactly nothing to do with how those deaths were attributed.)

In fairness, Pharmalot is really only sharing the honors with a group of sensationalistic journalists in India who have jumped on these numbers. But Pharmalot has a much wider readership within the research community, and could at least have attempted to critically assess the data before repeating it (along with criticism from "experts").

The number of things wrong with this metric is a bit overwhelming.  I’m not even sure where to start.  Some of the obvious issues here:

1. No separation of trial-related versus non-trial-related deaths.  Some effort is made to explain that there may be difficulty in determining whether a particular death was related to the study drug.  That, however, obscures the fact that the PDC lumps together all deaths, whether or not the patient ever took an experimental medication. That means the PDC includes:
  • Patients in control arms receiving standard of care and/or placebo, who died during the course of their trial.
  • Patients whose deaths were entirely unrelated to their illness (e.g., automobile accident victims).
2. No base rates.  When a raw death total is presented, a number of obvious questions should come to mind: how many patients were in the trials?  How many deaths were there among patients with similar diseases who were not in trials?  The PDC doesn’t care about that kind of context (the quick sketch after this list shows why it matters).

3. No sensitivity to trial design.  Many late-stage cancer clinical trials use Overall Survival (OS) as their primary endpoint – patients are literally in the trial until they die.  This isn’t considered unethical; it’s considered the gold standard of evidence in oncology.  If we ran shorter, less thorough trials, we could greatly reduce the PDC – would that be good for anyone?
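
To make the base-rate point concrete, here is a minimal back-of-the-envelope sketch. Every number in it is invented for illustration – these are not the actual Indian figures – but it shows what a raw death count cannot tell you on its own:

```python
# All numbers below are hypothetical, for illustration only.
deaths_in_trials = 438        # a raw, PDC-style annual death count
patients_in_trials = 150_000  # assumed total patients enrolled that year
background_mortality = 0.004  # assumed annual death rate among comparable
                              # patients who are NOT in any trial

observed_rate = deaths_in_trials / patients_in_trials
expected_deaths = background_mortality * patients_in_trials

print(f"Observed in-trial death rate: {observed_rate:.2%}")                 # 0.29%
print(f"Deaths expected from background mortality: {expected_deaths:.0f}")  # 600
```

With made-up numbers like these, the "alarming" raw count is actually well below what background mortality alone would predict – and that is exactly the context the PDC throws away.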

Case Study: Zelboraf
  • FDA: "Highly effective, more personalized therapy"
  • PDC: "199 deaths attributed to Zelboraf trial!"

There is a fair body of evidence that participants in clinical trials fare about the same as (or possibly a bit better than) similar patients receiving standard-of-care therapy.  However, much of that evidence was accumulated in western countries: it is a fair question to ask whether patients in India and other countries receive a similar benefit.  The PDC, however, adds nothing to our ability to answer that question.

So, for publicizing a metric that has zero utility, and using it to cast aspersions on the ethics of researchers, we congratulate Pharmalot and the PDC.

Thursday, July 19, 2012

Measuring Quality: Probably Not Easy


I am a bit delayed getting my latest post up.  I am writing up some thoughts on this recent study put out by ARCO, which suggests that the level of quality in clinical trials does not vary significantly across global regions.

The study has gotten some attention through ARCO’s press release (an interesting range of reactions: the PharmaTimes headline declares “Developing countries up to scratch on trial data quality”, while Pharmalot’s headline, “What Problem With Emerging Markets Trial Data?”, betrays perhaps a touch more skepticism).


And it’s a very worthwhile topic: much of the difficulty, unfortunately, revolves around agreeing on what we consider adequate metrics for data quality.  The study only really looks at one metric (query rates), but does an admirable job of trying to view that metric in a number of different ways.  (I wrote about another metric – protocol deviations – in a previous post on the relation of quality to site enrollment performance.)

I have run into some issues parsing the study results, however, and have a question in to the lead author.  I’ll withhold further comment until I hear back and have had a chance to digest things a bit more.

Sunday, July 15, 2012

Site Enrollment Performance: A Better View

Pretty much everyone involved in patient recruitment for clinical trials seems to agree that "metrics" are, in some general sense, really really important. The state of the industry, however, is a bit dismal, with very little evidence of effort to communicate data clearly and effectively. Today I’ll focus on the Site Enrollment histogram, a tried-but-not-very-true standby in every trial.

Consider this graphic, showing enrolled patients at each site. It came through on a weekly "Site Newsletter" for a trial I was working on:

[Figure: a typical site enrollment histogram]

I chose this histogram not because it’s particularly bad, but because it’s supremely typical. Don’t get me wrong ... it’s really bad, but the important thing here is that it looks pretty much exactly like every site enrollment histogram in every study I’ve ever worked on.

This is a wasted opportunity. Whether we look at per-site enrollment with internal teams to develop enrollment support plans, or share this data with our sites to inform and motivate them, a good chart is one of the best tools we have. To illustrate this, let’s look at a few examples of better ways to look at the data.

If you really must do a static site histogram, make it as clear and meaningful as possible. 

[Figure: "stateful" site enrollment histogram]

This chart improves on the standard histogram in a few important ways:

  1. It looks better. This is not a minor point when part of our work is to engage sites and make them feel like they are part of something important. Actually, this graph is made clearer and more appealing mostly by the removal of useless attributes (extraneous whitespace, background colors, and unhelpful labels).
  2. It adds patient disposition information. Many graphs – like the one at the beginning of this post – are vague about who is being counted. Does "enrolled" include patients currently being screened, or just those randomized? Interpretations will vary from reader to reader. Instead, this chart makes patient status an explicit variable, without adding to the complexity of the presentation. It also provides a bit of information about recent performance, by showing patients who have been consented but not yet fully screened.
  3. It ranks sites by their total contribution to the study, not by the letters in the investigator’s name. And that is one of the main reasons we like to share this information with our sites in the first place.
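
For anyone who wants to build one, here is a minimal matplotlib sketch of this kind of stacked, disposition-aware histogram. The investigator names and counts are invented, and the status categories are just the ones discussed above:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical per-site counts; in practice these come from an EDC/CTMS export.
sites = ["Dr. Adams", "Dr. Chen", "Dr. Gupta", "Dr. Lopez", "Dr. Okafor"]
randomized   = np.array([1, 9, 5, 3, 7])
in_screening = np.array([1, 2, 3, 0, 1])
consented    = np.array([0, 1, 0, 1, 2])   # consented but not yet screened

# Rank sites by total contribution to the study, not alphabetically.
order = np.argsort(randomized + in_screening + consented)[::-1]
y = np.arange(len(sites))

fig, ax = plt.subplots(figsize=(7, 4))
ax.barh(y, randomized[order], color="#2b6ca3", label="Randomized")
ax.barh(y, in_screening[order], left=randomized[order],
        color="#7fb3d5", label="In screening")
ax.barh(y, consented[order], left=(randomized + in_screening)[order],
        color="#cfe2f0", label="Consented, not yet screened")

ax.set_yticks(y)
ax.set_yticklabels([sites[i] for i in order])
ax.invert_yaxis()                 # top contributor at the top
ax.set_xlabel("Patients")
for side in ("top", "right"):     # remove useless chart attributes
    ax.spines[side].set_visible(False)
ax.legend(frameon=False, loc="lower right")
plt.tight_layout()
plt.show()
```
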
Find Opportunities for Alternate Visualizations
 
There are many other ways in which essentially the same data can be re-sliced or restructured to underscore particular trends or messages. Here are two that I look at frequently, and often find worth sharing.

Then versus Now

[Figure: tornado chart of prior vs. recent screening by site]

This tornado chart is an excellent way of showing site-level enrollment trajectory, with each site’s prior (left) and subsequent (right) contributions separated out.  This example spotlights activity over the past month, but for slower trials a larger timescale may be more appropriate.  Also, how the data is sorted can be critical to the communication: this chart could have been ranked by total enrollment, but instead sorts first on most-recent screening, clearly showing who’s picked up, who’s dropped off, and who’s remained constant (both good and bad).

This is especially useful when looking at a major event (e.g., pre/post protocol amendment), or where enrollment is expected to have natural fluctuations (e.g., in seasonal conditions).
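
Here is a rough matplotlib sketch of the tornado layout, again with invented data: prior screening grows to the left, the last month’s grows to the right, and rows are sorted on recent activity:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

# Hypothetical screening counts per site (assumed data).
sites  = ["Site 104", "Site 110", "Site 102", "Site 118", "Site 107"]
prior  = np.array([6, 2, 8, 1, 4])   # screened before the past month
recent = np.array([5, 4, 1, 1, 0])   # screened during the past month

order = np.argsort(recent)[::-1]     # sort first on most-recent screening
y = np.arange(len(sites))

fig, ax = plt.subplots(figsize=(7, 4))
ax.barh(y, -prior[order], color="#b0c4d8", label="Before last month")
ax.barh(y, recent[order], color="#2b6ca3", label="Last month")
ax.axvline(0, color="black", linewidth=0.8)

ax.set_yticks(y)
ax.set_yticklabels([sites[i] for i in order])
ax.invert_yaxis()
# The left half is plotted as negatives; label both sides as magnitudes.
ax.xaxis.set_major_formatter(FuncFormatter(lambda x, _: f"{abs(x):g}"))
ax.set_xlabel("Patients screened")
ax.legend(frameon=False)
plt.tight_layout()
plt.show()
```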

Net Patient Contribution

In many trials, site activation occurs in a more or less "rolling" fashion, with many sites not starting until later in the enrollment period. This makes simple enrollment histograms downright misleading, as they fail to differentiate sites by the length of time they’ve actually been able to enroll. Reporting enrollment rates (patients per site per month) is one straightforward way of compensating for this, but it has the unfortunate effect of showing extreme (and, most importantly, non-predictive) variance for sites that have not been enrolling for very long.

As a result, I prefer to measure each site in terms of its net contribution to enrollment, compared to what it was expected to do over the time it was open:
[Figure: net patient contribution by site]

To clarify this, consider an example: A study expects sites to screen 1 patient per month. Both Site A and Site B have failed to screen a single patient so far, but Site A has been active for 6 months, whereas Site B has only been active 1 month.

On an enrollment histogram, both sites would show up as tied at 0. However, Site A’s 0 is a lot more problematic – and more predictive of future performance – than Site B’s 0. If I instead compare each site to the benchmark, I can show how many total screenings each site is above or below the study’s expectation: Site A is at -6, while Site B is only at -1 – a much clearer representation of current performance.

This graphic has the added advantage of showing how the study as a whole is doing. Comparing the total volume of positive to negative bars gives the viewer an immediate visceral sense of whether the study is above or below expectations.
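
Putting the example above into code: a minimal sketch of the net-contribution calculation and chart, using the assumed one-patient-per-month benchmark and the hypothetical Sites A and B (plus two invented sites so the chart has bars in both directions):

```python
import matplotlib.pyplot as plt

EXPECTED_RATE = 1.0   # assumed benchmark: 1 screened patient per site-month

# (site, months active, patients actually screened) -- hypothetical data
sites = [
    ("Site A", 6, 0),
    ("Site B", 1, 0),
    ("Site C", 4, 7),
    ("Site D", 8, 5),
]

# Net contribution = actual screenings minus what the benchmark expected.
net = {name: screened - EXPECTED_RATE * months
       for name, months, screened in sites}
print(net)   # {'Site A': -6.0, 'Site B': -1.0, 'Site C': 3.0, 'Site D': -3.0}

# Rank best-to-worst so over- and under-performers group together.
ranked = sorted(net.items(), key=lambda kv: kv[1], reverse=True)

fig, ax = plt.subplots(figsize=(7, 3.5))
colors = ["#2b6ca3" if value >= 0 else "#c0504d" for _, value in ranked]
ax.bar([name for name, _ in ranked], [value for _, value in ranked], color=colors)
ax.axhline(0, color="black", linewidth=0.8)   # the expectation line
ax.set_ylabel("Screenings above / below expectation")
plt.tight_layout()
plt.show()
```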

The above are just three examples – there is a lot more that can be done with this data. What is most important is that we first stop and think about what we’re trying to communicate, and then design clear, informative, and attractive graphics to help us do that.