Thursday, December 20, 2012

All Your Site Are Belong To Us


'Competitive enrollment' is exactly that.

This is a graph I tend to show frequently to my clients – it shows the relative enrollment rates for two groups of sites in a clinical trial we'd been working on. The blue line is the aggregate rate of the 60-odd sites that attended our enrollment workshop, while the green line tracks enrollment for the 30 sites that did not attend the workshop. As a whole, the attendees were better enrollers than the non-attendees, but the performance of both groups was declining.

Happily, the workshop produced an immediate and dramatic increase in the enrollment rate of the sites who participated in it – they not only rebounded, but they began enrolling at a better rate than ever before. Those sites that chose not to attend the workshop became our control group, and showed no change in their performance.

The other day, I wrote about ENACCT's pilot program to improve enrollment. Five oncology research sites participated in an intensive, highly customized program to identify and address the issues that stood in the way of enrolling more patients. The sites in general were highly enthused about the program, and felt it had a positive impact on their operations.

There was only one problem: enrollment didn't actually increase.

Here’s the data:

This raises an obvious question: how can we reconcile these disparate outcomes?

On the one hand, an intensive, multi-day, customized program showed no improvement in overall enrollment rates at the sites.

On the other, a one-day workshop with sixty sites (which addressed many of the same issues as the ENACCT pilot: communications, study awareness, site workflow, and patient relationships) resulted in an immediate and clear improvement in enrollment.

There are many possible answers to this question, but after a deeper dive into our own site data, I've become convinced that there is one primary driver at work: for all intents and purposes, site enrollment is a zero-sum game. Our workshop increased the accrual of patients into our study, but most of that increase came as a result of decreased enrollments in other studies at our sites.

Our workshop graph shows increased enrollment ... for one study. The ENACCT data is across all studies at each site. It stands to reason that if sites are already operating at or near their maximum capacity, then the only way to improve enrollment for your trial is to get the sites to care more about your trial than about other trials that they’re also participating in.
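
To make the zero-sum argument concrete, here is a toy sketch. It assumes a site has a fixed monthly capacity that gets split across its studies in proportion to how much attention each one receives. The capacity, the study names, and the attention weights are all hypothetical illustrations, not data from our workshop or from ENACCT.

```python
def allocate_enrollment(capacity, attention):
    """Split a site's fixed monthly capacity across studies in proportion to attention."""
    total_attention = sum(attention.values())
    return {study: capacity * weight / total_attention
            for study, weight in attention.items()}

# Hypothetical site: room for 12 patients a month, three competing studies.
before = allocate_enrollment(12, {"our_study": 1, "study_b": 1, "study_c": 1})

# After a workshop, the site pays twice as much attention to our study.
after = allocate_enrollment(12, {"our_study": 2, "study_b": 1, "study_c": 1})

print(before)  # our study gets 4.0 of 12
print(after)   # our study gets 6.0, the other two drop to 3.0 each
print(sum(before.values()), sum(after.values()))  # site total stays at 12.0
```

Under these assumptions, a single-study graph shows a 50% jump while an all-studies graph stays flat, which is the pattern seen when the two sets of results are compared.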

And that makes sense: many of the strategies and techniques that my team uses to increase enrollment are measurably effective, but there is no reason to believe that they result in permanent, structural changes to the sites we work with. We don’t redesign their internal processes; we simply work hard to make our sites like us and want to work with us, which results in higher enrollment. But only for our trials.

So the next time you see declining enrollment in one of your trials, your best bet is not that the patients have disappeared, but rather that your sites' attention has wandered elsewhere.


Tuesday, December 11, 2012

What (If Anything) Improves Site Enrollment Performance?

ENACCT has released its final report on the outcomes from the National Cancer Clinical Trials Pilot Breakthrough Collaborative (NCCTBC), a pilot program to systematically identify and implement better enrollment practices at five US clinical trial sites. Buried after the glowing testimonials and optimistic assessments is a grim bottom line: the pilot program didn't work.

Here are the monthly clinical trial accruals at each of the 5 sites. The dashed lines mark when the pilots were implemented:



4 of the 5 sites showed no discernible improvement. The one site that did show increasing enrollment appears to have been improving before any of the interventions kicked in.

This is a painful but important result for anyone involved in clinical research today, because the improvements put in place through the NCCTBC process were the product of an intensive, customized approach. Each site had 3 multi-day learning sessions to map out and test specific improvements to their internal communications and processes (a total of 52 hours of workshops). In addition, each site was provided tracking tools and assigned a coach to assist them with specific accrual issues.

That’s an extremely large investment of time and expertise for each site. If the results had been positive, it would have been difficult to project how NCCTBC could be scaled up to work at the thousands of research sites across the country. Unfortunately, we don’t even have that problem: the needle simply did not move.

While ENACCT plans a second round of pilot sites, I think we need to face a more sobering reality: we cannot squeeze more patients out of sites through training and process improvements. It is widely believed in the clinical research industry that sites are low-efficiency bottlenecks in the enrollment process. If we could just "fix" them, the thinking goes – streamline their workflow, improve their motivation – we could quickly improve the speed at which our trials complete. The data from the NCCTBC paints an entirely different picture, though. It shows us that even when we pour large amounts of time and effort into a tailored program of "evidence and practice-based changes", our enrollment ROI may be nonexistent.

I applaud the ENACCT team for this pilot, and especially for sharing the full monthly enrollment totals at each site. This data should cause clinical development teams everywhere to pause and reassess their beliefs about site enrollment performance and how to improve it.

Friday, November 16, 2012

The Accuracy of Patient Reported Diagnoses


Novelist Philip Roth recently got embroiled in a small spat with the editors of Wikipedia regarding the background inspiration for one of his books.  After a colleague attempted to correct the entry for The Human Stain on Roth's behalf, he received the following reply from a Wikipedia editor:
I understand your point that the author is the greatest authority on their own work, but we require secondary sources.
Report: 0% of decapitees could
accurately recall their diagnosis
The editor's response, as exasperating as it was to Roth, parallels the prevailing beliefs in clinical research about the value and reliability of Patient Reported Outcomes (PROs). On the one hand, who knows the patient better than the patient? On the other hand, our SOPs require expert physician assessment and diagnosis -- we, too, usually require secondary sources.

While recent FDA guidance has helped to solidify our approaches to incorporating PROs into traditionally-structured clinical trials, there are still a number of open questions about how far we can go with relying exclusively on what patients tell us about their medical conditions.  These questions come to the forefront when we consider the potential of "direct to patient" clinical trials, such as the recently-discontinued REMOTE trial from Pfizer, a pilot study that attempted to assess the feasibility of conducting a clinical trial without the use of local physician investigators.

Among other questions, the REMOTE trial forces us to ask: without physician assessment, how do we know the patients we recruit even have the condition being studied? And if we need more detailed medical data, how easy will it be to obtain from their regular physicians? Unfortunately, that study ended due to lack of enrollment, and Pfizer has not been particularly communicative about any lessons learned.

 Luckily for the rest of us, at least one CRO, Quintiles, is taking steps to methodically address and provide data for some of these questions.  They are moving forward with what appears to be a small series of studies that assess the feasibility and accuracy of information collected in the direct-to-patient arena. Their first step is a small pilot study of 50 patients with self-reported gout, conducted by both Quintiles and Outcomes Health Information Services.  The two companies have jointly published their data in the open-access Journal of Medical Internet Research.

(Before getting into the article's content, let me just emphatically state: kudos to the Quintiles and Outcomes teams for submitting their work to peer review, and to publication in an open access journal. Our industry needs much, much more of this kind of collaboration and commitment to transparency.)

The study itself is fairly straightforward: 50 patients were enrolled (out of 1250 US patients who were already in a Quintiles patient database with self-reported gout) and asked to complete an online questionnaire as well as permit access to their medical records.

The twin goals of the study were to assess the feasibility of collecting the patients' existing medical records and to determine the accuracy of the patients' self-reported diagnosis of gout.

To obtain patients' medical records, the study team used a belt-and-suspenders approach: first, the patients provided an electronic release along with their physicians' contact information. Then, a paper release form was also mailed to the patients, to be used as backup if the electronic release was insufficient.

To me, the results from the attempt at obtaining the medical records are actually the most interesting part of the study, since this is going to be an issue in pretty much every DTP trial that's attempted. Although the numbers are obviously quite small, the results are at least mildly encouraging:

  • 38 Charts Received
    • 28 required electronic release only
    • 10 required paper release
  • 12 Charts Not Received
    • 8 no chart mailed in time
    • 2 physician required paper release, patient did not provide
    • 2 physician refused

If the electronic release had been used on its own, 28 charts (56%) would have been available. Adding the suspenders of a follow-up paper form increased the total to a respectable 76%. The authors do not mention how aggressively they pursued obtaining the records from physicians, nor how long they waited before giving up, so it's difficult to determine how many of the 8 charts that went past the deadline could also potentially have been recovered.

Of the 38 charts received, 35 (92%) had direct confirmation of a gout diagnosis and 2 had indirect confirmation (a reference to gout medication).  Only 1 chart had no evidence for or against a diagnosis. So it is fair to conclude that these patients were highly reliable, at least insofar as their report of receiving a prior diagnosis of gout was concerned.
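
For anyone who wants to check the arithmetic, the percentages above can be reproduced from the counts reported in the post. This is only a sketch: the variable names are my own, and rounding to whole percentages is an assumption about how the figures were presented.

```python
ENROLLED = 50

# Counts as listed above.
charts_received = {"electronic_release_only": 28, "paper_release_needed": 10}
charts_not_received = {
    "not_mailed_in_time": 8,
    "paper_release_not_provided": 2,
    "physician_refused": 2,
}
confirmation = {"direct": 35, "indirect": 2, "no_evidence": 1}

def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

received = sum(charts_received.values())

# Sanity checks: the categories should add up to the totals in the text.
assert received + sum(charts_not_received.values()) == ENROLLED
assert sum(confirmation.values()) == received

electronic_only_rate = pct(charts_received["electronic_release_only"], ENROLLED)
overall_retrieval_rate = pct(received, ENROLLED)
direct_confirmation_rate = pct(confirmation["direct"], received)
any_confirmation_rate = pct(confirmation["direct"] + confirmation["indirect"], received)

print(electronic_only_rate)      # 56
print(overall_retrieval_rate)    # 76
print(direct_confirmation_rate)  # 92
print(any_confirmation_rate)     # 97
```

Note that the confirmation rates use the 38 received charts as the denominator, not the 50 enrolled patients, so they say nothing about the 12 patients whose charts never arrived.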

In some ways, though, this represents a pretty optimistic case. Most of these patients had been living with gout for many years, and "gout" is a relatively easy thing to remember.  Patients were not asked questions about the type of gout they had or any other details that might have been checked against their records.

The authors note that they "believe [this] to be the first direct-to-patient research study involving collection of patient-reported outcomes data and clinical information extracted from patient medical records." However, I think it's worth bringing up a comparison with this study, published almost 20 years ago in the Annals of the Rheumatic Diseases.  In that (pre-internet) study, researchers mailed a survey to 472 patients who had visited a rheumatology clinic 6 months previously. They were therefore able to match all of the survey responses with an existing medical record, and compare the patients' self-reported diagnoses in much the same way as the current study.  Studying a more complex set of diseases (arthritis), the 1995 paper paints a more complex picture: patient accuracy varied considerably by disease, from very accurate (100% for those suffering from ankylosing spondylitis, 90% for rheumatoid arthritis) to not very accurate at all (about 50% for psoriatic arthritis and osteoarthritis).

Interestingly, the Quintiles/Outcomes paper references a larger ongoing study in rheumatoid arthritis as well, which may introduce some of the complexity seen in the 1995 research.

Overall, I think this pilot does exactly what it set out to do: it gives us a sense of how patients and physicians will react to this type of research, and helps us better refine approaches for larger-scale investigations. I look forward to hearing more from this team.


Cascade, E., Marr, P., Winslow, M., Burgess, A., & Nixon, M. (2012). Conducting Research on the Internet: Medical Record Data Integration with Patient-Reported Outcomes. Journal of Medical Internet Research, 14 (5). DOI: 10.2196/jmir.2202



Also cited: I Rasooly, et al., Comparison of clinical and self reported diagnosis for rheumatology outpatients, Annals of the Rheumatic Diseases 1995 DOI:10.1136/ard.54.10.850

Image courtesy Flickr user stevekwandotcom.

Friday, October 12, 2012

The "Scandal" of "Untested" Generics


I am in the process of writing up a review of this rather terrible Forbes piece on the FDA recall of one manufacturer's version of generic 300 mg bupropion XL. However, that's going to take a while, so I thought I'd quickly cover just one of the points brought up there, since it seems to be causing a lot of confusion.

Forbes is shocked, SHOCKED to learn that things
 are happening the same way they always have:
call Congress at once!
The FDA’s review of the recall notes that when the generic was approved, only the 150 mg version was tested for bioequivalence in humans. The 300 mg version was approved based upon the 150 mg data as well as detailed information about the manufacturing and composition of both versions.

A number of people expressed surprise about this – they seemed to genuinely not be aware that a drug approval could happen in this way. The Forbes article stated that this was entirely inappropriate and worthy of Congressional investigation.

In fact, many strengths of generic drugs do not undergo in vivo bioequivalence and bioavailability testing as part of their review and approval. This is true in both the US and Europe. Here is a brief rundown of when and why such testing is waived, and why such waivers are neither new, nor shocking, nor unethical.

Title 21, Part 320 of the US Code of Federal Regulations is the regulatory foundation regarding bioequivalence testing in drugs.  Section 22 deals specifically with conditions where human testing should be waived. It is important to note that these regulations aren't new, and the laws that they're based on aren't new either (in fact, the federal law is 20 years old, and was last updated 10 years ago).

By far the most common waiver is for lower dosage strengths. When a drug exists in many approved dosages, generally the highest dose is subject to human bioequivalence testing and the lower doses are approved based on the high-dose results supplemented by in vitro testing.

However, when higher doses carry risks of toxicity, the situation can be reversed, out of ethical concerns for the welfare of test subjects. So, for example, current FDA guidance for amiodarone – a powerful antiarrhythmic drug with lots of side effects – is that the maximum “safe” dose of 200 mg should be tested in humans, and that 100 mg, 300 mg, and 400 mg dosage formulations will be approved if the manufacturer also establishes “acceptable in-vitro dissolution testing of all strengths, and … proportional similarity of the formulations across all strengths”.

That last part is critically important: the generic manufacturer must submit additional evidence about how the doses work in vitro, as well as keep the proportions of inactive ingredients constant. It is this combination of in vivo bioequivalence, in vitro testing, and manufacturing controls that supports a sound scientific decision to approve the generic at various doses.

In fact, certain drugs are so toxic – most chemotherapies, for example – that performing a bioequivalence test in healthy humans is patently unethical. In many of those cases, generic approval is granted on the basis of formulation chemistry alone. For example, generic paclitaxel is waived from human testing (here is a waiver from 2001 – again demonstrating that there’s nothing terribly shocking or new about this process).

In the case of bupropion, FDA had significant concerns about the risk of seizures at the 300 mg dose level. Similar to the amiodarone example above, they issued guidance providing for a waiver of the higher dosage, but only based upon the combination of in vivo data from the 150 mg dose, in vitro testing, and manufacturing controls.

You may not agree with the current system, and there may be room for improvement, but you cannot claim that it is new, unusual, or requiring congressional inquiry. It’s based on federal law, with significant scientific and ethical underpinnings.

Further reading: FDA Guidance for Industry: Bioavailability and Bioequivalence Studies for Orally Administered Drug Products — General Considerations

Thursday, October 11, 2012

TransCelerate and CDISC: The Relationship Explained


Updating my post from last month about the launch announcement for TransCelerate BioPharma, a nonprofit entity funded by 10 large pharmaceutical companies to “bring new medicines to patients faster”: one of the areas I had some concern about was in the new company's move into the “development of clinical data standards”.

How about we transcelerate
this website a bit?
Some much-needed clarification has come by way of Wayne Kubick, the CTO of CDISC. In an article in Applied Clinical Trials, he lays out the relationship in a bit more detail:
TransCelerate has been working closely with CDISC for several months to see how they can help us move more quickly in the development of therapeutic area data standards.  Specifically, they are working to provide CDISC with knowledgeable staff to help us plan for and develop data standards for more than 55 therapeutic areas over the next five years.
And then again:
But the important thing to realize is that TransCelerate intends to help CDISC achieve its mission to develop therapeutic area data standards more rapidly by giving us greater access to skilled volunteers to contribute to standards development projects.   
So we have clarification on at least one point: TransCelerate will donate some level of additional skilled manpower to CDISC-led initiatives.

That’s a good thing, I assume. Kubick doesn't mention it, but I would venture to guess that “more skilled volunteers” is at or near the top of CDISC's wish list.

But it raises the question: why TransCelerate? Couldn't the 10 member companies have contributed this employee time already? Did we really need a new entity to organize a group of fresh volunteers? And if we did somehow need a coordinating entity to make this happen, why not use an existing group – one with, say, a broader level of support across the industry, such as PhRMA?

The promise of a group like TransCelerate is intriguing. The executional challenges, however, are enormous: I think it will be under constant pressure to move away from meaningful but very difficult work towards supporting more symbolic and easy victories.

Tuesday, October 2, 2012

Decluttering the Dashboard


It’s Gresham’s Law for clinical trial metrics: Bad data drives out good. Here are 4 steps you can take to fix it.

Many years ago, when I was working in the world of technology startups, one “serial entrepreneur” told me about a technique he had used when raising investor capital for his new database firm:  since his company marketed itself as having cutting-edge computing algorithms, he went out and purchased a bunch of small, flashing LED lights and simply glued them onto the company’s servers.  When the venture capital folks came out for due diligence meetings, they were provided a dramatic view into the darkened server room, brilliantly lit up by the servers’ energetic flashing. It was the highlight of the visit, and got everyone’s investment enthusiasm ratcheted up a notch.

The clinical trials dashboard is a candy store: bright, vivid,
attractive ... and devoid of nutritional value.
I was reminded of that story at a recent industry conference, when I naively walked into a seminar on “advanced analytics” only to find I was being treated to an extended product demo. In this case, a representative from one of the large CROs was showing off the dashboard for their clinical trials study management system.

And an impressive system it was, chock full of bubble charts and histograms and sliders.  For a moment, I felt like a kid in a candy store.  So much great stuff ... how to choose?

Then the presenter told a story: on a recent trial, a data manager in Italy, reviewing the analytics dashboard, alerted the study team to the fact that there was an enrollment imbalance in Japan, with one site enrolling all of the patients in that country.  This was presented as a success story for the system: it linked up disparate teams across the globe to improve study quality.

But to me, this was a small horror story: the dashboard had gotten so cluttered that key performance issues were being completely missed by the core operations team. The fact that a distant data manager had caught the issue was a lucky break, certainly, but one that should have set off alarm bells about how important signals were being overwhelmed by the noise of charts and dials and “advanced visualizations”.

Swamped with high-precision trivia
I do not need to single out any one system or vendor here: this is a pervasive problem. In our rush to provide “robust analytic solutions”, our industry has massively overengineered its reporting interfaces. Every dashboard I've had a chance to review – and I've seen a lot of them – contains numerous instances of vividly-colored charts crowding out one another, with minimal sense of differentiating the significant from the tangential.

It’s Gresham’s Law for clinical trial metrics: Bad data drives out good. Bad data – samples sliced so thin they’ve lost significance, histograms of marginal utility made “interesting” (and nearly unreadable) by 3-D rendering, performance grades that have never been properly validated. Bad data is plentiful and much, much easier to obtain than good data.

So what can we do? Here are 4 initial steps to decluttering the dashboard:

1. Abandon “Actionable Analytics”
Everybody today sells their analytics as “actionable” [including, to be fair, even one company’s website that the author himself may be guilty of drafting]. The problem though is that any piece of data – no matter how tenuous and insubstantial -- can be made actionable. We can always think of some situation where an action might be influenced by it, so we decide to keep it. As a result, we end up swamped with high-precision trivia (Dr. Smith is enrolling at the 82nd percentile among UK sites!) that do not influence important decisions but compete for our attention. We need to stop reporting data simply because it’s there and we can report it.

2. Identify Key Decisions First
 The above process (which seems pretty standard nowadays) is backwards. We look at the data we have, and ask ourselves whether it’s useful. Instead, we need to follow a more disciplined process of first asking ourselves what decisions we need to make, and when we need to make them. For example:

  • When is the earliest we will consider deactivating a site due to non-enrollment?
  • On what schedule, and for which reasons, will senior management contact individual sites?
  • At what threshold will imbalances in safety data trigger more thorough investigation?

Every trial will have different answers to these questions. Therefore, the data collected and displayed will also need to be different. It is important to invest time and effort to identify critical benchmarks and decision points, specific to the needs of the study at hand, before building out the dashboard.
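
One way to picture "decisions first" is as a small sketch in which a metric earns a place on the dashboard only if it is tied to a decision the team has agreed to make. Everything below is a hypothetical illustration: the metric names, the trigger wording, and the thresholds are invented for the example and would differ for every study.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Decision:
    question: str  # the decision the team has committed to making
    metric: str    # the one metric that informs it
    trigger: str   # threshold agreed on before the trial starts

# Hypothetical decisions for one imaginary study.
DECISIONS = [
    Decision("Deactivate a non-enrolling site?",
             "weeks_since_last_enrollment", "16 weeks or more"),
    Decision("Have senior management contact the site?",
             "site_enrollment_vs_plan", "below 50% of plan"),
    Decision("Investigate a safety imbalance?",
             "sae_rate_by_country", "twice the study-wide rate"),
]

# Everything the reporting system is able to produce (also hypothetical).
AVAILABLE_METRICS = [
    "site_percentile_rank_uk",
    "weeks_since_last_enrollment",
    "site_enrollment_vs_plan",
    "screen_failure_histogram_3d",
    "sae_rate_by_country",
]

def select_dashboard_metrics(decisions, available):
    """Keep only the metrics that feed a named decision, in their available order."""
    needed = {d.metric for d in decisions}
    missing = needed - set(available)
    if missing:
        raise ValueError(f"No data source for: {sorted(missing)}")
    return [m for m in available if m in needed]

dashboard = select_dashboard_metrics(DECISIONS, AVAILABLE_METRICS)
print(dashboard)
```

The point of the design is the direction of the lookup: it starts from the decision list and pulls in data, so a metric like the UK percentile rank never reaches the front page unless someone can name the decision it supports.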

3. Recognize and Respect Context
As some of the questions above make clear, many important decisions are time-dependent.  Often, determining when you need to know something is every bit as important as determining what you want to know. Too many dashboards keep data permanently anchored over the course of the entire trial even though it's only useful during a certain window. For example, a chart showing site activation progress compared to benchmarks should no longer be competing for attention on the front of a dashboard after all sites are up and running – it will still be important information for the next trial, but for managing this trial now, it should no longer be something the entire team reviews regularly.

In addition to changing over time, dashboards should be thoughtfully tailored to major audiences.  If the protocol manager, medical monitor, CRAs, data managers, and senior executives are all looking at the same dashboard, then it’s a dead certainty that many users are viewing information that is not critical to their job function. While it isn't always necessary to develop a unique topline view for every user, it is worthwhile to identify the 3 or 4 major user types, and provide them with their own dashboards (so the person responsible for tracking enrollment in Japan is in a position to immediately see an imbalance).

4. Give your Data Depth
Many people – myself included – are reluctant to part with any data. We want more information about study performance, not less. While this isn't a bad thing to want, it does contribute to the tendency to cram as much as possible into the dashboard.

The solution is not to get rid of useful data, but to bury it. Many reporting systems have the ability to drill down into multiple layers of information: this capability should be thoughtfully (but aggressively!) used to deprioritize all of your useful-but-not-critical data, moving it off the dashboard and into secondary pages.
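
Steps 3 and 4 can be sketched together: each panel knows who it is for, when it matters, and whether it is critical, and anything that fails those tests is demoted to a drill-down page rather than deleted. This is a minimal illustration under my own assumptions; the panel names, audiences, and week ranges are hypothetical and not taken from any real system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Panel:
    name: str
    audiences: frozenset  # user types this panel is relevant to
    first_week: int       # start of the window in which it drives decisions
    last_week: int        # end of that window
    critical: bool        # critical versus merely useful

# Hypothetical panels for an imaginary two-year trial.
PANELS = [
    Panel("site_activation_vs_benchmark", frozenset({"operations"}), 0, 20, True),
    Panel("enrollment_by_country", frozenset({"operations", "executive"}), 0, 104, True),
    Panel("query_aging", frozenset({"data_management"}), 0, 104, True),
    Panel("site_percentile_rank", frozenset({"operations"}), 0, 104, False),
]

def layout_for(panels, audience, week):
    """Return (front_page, drill_down) panel names for one audience at one point in time."""
    front_page, drill_down = [], []
    for panel in panels:
        if audience not in panel.audiences:
            continue  # not this audience's job: leave it off their view entirely
        in_window = panel.first_week <= week <= panel.last_week
        if panel.critical and in_window:
            front_page.append(panel.name)
        else:
            drill_down.append(panel.name)  # buried, not thrown away
    return front_page, drill_down

early = layout_for(PANELS, "operations", week=10)
later = layout_for(PANELS, "operations", week=30)
print(early)
print(later)
```

In this sketch the site activation chart sits on the operations front page at week 10 and has moved to the drill-down list by week 30, while the country-level enrollment view stays in front of the operations team for the whole trial.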

Bottom Line
The good news is that access to operational data is becoming easier to aggregate and monitor every day. The bad news is that our current systems are not designed to handle the flood of new information, and instead have become choked with visually-appealing-but-insubstantial chart candy. If we want to have any hope of getting a decent return on our investment from these systems, we need to take a couple steps back and determine: what's our operational strategy, and who needs what data, when, in order to successfully execute against it?


[Photo credit: candy store from Flickr user msgolightly.]

Tuesday, September 25, 2012

What We Can Anticipate from TransCelerate


TransCelerate: Pharma's great kumbaya moment?
Last week, 10 of the largest pharmaceutical companies caused quite a hullaballoo in the research world with their announcement that they were anteing up to form a new nonprofit entity “to identify and solve common drug development challenges with the end goals of improving the quality of clinical studies and bringing new medicines to patients faster”. The somewhat-awkwardly-named TransCelerate BioPharma immediately got an enthusiastic reception from industry watchers and participants, mainly due to the perception that it was well poised to attack some of the systemic causes of delays and cost overruns that plague clinical trials today.

I myself was caught up in the breathless excitement of the moment, immediately tweeting after reading the initial report:

 Over the past few days, though, I've had time to re-read and think more about the launch announcement, and dial down my enthusiasm considerably.  I still think it’s a worthwhile effort, but it’s probably not fair to expect anything that fundamentally changes much in the way of current trial execution.

Mostly, I’m surprised by the specific goals selected, which seem for the most part either tangential to the real issues in modern drug development or stepping into areas where an all-big-pharma committee isn’t the best tool for the job. I’m also very concerned that a consortium like this would launch without a clearly-articulated vision of how it fits in with, and adds to, the ongoing work of other key players – the press release is loaded with positive, but extremely vague, wording about how TransCelerate will work with, but be different from, groups such as the CTTI and CDISC. The new organization also appears to have no formal relationship with any CRO organizations.  Given the crucial and deeply embedded nature of CROs in today’s research, this is not a detail to be worked out later; it is a vital necessity if any worthwhile progress is to be made.

Regarding the group’s goals, here is what their PR had to say:
Five projects have been selected by the group for funding and development, including: development of a shared user interface for investigator site portals, mutual recognition of study site qualification and training, development of risk-based site monitoring approach and standards, development of clinical data standards, and establishment of a comparator drug supply model.
Let’s take these five projects one by one, to try to get a better picture of TransCelerate’s potential impact:

1. Development of a shared user interface for investigator site portals

Depending on how it’s implemented, the impact of this could range from “mildly useful” to “mildly irksome”. Sure, I hear investigators and coordinators complain frequently about all the different accounts they have to keep track of, so having a single front door to multiple sponsor sites would be a relief. However, I don’t think that the problem of too many usernames cracks anyone’s “top 20 things wrong with clinical trial execution” list – it’s a trivial detail. Aggravating, but trivial.

Worse, if you do it wrong and develop a clunky interface, you’ll get a lot more grumbling about making life harder at the research site. And I think there’s a high risk of that, given that this is in effect software development by committee – and the committee is a bunch of companies that do not actually specialize in software development.

In reality, the best answer to this is probably a lot simpler than we imagine: if we had a neutral, independent body (such as the ACRP) set up a single sign-on (SSO) registry for investigators and coordinators, then all sponsors, CROs, and IVRS/IWRS/CDMS can simply set themselves up as service providers. (This works in the same way that many people today can log into disparate websites using their existing Google or Facebook accounts.)  TransCelerate might do better sponsoring and promoting an external standard than trying to develop an entirely new platform of its own.

2. Mutual recognition of study site qualification and training

This is an excellent step forward. It’s also squarely in the realm of “ideas so obvious we could have done them 10 years ago”. Forcing site personnel to attend multiple iterations of the same training seminars simply to ensure that you’ve collected enough binders full of completion certificates is a sad CYA exercise with no practical benefit to anyone.

This will hopefully re-establish some goodwill with investigators. However, it’s important to note that it’s pretty much a symbolic act in terms of efficiency and cost savings. Nothing wrong with that – heaven knows we need some relationship wins with our increasingly-disillusioned sites – but let’s not go crazy thinking that this represents a real cause of wasted time or money. In fact, it’s pretty clear that one of the reasons we’ve lived with the current site-unfriendly system for so long is that it didn’t really cost us anything to do so.

(It’s also worth pointing out that more than a few biotechs have already figured out, usually with CRO help, how to ensure that site personnel are properly trained and qualified without subjecting them to additional rounds of training.)

3. Development of risk-based site monitoring approach and standards

The consensus belief and hope is that risk-based monitoring is the future of clinical trials. Ever since FDA’s draft guidance on the topic hit the street last year, it’s been front and center at every industry event. It will, unquestionably, lead to cost savings (although some of those savings will hopefully be reinvested into more extensive centralized monitoring).  It will not necessarily shave a significant amount of time off the trials, since in many trials getting monitors out to sites to do SDV is not a rate-limiting factor, but it should still at the very least result in better data at lower cost, and that’s clearly a good thing.

So, the big question for me is: if we’re all moving in this direction already, do we need a new, pharma-only consortium to develop an “approach” to risk-based monitoring?

 First and foremost, this is a senseless conversation to have without the active involvement and leadership of CROs: in many cases, they understand the front-line issues in data verification and management far better than their pharma clients.  The fact that TransCelerate launched without a clear relationship with CROs and database management vendors is a troubling sign that it isn’t poised to make a true contribution to this area.

In a worst-case scenario, TransCelerate may actually delay adoption of risk-based monitoring among its member companies, as they may decide to hold off on implementation until standards have been drafted, circulated, vetted, re-drafted, and (presumably, eventually) approved by all 10 companies. And it will probably turn out that the approaches used will need to vary by patient risk and therapeutic area anyway, making a common, generic approach less than useful.

Finally, the notion that monitoring approaches require some kind of industry-wide “standardization” is extremely debatable. Normally, we work to standardize processes when we run into a lot of practical interoperability issues – that’s why we all have the same electric outlets in our homes, but not necessarily the same AC adaptors for our small devices. It would be nice if all cell phone manufacturers could agree on a common standard plug, but the total savings from that standard would be small compared to the costs of defining and implementing it. It’s the same with monitoring: each sponsor and each CRO has a slightly different flavor of monitoring, but the costs of adapting to any one approach for any given trial are really quite small.

Risk-based monitoring is great. If TransCelerate gets some of the credit for its eventual adoption, that’s fine, but I think the adoption is happening anyway, and TransCelerate may not be much help in reality.

4. Development of clinical data standards

This is by far the most baffling inclusion in this list. What happened to CDISC? What is CDISC not doing right that TransCelerate could possibly improve?

In an interview with Matthew Herper at Forbes, TransCelerate’s Interim CEO expands a bit on this point:
“Why do some [companies] record that male is a 0 and female is a 1, and others use 1 and 0, and others use M and F. Where is there any competitive advantage to doing that?” says Neil. “We do 38% of the clinical trials but 70% of the [spending on them]. If we were to come together and try to define some of these standards it would be an enabler for efficiencies for everyone.”
It’s really worth noting that the first part of that quote has nothing to do with the second part. If I could wave a magic wand and instantly standardize all companies’ gender reporting, I would not have reduced clinical trial expenditures by 0.01%. Even if we extend this to lots of other data elements, we’re still not talking about a significant source of costs or time.

Here’s another way of looking at it: those companies that are conducting the other 62% of trials but are only responsible for 30% of the spending – how did they do it, since they certainly haven’t gotten together to agree on a standard format for gender coding?
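To put a rough number on that gap, here is a minimal sketch that uses only the two percentages from the quote. It assumes, for illustration, that "38% of trials" and "70% of spending" describe the same set of companies and that everyone else accounts for the remainder; the variable names are mine, not Neil's.

```python
# Shares taken from the quote above; everything else is derived from them.
member_trials, member_spend = 0.38, 0.70
other_trials, other_spend = 1 - member_trials, 1 - member_spend

# Spending share divided by trial share: 1.0 would mean "average cost per trial".
member_index = member_spend / member_trials
other_index = other_spend / other_trials

# How many times more the first group spends per trial than the second.
ratio = member_index / other_index
print(f"member index: {member_index:.2f}")
print(f"other index:  {other_index:.2f}")
print(f"ratio:        {ratio:.1f}x")
```

Under those assumptions the first group spends a bit under four times as much per trial as everyone else, which is a far larger gap than any data-coding convention could plausibly explain.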

But the main problem here is that TransCelerate is encroaching on the work of a respected, popular, and useful initiative – CDISC – without clearly explaining how it will complement and assist that initiative. Neil’s quote almost seems to suggest that he plans on supplanting CDISC altogether.  I don’t think that was the intent, but there’s no rational reason to expect TransCelerate to offer substantive improvement in this area, either.

5. Establishment of a comparator drug supply model

This is an area that I don’t have much direct experience in, so it’s difficult to estimate what impact TransCelerate will have. I can say, anecdotally, that over the past 10 years, exactly zero clinical trials I’ve been involved with have had significant issues with comparator drug supply. But, admittedly, that’s quite possibly a very unrepresentative sample of pharmaceutical clinical trials.

I would certainly be curious to hear some opinions about this project. I assume it’s a somewhat larger problem in Europe than in the US, given both their multiple jurisdictions and their stronger aversion to placebo control. I really can’t imagine that inefficiencies in acquiring comparator drugs (most of which are generic, and so not directly produced by TransCelerate’s members) represent a major opportunity to save time and money.

Conclusion

It’s important to note that everything above is based on very limited information at this point. The transcelerate.com website is still “under construction”, so I am only reacting to the press release and accompanying quotes. However, it is difficult to imagine at this point that TransCelerate’s current agenda will have more than an extremely modest impact on current clinical trials.  At best, it appears that it may identify some areas to cut some costs, though this is mostly through the adoption of risk-based monitoring, which should happen whether TransCelerate exists or not.

I’ll remain a fan of TransCelerate, and will follow its progress with great interest in the hopes that it outperforms my expectations. However, it would do us all well to recognize that TransCelerate probably isn’t going to change things very dramatically -- the many systemic problems that add to the time and cost of clinical trials today will still be with us, and we need to continue to work hard to find better paths forward.

[Update 10-Oct-2012: Wayne Kubick, the CTO of CDISC, has posted a response with some additional details around cooperation between TransCelerate and CDISC around point 4 above.]

Mayday! Mayday! Photo credit: "Wheatley Maypole Dance 2008" from Flickr user net_efekt.

Friday, September 21, 2012

Trials in Alzheimer's Disease: The Long Road Ahead

Placebo Control is going purple today in support of Alzheimer’s Action Day.

A couple of clinical trial related thoughts on the ongoing struggle to find even one effective therapy (currently-approved drugs show some ability to slow the progression of AD, but not to effectively stop, much less reverse it):
  • The headlines so far this year have been dominated by the high-profile and incredibly expensive failures of bapineuzumab and solanezumab. However, these two are just the most recent of a long series of failures: a recent industry report tallies 101 investigational drugs that have failed clinical trials or been suspended in development since 1998, against only 3 successes, an astonishing and painful 34:1 failure rate.

  • While we are big fans of the Alzheimer’s Association (just down the street from Placebo HQ here in Chicago) and the Alzheimer’s Foundation of America, it’s important to stress that the single most important contribution that patients and caregivers can make is to get involved in a clinical trial. That same report lists 93 new treatments currently being evaluated. As of today, the US clinical trials registry lists 124 open trials for AD. Many of these studies only require a few hundred participants, so each individual decision to enroll is important and immediately visible.

  • While all research is important, I want to single out the phenomenal work being done by ADNI, the Alzheimer’s Disease Neuroimaging Initiative. This is a public/private partnership that is collecting a vast amount of data – blood, cerebrospinal fluid, MRIs, and PET scans – on hundreds of AD patients and matched controls. Best of all, all of the data collected is published in a free, public database hosted by UCLA. Additional funding has recently led to the development of the ADNI-2 study, which will enroll 550 more participants.
Without a doubt, finding and testing effective medications for Alzheimer's Disease is going to take many more years of hard, frustrating work. It will be a path littered with many more failures and therapeutic dead-ends. Today's a good day to stop and recognize that fact, and strengthen our resolve to work together to end this disease.

Tuesday, September 18, 2012

Delivering the Placebic Payload


Two recent articles on placebo effects caught my attention. Although they come to the topic from very different angles, they both bear on the psychological mechanisms by which the placebo effect delivers its therapeutic payload, so it seems worthwhile to look at them together.
Placebo delivery: there's got to be a better way!

The first item is a write-up of two small studies, Nonconscious activation of placebo and nocebo pain responses. (The article is behind a paywall at PNAS: if you can’t access it you can read this nice synopsis on Inkfish, or the press release issued by Beth Israel Deaconess (which includes bonus overhyping of the study’s impact by the authors).)

The studies’ premises were pretty straightforward: placebo effects are (at least in part) caused by conditioned responses. In addition, psychologists have demonstrated in a number of studies that many types of conditioned responses can be triggered subliminally.  Therefore, it might be possible, under certain circumstances, to elicit placebo/nocebo responses with nothing but subliminal stimuli.

And that, in effect, is what the studies demonstrate.  The first showed a placebo effect in patients who had been trained to associate various pain levels with pictures of specific faces. The second study elicited a (somewhat attenuated) placebo response even when those pictures were shown for a mere 12 milliseconds – below the threshold of conscious recognition. This gives us some preliminary evidence that placebo effects can be triggered through entirely subconscious mental processes.

Or does it? There seem to me to be some serious difficulties in making the leap from this highly-controlled lab experiment to the actual workings of placebos in clinical practice. First and foremost: to elicit subconscious effects, these experiments had to first provide quite a significant “pretreatment” of conscious, unambiguous conditioning to associate certain pain levels with specific images: 50 pain jolts in about 15 minutes. Even then, the experimenters still felt the need to re-apply the explicit conditioning in 10% of the test cases, “to prevent extinction”. This raises the obvious question: if even an intensive, explicit conditioning sequence can wear off that quickly, how are we to believe that a similar mechanism is acting in everyday clinical encounters, which are not so frequent and so explicit? The authors don’t seem to see an issue here, as they write:
Our results thereby translate the investigation of nonconscious effects to the clinical realm, by suggesting that health-related responses can be triggered by cues that are not consciously perceived, not only for pain … but also for other medical problems with demonstrated placebo effects, e.g., asthma, depression, and irritable bowel syndrome. Understanding the role of nonconscious processes in placebo/nocebo opens unique possibilities of enhancing clinical care by attending to the impact of nonconscious cues conveyed during the therapeutic encounter and improving therapeutic decisions.
So, the clinical relevance of these findings depends on how much you believe that precisely repeated blasts of pain faithfully replicate the effects of physician/patient interactions. I do not think I am being terribly skeptical when I say that I think clinical interactions are usually shorter and involve a lot more ambiguity – I am not even sure that this is a good model for placebo analgesia, and it certainly can’t be considered to have a lot of explanatory power for placebo effects in, e.g., depression trials.

…Which brings me to the second article, a very different creature altogether.  It’s a blog post by Dike Drummond entitled Can digital medicine have a placebo effect? He actually comes very close to the study authors’ position in terms of ascribing placebo effects to subconscious processes:
The healing can occur without outside assistance — as the placebo effect in drug studies shows — or it can augment whatever medication or procedure you might also prescribe.  I believe it is the human qualities of attention and caring that trigger the placebo effect. These exist parallel to the provider’s ability to diagnose and select an appropriate medical treatment.
You can arrive at the correct diagnosis and treatment and not trigger a placebo effect. You can fail to make eye contact, write out a prescription, hand it to the patient and walk out the door.  Right answer — no placebo effect.  Your skills as a placebologist rely on the ability to create the expectation of healing in the patient. This is most definitely part of the art of medicine.
I will disagree a bit with Drummond on one point: if we could extinguish placebo effects merely by avoiding eye contact, or engaging in similar unsociable behavior, then we would see greatly reduced placebo effects in most clinical trials, since most sponsors do try to implement strategies to reduce those effects. In fact, there is some evidence that placebo effects are increasing in some trials. (Which, tangentially, makes me ask why pharmaceutical companies keep paying “expert consultants” to conduct training seminars on how to eliminate placebo effects … but that’s a rant for another day.)

Drummond ponders whether new technologies will be able to elicit placebo responses in patients, even in the complete absence of human-to-human interaction. I think the answer is “probably, somewhat”. We certainly have some evidence that physicians can increase placebo effects through explicit priming; it would seem logical that some of that work could be done by an iPad. Also, the part of the placebo effect that is patient-driven -- fed by their preexisting hopes and expectations – would seem to be transferrable to a non-personal interaction (after all, patients already derive placebic benefit from homeopathic and other ineffective over-the-counter cures with no physician, and minimal human, input).

The bottom line, I think, is this: we oversimplify the situation when we talk about “the” placebo effect. Placebo response in patients is a complex cluster of mechanisms, some or all of which are at play in each individual reaction. On the patient’s side, subconscious hope, conscious expectations, and learned associations are all in play, and may work with or against each other. The physician’s beliefs, transmitted through overt priming or subtle signals, can also work for or against the total placebo effect. There is even good evidence that placebo analgesia is produced through multiple distinct biochemical pathways, so proposing a single simple model to cover all placebo responses will be doomed to failure.

The consequence for clinical trialists? I do not think we need to start fretting over subliminal cues and secret subconscious signaling, but we do need to develop a more comprehensive method of measuring the impact of multiple environmental and patient factors in predicting response. The best way to accomplish this may be to implement prospective studies in parallel with existing treatment trials to get a clearer real-world picture of placebo response in action.

[Image: "Extraction of the Stone of Folly", Hieronymus Bosch, by way of Wikimedia Commons]

ResearchBlogging.org Karin B. Jensen, Ted J. Kaptchuk, Irving Kirsch, Jacqueline Raicek, Kara M. Lindstrom, Chantal Berna, Randy L. Gollub, Martin Ingvar, & Jian Kong (2012). Nonconscious activation of placebo and nocebo pain responses PNAS DOI: 10.1073/pnas.1202056109

Friday, September 14, 2012

Clinical trials: recent reading recommendations

My recommended reading list -- highlights from the past week:


Absolute required reading for anyone who designs protocols or is engaged in recruiting patients into clinical trials: Susan Guber writes eloquently about her experiences as a participant in cancer clinical trials.
New York Times Well Blog: The Trials of Cancer Trials
Today's #FDAFridayPhoto features Harvey Wiley, leader of the famed FDA "Poison Squad".

The popular press in India continues to be disingenuous and exploitative in its coverage of clinical trial deaths in that country. (My previous thoughts on that are here.) Kiran Mazumdar-Shaw, an industry leader, has put together an intelligent and articulate antidote.
The Economic Times: Need a rational view on clinical trials


Rahlen Gossen exhibits mastery of the understatement: “Though the Facebook Insights dashboard is a great place to start, it has a few significant disadvantages.” She also provides a good overview of the most common pitfalls you’ll encounter when you try to get good metrics out of your Facebook campaign. 


I have not had a chance to watch it yet, but I’m excited to see that theHeart.org has just posted a 7-part video editorial series by Yale’s Harlan Krumholz and Stanford’s Bob Harrington on “a frank discussion on the controversies in the world of clinical trials”.

Monday, August 27, 2012

"Guinea Pigs" on CBS is Going to be Super Great, I Can Just Tell


An open letter to Mad Men producer/writer Dahvi Waller

Dear Dahvi,

I just wanted to drop you a quick note of congratulations when I heard through the grapevine that CBS has signed you on to do a pilot episode of your new medical drama, Guinea Pigs (well actually, I heard it from the Hollywood Reporter; the grapevine doesn’t tell me squat). According to the news item,
The drama centers on group of trailblazing doctors who run clinical trials at a hospital in Philadelphia. The twist: The trials are risky, and the guinea pigs are human.
Probably just like this, but with a bigger body count.
(Sidenote: that’s quite the twist there! For a minute, I thought this was going to be the first ever rodent-based prime time series!)

I don’t want to take up too much of your time. I’m sure you’re extremely busy with lots of critical casting decisions, like: will the Evil Big Pharma character be a blonde, beautiful-but-treacherous Ice Queen type in her early 30’s, or an expensively-suited, handsome-but-treacherous Gordon Gekko type in his early 60’s? (My advice: Don’t settle!  Use both! Viewers of all ages can love to hate the pharmaceutical industry!)

About that name, by the way: great choice! I’m really glad you didn’t overthink that one. A good writer should go with her gut and pick the first easy stereotype that pops into her head. (Because the head is never closer to the gut than when it’s jammed firmly up … but I don’t have to explain anatomy to you! You write a medical drama for television!)

I’m sure the couple-three million Americans who enroll in clinical trials each year will totally relate to your calling them guinea pigs. In our industry, we call them heroes, but that’s just corny, right? Real heroes on TV are people with magic powers, not people who contribute to the advancement of medicine.

Anyway, I’m just really excited because our industry is just so, well … boring! We’re so fixated on data collection regulations and safety monitoring and ethics committee reviews and yada yada yada – ugh! Did you know we waste 5 to 10 years on this stuff, painstakingly bringing drugs through multiple graduated phases of testing in order to produce a mountain of data (sometimes running over 100,000 pages long) for the FDA to review?

Dahvi Waller: bringing CSI to clinical research
I’m sure you’ll be giving us the full CSI-meets-Constant-Gardener treatment, though, and it will all seem so incredibly easy that your viewers will wonder what the hell is taking us so long to make these great new medicines. (Good mid-season plot point: we have the cure for most diseases already, but they’ve been suppressed by a massive conspiracy of sleazy corporations, corrupt politicians, and inept bureaucrats!)

Anyway, best of luck to you! I can't wait to see how accurately and respectfully you treat the work of the research biologists and chemists, physician investigators, nurses, study coordinators, monitors, reviewers, auditors, and patient volunteers (sorry, "guinea pigs") who are working hard to ensure the next generation of medicines are safe and effective. What can go wrong? It's television!




Wednesday, August 22, 2012

The Case against Randomized Trials is, Fittingly, Anecdotal


I have a lot of respect for Eric Topol, and am a huge fan of his ongoing work to bring new mobile technology to benefit patients.

The Trial of the Future
However, I am simply baffled by this short video he recently posted on his Medscape blog. In it, he argues against the continued use of randomized controlled trials (RCTs) to provide evidence for or against new drugs.

His argument for this is two anecdotes: one negative, one positive. The negative anecdote is about the recently approved drug for melanoma, Zelboraf:
Well, that's great if one can do [RCTs], but often we're talking about needing thousands, if not tens of thousands, of patients for these types of clinical trials. And things are changing so fast with respect to medicine and, for example, genomically guided interventions that it's going to become increasingly difficult to justify these very large clinical trials. 
For example, there was a drug trial for melanoma and the mutation of BRAF, which is the gene that is found in about 60% of people with malignant melanoma. When that trial was done, there was a placebo control, and there was a big ethical charge asking whether it is justifiable to have a body count. This was a matched drug for the biology underpinning metastatic melanoma, which is essentially a fatal condition within 1 year, and researchers were giving some individuals a placebo.
First and foremost, this is simply factually incorrect on a couple of extremely important points.

  1. Zelboraf was not approved based on any placebo-controlled trials. The phase 1 and phase 2 trials were both single-arm, open label studies. The only phase 3 trial run before FDA approval used dacarbazine in the comparator arm. In fact, of the 34 trials currently listed for Zelboraf on ClinicalTrials.gov, only one has a placebo control: it’s an adjuvant trial for patients whose melanoma has been completely resected, where no treatment may very well be the best option.
  2. The Zelboraf trials are not an example of “needing thousands, if not tens of thousands, of patients” for approval. The phase 3 trial enrolled 675 patients. Even adding the phase 1 and 2 trials doesn’t get us to 1000 patients.

Correcting these details takes a lot away from the power of this single drug to be a good example of why we should stop using “the sanctimonious [sic] randomized, placebo-controlled clinical trial”.

The second anecdote is about a novel Alzheimer’s Disease candidate:
A remarkable example of a trial of the future was announced in May. For this trial, the National Institutes of Health is working with [Banner Alzheimer's Institute] in Arizona, the University of Antioquia in Colombia, and Genentech to have a specific mutation studied in a large extended family living in the country of Colombia in South America. There is a family of 8000 individuals who have the so-called Paisa mutation, a presenilin gene mutation, which results in every member of this family developing dementia in their 40s. 
Researchers will be testing a drug that binds amyloid, a monoclonal antibody, in just 300 family members. They're not following these patients out to the point of where they get dementia. Instead, they are using surrogate markers to see whether or not the process of developing Alzheimer's can be blocked using this drug. This is an exciting way in which we can study treatments that can potentially prevent Alzheimer's in a very well-demarcated, very restricted population with a genetic defect, and then branch out to a much broader population of people who are at risk for Alzheimer's. These are the types of trials of the future. 
There are some additional disturbing factual errors here – the extended family numbers about 5,000, not 8,000. And estimates of the prevalence of the mutation within that family appear to vary from about one-third to one-half, so it’s simply wrong to state that “every member of this family” will develop dementia.

However, those errors are relatively minor, and are completely overshadowed by the massive irony that this is a randomized, placebo-controlled trial. Only 100 of the 300 trial participants will receive the active study drug, crenezumab. The other 200 will be on placebo.

And so, the “trial of the future” held up as a way to get us out of using randomized, placebo-controlled trials is actually a randomized, placebo-controlled trial itself. I hope you can understand why I’m completely baffled that Topol thinks this is evidence of anything.

Finally, I have to ask: how is this the trial of the future, anyway? It is a short-term study on a highly-selected patient population with a specific genetic profile, measuring surrogate markers to provide proof of concept for later, larger studies. Is it just me, or does that sound exactly like the early lovastatin trials of the mid-1980’s, which tested cholesterol reduction in a small population of patients with severe heterozygous familial hypercholesterolemia? Back to the Future, indeed.


[Image: time-travelling supercar courtesy of Flickr user JoshBerglund19.]

Thursday, August 16, 2012

Clinical Trial Alerts: Nuisance or Annoyance?


Will physicians change their answers when tired of alerts?

I am an enormous fan of electronic medical records (EMRs). Or rather, more precisely, I am an enormous fan of what EMRs will someday become – current versions tend to leave a lot to be desired. Reaction to these systems among physicians I’ve spoken with has generally ranged from "annoying" to "*$%#^ annoying", and my experience does not seem to be at all unique.

The (eventual) promise of EMRs in identifying eligible clinical trial participants is twofold:

First, we should be able to query existing patient data to identify a set of patients who closely match the inclusion and exclusion criteria for a given clinical trial. In reality, however, many EMRs are not easy to query, and the data inside them isn’t as well-structured as you might think. (The phenomenon of "shovelware" – masses of paper records scanned and dumped into the system as quickly and cheaply as possible – has been greatly exacerbated by governments providing financial incentives for the immediate adoption of EMRs.)
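As a rough illustration of what such a query looks like when the data is cleanly structured, here is a minimal sketch. The `Patient` fields, the sample records, and the inclusion/exclusion criteria are all hypothetical and are not taken from any real EMR product or protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    # Hypothetical, simplified record; real EMR data is rarely this tidy.
    patient_id: str
    age: int
    diagnoses: set = field(default_factory=set)
    medications: set = field(default_factory=set)


def is_eligible(p: Patient) -> bool:
    # Illustrative inclusion criteria: adult with a stroke or TIA diagnosis.
    included = p.age >= 18 and bool(p.diagnoses & {"stroke", "TIA"})
    # Illustrative exclusion criterion: currently on warfarin.
    excluded = "warfarin" in p.medications
    return included and not excluded


patients = [
    Patient("P001", 67, {"stroke", "hypertension"}, {"aspirin"}),
    Patient("P002", 72, {"TIA"}, {"warfarin"}),
    Patient("P003", 45, {"diabetes"}, set()),
]

eligible = [p.patient_id for p in patients if is_eligible(p)]
print(eligible)
```

The filter itself is trivial. The hard part in practice is getting diagnoses and medications into coded fields at all, and scanned paper records provide none of that.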

Second, we should be able to identify potential patients when they’re physically at the clinic for a visit, which is really the best possible moment. Hence the Clinical Trial Alert (CTA): a pop-up or other notification within the EMR that the patient may be eligible for a trial. The major issue with CTAs is the annoyance factor – physicians tend to feel that they disrupt their natural clinical routine, making each patient visit less efficient. Multiple alerts per patient can be especially frustrating, resulting in "alert overload".

A very intriguing study recently in the Journal of the American Medical Informatics Association looked to measure a related issue: alert fatigue, or the tendency for CTAs to lose their effectiveness over time.  The response rate to the alerts definitely decreased steadily over time, but the authors were mildly optimistic in their assessment, noting that response rate was still respectable after 36 weeks – somewhere around 30%:


However, what really struck me here is that the referral rate – the rate at which the alert was triggered to bring in a research coordinator – dropped much more precipitously than the response rate:


This is remarkable considering that the alert consisted of only two yes/no questions. Answering either question was considered a "response", and answering "yes" to both questions was considered a "referral".

  • Did the patient have a stroke/TIA in the last 6 months?
  • Is the patient willing to undergo further screening with the research coordinator?

The only plausible explanation for referrals to drop faster than responses is that repeated exposure to the CTA led the physicians to more frequently mark the patients as unwilling to participate. (This was not actual patient fatigue: the few patients who were the subject of multiple CTAs had their second alert removed from the analysis.)
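The distinction between the two metrics can be sketched in a few lines. The classification rule follows the definitions above (any answer counts as a response, two "yes" answers count as a referral). The function names and the two sets of alert answers are invented for illustration and are not the study's data.

```python
def classify(had_stroke, willing):
    """Each answer is True, False, or None (question left unanswered)."""
    if had_stroke is None and willing is None:
        return "ignored"
    if had_stroke is True and willing is True:
        return "referral"
    return "response_only"


def rates(alerts):
    """Return (response rate, referral rate) for a list of answer pairs."""
    outcomes = [classify(*a) for a in alerts]
    n = len(outcomes)
    responses = sum(o != "ignored" for o in outcomes)
    referrals = outcomes.count("referral")
    return responses / n, referrals / n


# Made-up answers: the same number of alerts get answered in both periods,
# but later on the second question is always answered "no".
early = [(True, True), (True, True), (True, False), (False, None), (None, None)]
late = [(True, False), (True, False), (True, False), (False, None), (None, None)]

print(rates(early))
print(rates(late))
```

In this toy data the response rate is identical in both periods while the referral rate falls to zero, which is the pattern a compliance metric based on responses alone would miss.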

So, it appears that some physicians remained nominally compliant with the system, but avoided the extra work involved in discussing a clinical trial option by simply marking the patient as uninterested. This has some interesting implications for how we track physician interaction with EMRs and CTAs, as basic compliance metrics may be undermined by users tending towards a path of least resistance.

ResearchBlogging.org Embi PJ, & Leonard AC (2012). Evaluating alert fatigue over time to EHR-based clinical trial alerts: findings from a randomized controlled study. Journal of the American Medical Informatics Association : JAMIA, 19 (e1) PMID: 22534081

Monday, August 13, 2012

Most* Clinical Trials Are Too** Small

* for some value of "most"
** for some value of "too"


[Note: this is a companion to a previous post, Clouding the Debate on Clinical Trials: Pediatric Edition.]

Are many current clinical trials underpowered? That is, will they not enroll enough patients to adequately answer the research question they were designed to answer? Are we wasting time and money – and even worse, the time and effort of researchers and patient-volunteers – by conducting research that is essentially doomed to produce clinically useless results?
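To make "underpowered" concrete, here is a minimal sketch of the standard textbook normal-approximation formula for a two-arm comparison of means. This is a generic calculation, not anything taken from the JAMA study, and the effect sizes are arbitrary examples expressed in standard-deviation units.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate patients per arm to detect a standardized mean difference.

    Uses n = 2 * (z_{1-alpha/2} + z_{power})^2 / effect_size^2,
    a two-sided normal approximation that ignores dropout.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


for d in (1.0, 0.5, 0.2):
    print(d, n_per_arm(d))
```

The required enrollment swings from a few dozen patients to several hundred depending entirely on the size of the effect being sought, so whether a trial is "too small" cannot be read off its headcount alone.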

That is the alarming upshot of the coverage on a recent study published in the Journal of the American Medical Association. This Duke Medicine News article was the most damning in its denunciation of the current state of clinical research:
Duke: Mega-Trial experts concerned that not enough trials are mega-trials
Large-Scale Analysis Finds Majority of Clinical Trials Don't Provide Meaningful Evidence

The largest comprehensive analysis of ClinicalTrials.gov finds that clinical trials are falling short of producing high-quality evidence needed to guide medical decision-making.
The study was also covered in many industry publications, as well as the mainstream news. Those stories were less sweeping in their indictment of the "clinical trial enterprise", but carried the same main theme: that an "analysis" had determined that most current clinical trials were "too small".

I have only one quibble with this coverage: the study in question didn’t demonstrate any of these points. At all.

The study is a simple listing of gross characteristics of interventional trials registered over a 6-year period. It is entirely descriptive, and limits itself entirely to data entered by the trial sponsor as part of the registration on ClinicalTrials.gov. It contains no information on the quality of the trials themselves.

That last part can’t be emphasized enough: the study contains no quality benchmarks. No analysis of trial design. No benchmarking of the completeness or accuracy of the data collected. No assessment of the clinical utility of the evidence produced. Nothing like that at all.

So, the question that nags at me is: how did we get from A to B? How did this mildly-interesting-and-entirely-descriptive data listing transform into a wholesale (and entirely inaccurate) denunciation of clinical research?

For starters, the JAMA authors divide registered trials into 3 enrollment groups: 1-100, 101-1000, and >1000. I suppose this is fine, although it should be noted that it is entirely arbitrary – there is no particular reason to divide things up this way, except perhaps a fondness for neat round numbers.

Trials within the first group are then labeled "small". No effort is made to explain why 100 patients represents a clinically important break point, but the authors feel confident concluding that clinical research is "dominated by small clinical trials", because 62% of registered trials fit into this newly-invented category. From there, all you need is a completely vague yet ominous quote from the lead author. As US News put it:
The new report says 62 percent of the trials from 2007-2010 were small, with 100 or fewer participants. Only 4 percent had more than 1,000 participants.

"There are 330 new clinical trials being registered every week, and a number of them are very small and probably not as high quality as they could be," [lead author Dr Robert] Califf said.
"Probably not as high quality as they could be", while just vague enough to be unfalsifiable, is also not at all a consequence of the data as reported. So, through a chain of arbitrary decisions and innuendo, "less than 100" becomes "small" becomes "too small" becomes "of low quality".

Califf’s institution, Duke, appears to be particularly guilty of driving this evidence-free overinterpretation of the data, as seen in the sensationalistic headline and lede quoted above. However, it’s clear that Califf himself is blurring the distinction between what his study showed and what it didn’t:
"Analysis of the entire portfolio will enable the many entities in the clinical trials enterprise to examine their practices in comparison with others," says Califf. "For example, 96 percent of clinical trials have ≤1000 participants, and 62 percent have ≤ 100. While there are many excellent small clinical trials, these studies will not be able to inform patients, doctors, and consumers about the choices they must make to prevent and treat disease."
Maybe he’s right that these small studies will not be able to inform patients and doctors, but his study has provided absolutely no support for that statement.

When we build a protocol, there are actually only 3 major factors that go into determining how many patients we want to enroll:
  1. How big a difference we estimate the intervention will have compared to a control (the effect size)
  2. How much risk we’ll accept that we’ll get a false-positive (alpha) or false-negative (beta) result
  3. Occasionally, whether we need to add participants to better characterize safety and tolerability (as is frequently, and quite reasonably, requested by FDA and other regulators)
Quantity is not quality: enrolling too many participants in an investigational trial is unethical and a waste of resources. If the numbers determine that we should randomize 80 patients, it would make absolutely no sense to randomize 21 more so that the trial is no longer "too small". Those 21 participants could be enrolled in another trial, to answer another worthwhile question.
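
To make the arithmetic concrete, here is a minimal sketch of how the first two factors turn into a headcount. It assumes the simplest textbook case, which is my assumption and not anything taken from the JAMA paper: two equal-sized arms, a continuous outcome, a two-sided test, and the normal approximation n = 2 × ((z_alpha + z_beta) / d)² per arm, where d is the standardized effect size. The function name `patients_per_arm` is made up for illustration, and the third factor (extra participants for safety) is a judgment call that no formula captures.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(effect_size, alpha=0.05, power=0.80):
    """Approximate patients needed in EACH arm of a two-arm trial.

    Assumptions (illustrative only): equal allocation, continuous
    outcome, two-sided test, normal approximation.
    effect_size is the standardized difference (difference / SD).
    alpha is the accepted false-positive risk; power is 1 - beta,
    where beta is the accepted false-negative risk.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided
    z_beta = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    # Larger expected effects need far fewer patients.
    for d in (0.8, 0.5, 0.2):
        print(f"effect size {d}: {patients_per_arm(d)} per arm")
```

Under these assumptions, a large expected effect (0.8) needs about 25 patients per arm, which puts the whole trial comfortably under 100, while a small expected effect (0.2) needs several hundred per arm. The right size falls out of the effect we expect and the risks we accept, not out of a round-number cutoff.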

So the answer to "how big should a trial be?" is "exactly as big as it needs to be." Taking descriptive statistics and applying normative categories to them is unhelpful, and does not make for better research policy.


Califf RM, Zarin DA, Kramer JM, Sherman RE, Aberle LH, & Tasneem A (2012). Characteristics of clinical trials registered in ClinicalTrials.gov, 2007-2010. JAMA: the journal of the American Medical Association, 307 (17), 1838-47. PMID: 22550198