Thursday, December 19, 2013

Patient Recruitment: Taking the Low Road

The Wall Street Journal has an interesting article on the use of “Big Data” to identify and solicit potential clinical trial participants. The premise is that large consumer data aggregators like Experian can target patients with certain diseases through correlations with non-health behavior. Examples given include “a preference for jazz” being associated with arthritis and “shopping online for clothes” being an indicator of obesity.
We've seen this story before.

In this way, allegedly, clinical trial patient recruitment companies can more narrowly target their solicitations* for patients to enroll in clinical trials.

In the spirit of full disclosure, I should mention that I was interviewed by the reporter of this article, although I am not quoted. My comments generally ran along three lines, none of which really fit in with the main storyline of the article:

  1. I am highly skeptical that these analyses are actually effective at locating patients
  2. These methods aren't really new – they’re the same tactics that direct marketers have been using for years
  3. Most importantly, the clinical trials community can – and should – be moving towards open and collaborative patient engagement. Relying on tactics like consumer data snooping and telemarketing is an enormous step backwards.

The first point is this: certainly some diseases have correlates in the real world, but these correlates tend to be pretty weak, and are therefore unreliable predictors of disease. Maybe it’s true that those struggling with obesity tend to buy more clothes online (I don’t know if it’s true or not – honestly it sounds a bit more like an association built on easy stereotypes than on hard data). But many obese people will not shop online (they will want to be sure the clothes actually fit), and vast numbers of people with low or average BMIs will shop for clothes online.  So the consumer data will tend to have very low predictive value. The claims that liking jazz and owning cats are predictive of having arthritis are even more tenuous. These correlates are going to be several times weaker than basic demographic information like age and gender. And for more complex conditions, these associations fall apart.

Marketers claim to solve this by factoring a complex web of associations through a magical black box – th WSJ article mentions that they “applied a computed algorithm” to flag patients. Having seen behind the curtain on a few of these magic algorithms, I can confidently say that they are underwhelming in their sophistication. Hand-wavy references to Big Data and Algorithms are just the tools used to impress pharma clients. (The down side to that, of course, is that you can’t help but come across as big brotherish – see this coverage from Forbes for a taste of what happens when people accept these claims uncritically.)

But the effectiveness of these data slice-n-dicing activities is perhaps beside the point. They are really just a thin cover for old-fashioned boiler room tactics: direct mail and telemarketing. When I got my first introduction to direct marketing in the 90’s, it was the exact same program – get lead lists from big companies like Experian, then aggressively mail and call until you get a response.

The limited effectiveness and old-school aggressiveness of these programs comes is nicely illustrated in the article by one person’s experience:
Larna Godsey, of Wichita, Kan., says she received a dozen phone calls about a diabetes drug study over the past year from a company that didn't identify itself. Ms. Godsey, 63, doesn't suffer from the disease, but she has researched it on the Internet and donated to diabetes-related causes. "I don't know if it's just a coincidence or if they're somehow getting my information," says Ms. Godsey, who filed a complaint with the FTC this year.
The article notes that one recruitment company, Acurian, has been the subject of over 500 FTC complaints regarding its tactics. It’s clear that Big Data is just the latest buzzword lipstick on the telemarketing pig. And that’s the real shame of it.

We have arrived at an unprecedented opportunity for patients, researchers, and private industry to come together and discuss, as equals, research priorities and goals. Online patient communities like Inspire and PatientsLikeMe have created new mechanisms to share clinical trial opportunities and even create new studies. Dedicated disease advocates have jumped right into the world of clinical research, with groups like the Cystic Fibrosis Foundation and Michael J. Fox Foundation no longer content with raising research funds, but actively leading the design and operations of new studies.

Some – not yet enough – pharmaceutical companies have embraced the opportunity to work more openly and honestly with patient groups. The scandal of stories like this is not the Wizard of Oz histrionics of secret computer algorithms, but that we as an industry continue to take the low road and resort to questionable boiler room tactics.

It’s past time for the entire patient recruitment industry to drop the sleaze and move into the 21st century. I would hope that patient groups and researchers will come together as well to vigorously oppose these kinds of tactics when they encounter them.

(*According to the article, Acurian "has said that calls related to medical studies aren't advertisements as defined by law," so we can agree to call them "solicitations".)

Wednesday, December 4, 2013

Half of All Trials Unpublished*

(*For certain possibly nonstandard uses of the word "unpublished")

This is an odd little study. Instead of looking at registered trials and following them through to publication, this study starts with a random sample of phase 3 and 4 drug trials that already had results posted on - so in one, very obvious sense, none of the trials in this study went unpublished.

Timing and Completeness of Trial Results Posted at and Published in Journals
Carolina Riveros, Agnes Dechartres, Elodie Perrodeau, Romana Haneef, Isabelle Boutron, Philippe Ravaud

But here the authors are concerned with publication in medical journals, and they were only able to locate journal articles covering about half (297/594) of trials with registered results. 

It's hard to know what to make of these results, exactly. Some of the "missing" trials may be published in the future (a possibility the authors acknowledge), some may have been rejected by one or more journals (FDAAA requires posting the results to, but it certainly doesn't require journals to accept trial reports), and some may be pre-FDAAA trials that sponsors have retroactively added to even though development on the drug has ceased.

It would have been helpful had the authors reported journal publication rates stratified by the year the trials completed - this would have at least given us some hints regarding the above. More than anything I still find it absolutely bizarre that in a study this small, the entire dataset is not published for review.

One potential concern is the search methodology used by the authors to match posted and published trials. If the easy routes (link to article already provided in, or NCT number found in a PubMed search) failed, a manual search was performed:
The articles identified through the search had to match the corresponding trial in terms of the information registered at (i.e., same objective, same sample size, same primary outcome, same location, same responsible party, same trial phase, and same sponsor) and had to present results for the primary outcome. 
So it appears that a reviewed had to score the journal article as an exact match on 8 criteria in order for the trial to be considered the same. That could easily lead to exclusion of journal articles on the basis of very insubstantial differences. The authors provide no detail on this; and again, that would be easy to verify if the study dataset was published. 

The reason I harp on this, and worry about the matching methodology, is that two of the authors of this study were also involved in a methodologically opaque and flawed study about clinical trial results posted in the JCO. In that study, as well, the authors appeared to use an incorrect methodology to identify published clinical trials. When I pointed the issues out, the corresponding author merely reiterated what was already (insufficiently) in the paper's Methodology section.

I find it strange beyond belief, and more than a little hypocritical, that researchers would use a public, taxpayer-funded database as the basis of their studies, and yet refuse to provide their data for public review. There are no technological or logistical issues preventing this kind of sharing, and there is an obvious ethical point in favor of transparency.

But if the authors are reasonably close to correct in their results, I'm not sure what to make of this study. 

The Nature article covering this study contend that
[T]he [] database was never meant to replace journal publications, which often contain longer descriptions of methods and results and are the basis for big reviews of research on a given drug.
I suppose that some journal articles have better methodology sections, although this is far from universally true (and, like this study here, these methods are often quite opaquely described and don't support replication). As for results, I don't believe that's the case. In this study, the opposite was true: results were generally more complete than journal results. And I have no idea why the registry wouldn't surpass journals as a more reliable and complete source of information for "big reviews".

Perhaps it is a function of my love of getting my hands dirty digging into the data, but if we are witnessing a turning point where journal articles take a distant back seat to the registry, I'm enthused. is public, free, and contains structured data; journal articles are expensive, unparsable, and generally written in painfully unclear language. To me, there's really no contest. Carolina Riveros, Agnes Dechartres, Elodie Perrodeau, Romana Haneef, Isabelle Boutron, & Philippe Ravaud (2013). Timing and Completeness of Trial Results Posted at and Published in Journals PLoS Medicine DOI: 10.1371/journal.pmed.1001566