Thursday, January 2, 2014

The Coming of the MOOCT?

Big online studies, in search of millions of participants.

Back in September, I enrolled in the Health eHeart Study - an entirely online research study tracking cardiac health. (Think the Framingham Heart Study, cast wider and shallower - less intensive follow-up, but spread out to the entire country.)


[In the spirit of full disclosure, I should note that I haven’t completed any follow-up activities on the Health eHeart website yet. Yes, I am officially part of the research adherence problem…]


Yesterday, I learned of the Quantified Diet Project, an entirely online/mobile app-supported randomized trial of 10 different weight loss regimens. The intervention is short - only 4 weeks - but that’s probably substantially longer than most New Year diets manage to last, and should be just long enough to detect some early differences among the approaches.


I have been excited about the potential for online medical research for quite some time. For me, the real beginning was when PatientsLikeMe published the results of their online lithium for ALS research study - as I wrote at the time, I have never been so enthused about a negative trial before or since.



That was two and a half years ago, and there hasn't been a ton of activity since then outside of PatientsLikeMe (who have expanded and formalized their activities in the Open Research Exchange). So I’m eager to hear how these two new studies go. There are some interesting similarities and differences:


  • Both are university/private collaborations, and both (perhaps unsurprisingly) are rooted in California: Health eHeart is jointly run by UCSF and the American Heart Association, while Quantified Diet is run by app developer Lift with scientific support from an (unidentified?) team at Berkeley.
  • Both are pushing for a million or more participants, dwarfing even very large traditional studies by orders of magnitude.
  • Health eHeart is entirely observational, and researchers will have the ability to request its data to test their own hypotheses, whereas Quantified Diet is a controlled, randomized trial.


[Image: data entry screen on Health eHeart]
I really like the user interface for Health eHeart - it’s extremely simple, with a logical flow to the sections. It appears to have been designed with older participants in mind, and the extensive data intake is subdivided into a large number of subsections, each of which can typically be completed in 2-4 minutes.



I have not enrolled in the Quantified Diet, but it appears to have a strong social media presence. You can follow the Twitter conversation through the #quantdiet hashtag. The semantic web and linked data guru Kerstin Forsberg has already posted about joining, and I hope to hear more from her and from clinical trial social media expert Rahlyn Gossen, who’s also joined.


To me, probably the most intriguing technical feature of the QuantDiet study is its “voluntary randomization” design. Participants can self-select into the diet of their choice, or can choose to be randomly assigned by the application. It will be interesting to see whether any differences emerge between the participants who chose a particular arm and those who were randomized into that arm - how much does a person’s preference matter?
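A rough sketch of how that assignment logic might work (my own illustration, not Lift's actual implementation - the arm names are placeholders): the important design detail is recording how each participant entered their arm, so that the self-selected and randomized cohorts can be compared at analysis time.

```python
import random

# Placeholder arm names - the actual QuantDiet arms may differ.
DIETS = ["slow-carb", "paleo", "vegetarian", "gluten-free", "whole-foods",
         "calorie-counting", "low-fat", "mindful-eating", "no-sweets", "DASH"]

def assign_diet(preference=None):
    """'Voluntary randomization': honor a stated preference, otherwise
    randomize - and record which path the participant took."""
    if preference is not None:
        return {"arm": preference, "entry": "self-selected"}
    return {"arm": random.choice(DIETS), "entry": "randomized"}

print(assign_diet("paleo"))  # {'arm': 'paleo', 'entry': 'self-selected'}
print(assign_diet())         # e.g. {'arm': 'DASH', 'entry': 'randomized'}
```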


In an earlier tweet I asked, “is this a MOOCT?” - short for Massive Open Online Clinical Trial. I don’t know if that’s the best name for it, and I’d love to hear other suggestions. By any other name, however, these are still great initiatives and I look forward to seeing them thrive in the coming years.

The implications for pharmaceutical and medical device companies are still unclear. Pfizer's jump into the world of "virtual trials" was a major bust, and widely second-guessed. I believe there is definitely a role and a path forward here, and these big efforts may teach us a lot about how patients want to be engaged online.

Thursday, December 19, 2013

Patient Recruitment: Taking the Low Road

The Wall Street Journal has an interesting article on the use of “Big Data” to identify and solicit potential clinical trial participants. The premise is that large consumer data aggregators like Experian can target patients with certain diseases through correlations with non-health behavior. Examples given include “a preference for jazz” being associated with arthritis and “shopping online for clothes” being an indicator of obesity.
We've seen this story before.

In this way, allegedly, clinical trial patient recruitment companies can more narrowly target their solicitations* for patients to enroll in clinical trials.

In the spirit of full disclosure, I should mention that I was interviewed by the reporter of this article, although I am not quoted. My comments generally ran along three lines, none of which really fit in with the main storyline of the article:

  1. I am highly skeptical that these analyses are actually effective at locating patients.
  2. These methods aren't really new – they’re the same tactics that direct marketers have been using for years.
  3. Most importantly, the clinical trials community can – and should – be moving towards open and collaborative patient engagement. Relying on tactics like consumer data snooping and telemarketing is an enormous step backwards.

The first point is this: certainly some diseases have correlates in the real world, but these correlates tend to be pretty weak, and are therefore unreliable predictors of disease. Maybe it’s true that those struggling with obesity tend to buy more clothes online (I don’t know if it’s true or not – honestly it sounds a bit more like an association built on easy stereotypes than on hard data). But many obese people will not shop online (they will want to be sure the clothes actually fit), and vast numbers of people with low or average BMIs will shop for clothes online.  So the consumer data will tend to have very low predictive value. The claims that liking jazz and owning cats are predictive of having arthritis are even more tenuous. These correlates are going to be several times weaker than basic demographic information like age and gender. And for more complex conditions, these associations fall apart.
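To put some numbers on the base-rate problem: suppose (hypothetically) that 30% of the adult population is obese, that 60% of obese people shop online for clothes, and that 50% of everyone else does too. Bayes' rule gives the predictive value of the "shops online" flag - a quick sketch, with all figures invented purely for illustration:

```python
# Hypothetical figures - these illustrate the base-rate math only.
prevalence = 0.30             # P(obese)
p_shop_given_obese = 0.60     # P(shops online for clothes | obese)
p_shop_given_not = 0.50       # P(shops online for clothes | not obese)

# Bayes' rule: P(obese | shops online)
p_shop = (p_shop_given_obese * prevalence
          + p_shop_given_not * (1 - prevalence))
ppv = p_shop_given_obese * prevalence / p_shop

print(f"P(obese | shops online for clothes) = {ppv:.0%}")
# => about 34% - barely better than the 30% you'd guess with no data at all.
```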

Marketers claim to solve this by factoring a complex web of associations through a magical black box – the WSJ article mentions that they “applied a computed algorithm” to flag patients. Having seen behind the curtain on a few of these magic algorithms, I can confidently say that they are underwhelming in their sophistication. Hand-wavy references to Big Data and Algorithms are just the tools used to impress pharma clients. (The downside to that, of course, is that you can’t help but come across as Big Brother-ish – see this coverage from Forbes for a taste of what happens when people accept these claims uncritically.)

But the effectiveness of these data slice-n-dicing activities is perhaps beside the point. They are really just a thin cover for old-fashioned boiler room tactics: direct mail and telemarketing. When I got my first introduction to direct marketing in the ’90s, it was the exact same program – get lead lists from big companies like Experian, then aggressively mail and call until you get a response.

The limited effectiveness and old-school aggressiveness of these programs are nicely illustrated in the article by one person’s experience:
Larna Godsey, of Wichita, Kan., says she received a dozen phone calls about a diabetes drug study over the past year from a company that didn't identify itself. Ms. Godsey, 63, doesn't suffer from the disease, but she has researched it on the Internet and donated to diabetes-related causes. "I don't know if it's just a coincidence or if they're somehow getting my information," says Ms. Godsey, who filed a complaint with the FTC this year.
The article notes that one recruitment company, Acurian, has been the subject of over 500 FTC complaints regarding its tactics. It’s clear that Big Data is just the latest buzzword lipstick on the telemarketing pig. And that’s the real shame of it.

We have arrived at an unprecedented opportunity for patients, researchers, and private industry to come together and discuss, as equals, research priorities and goals. Online patient communities like Inspire and PatientsLikeMe have created new mechanisms to share clinical trial opportunities and even create new studies. Dedicated disease advocates have jumped right into the world of clinical research, with groups like the Cystic Fibrosis Foundation and Michael J. Fox Foundation no longer content with raising research funds, but actively leading the design and operations of new studies.

Some – not yet enough – pharmaceutical companies have embraced the opportunity to work more openly and honestly with patient groups. The scandal of stories like this is not the Wizard of Oz histrionics of secret computer algorithms, but that we as an industry continue to take the low road and resort to questionable boiler room tactics.

It’s past time for the entire patient recruitment industry to drop the sleaze and move into the 21st century. I would hope that patient groups and researchers will come together as well to vigorously oppose these kinds of tactics when they encounter them.

(*According to the article, Acurian "has said that calls related to medical studies aren't advertisements as defined by law," so we can agree to call them "solicitations".)

Wednesday, December 4, 2013

Half of All Trials Unpublished*

(*For certain possibly nonstandard uses of the word "unpublished")

This is an odd little study. Instead of looking at registered trials and following them through to publication, this study starts with a random sample of phase 3 and 4 drug trials that already had results posted on ClinicalTrials.gov - so in one, very obvious sense, none of the trials in this study went unpublished.

Timing and Completeness of Trial Results Posted at ClinicalTrials.gov and Published in Journals
Carolina Riveros, Agnes Dechartres, Elodie Perrodeau, Romana Haneef, Isabelle Boutron, Philippe Ravaud



But here the authors are concerned with publication in medical journals, and they were only able to locate journal articles covering about half (297/594) of trials with registered results. 

It's hard to know what to make of these results, exactly. Some of the "missing" trials may be published in the future (a possibility the authors acknowledge), some may have been rejected by one or more journals (FDAAA requires posting the results to ClinicalTrials.gov, but it certainly doesn't require journals to accept trial reports), and some may be pre-FDAAA trials that sponsors have retroactively added to ClinicalTrials.gov even though development on the drug has ceased.

It would have been helpful had the authors reported journal publication rates stratified by the year the trials completed - this would have at least given us some hints regarding the above. More than anything, I still find it absolutely bizarre that, in a study this small, the entire dataset is not published for review.
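The stratification itself would be trivial if the dataset were available - something like the following, assuming a hypothetical table with one row per trial, a completion year, and a published-in-a-journal flag (all column names and values invented for illustration):

```python
import pandas as pd

# Hypothetical data - a stand-in for the study's unpublished dataset.
trials = pd.DataFrame({
    "nct_id":          ["NCT0001", "NCT0002", "NCT0003", "NCT0004"],
    "completion_year": [2008, 2008, 2011, 2011],
    "published":       [True, False, True, True],
})

# Journal publication rate, stratified by trial completion year.
rates = trials.groupby("completion_year")["published"].agg(["mean", "size"])
print(rates)
```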

One potential concern is the search methodology used by the authors to match posted and published trials. If the easy routes (link to article already provided in ClinicalTrials.gov, or NCT number found in a PubMed search) failed, a manual search was performed:
The articles identified through the search had to match the corresponding trial in terms of the information registered at ClinicalTrials.gov (i.e., same objective, same sample size, same primary outcome, same location, same responsible party, same trial phase, and same sponsor) and had to present results for the primary outcome. 
So it appears that a reviewer had to score the journal article as an exact match on 8 criteria in order for the trial to be considered the same. That could easily lead to exclusion of journal articles on the basis of very insubstantial differences. The authors provide no detail on this; and again, that would be easy to verify if the study dataset were published.
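To see why this worries me, here is a sketch (my reconstruction, not the authors' code) of what all-or-nothing matching looks like: a single field that differs trivially - a sponsor name spelled two ways, say - disqualifies the pair entirely.

```python
# Fields the reviewers required to match exactly (names are my shorthand
# for the criteria quoted above; the eighth criterion was that the
# article also report results for the primary outcome).
CRITERIA = ["objective", "sample_size", "primary_outcome", "location",
            "responsible_party", "trial_phase", "sponsor"]

def is_same_trial(registry_entry: dict, journal_article: dict) -> bool:
    """All-or-nothing matching: one mismatched field rejects the pair."""
    return all(registry_entry.get(c) == journal_article.get(c)
               for c in CRITERIA)

# A sponsor recorded as "Pfizer" in the registry but "Pfizer Inc." in the
# article fails the match outright, even when every other field agrees.
registry = {c: "x" for c in CRITERIA} | {"sponsor": "Pfizer"}
article  = {c: "x" for c in CRITERIA} | {"sponsor": "Pfizer Inc."}
print(is_same_trial(registry, article))  # False
```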

The reason I harp on this, and worry about the matching methodology, is that two of the authors of this study were also involved in a methodologically opaque and flawed study of clinical trial results posting, published in JCO. In that study, as well, the authors appeared to use an incorrect methodology to identify published clinical trials. When I pointed the issues out, the corresponding author merely reiterated what was already (insufficiently) in the paper's Methodology section.

I find it strange beyond belief, and more than a little hypocritical, that researchers would use a public, taxpayer-funded database as the basis of their studies, and yet refuse to provide their data for public review. There are no technological or logistical issues preventing this kind of sharing, and there is an obvious ethical point in favor of transparency.

But if the authors are reasonably close to correct in their results, I'm not sure what to make of this study. 

The Nature article covering this study contends that
[T]he [ClinicalTrials.gov] database was never meant to replace journal publications, which often contain longer descriptions of methods and results and are the basis for big reviews of research on a given drug.
I suppose that some journal articles have better methodology sections, although this is far from universally true (and, like this study here, those methods are often quite opaquely described and don't support replication). As for results, I don't believe that's the case. In this study, the opposite was true: ClinicalTrials.gov results were generally more complete than journal results. And I have no idea why the registry wouldn't surpass journals as a more reliable and complete source of information for "big reviews".

Perhaps it is a function of my love of getting my hands dirty digging into the data, but if we are witnessing a turning point where journal articles take a distant back seat to the ClinicalTrials.gov registry, I'm enthused. ClinicalTrials.gov is public, free, and contains structured data; journal articles are expensive, unparsable, and generally written in painfully unclear language. To me, there's really no contest. 
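As a small demonstration of what "structured" buys you, here is a sketch that pulls one study's record straight from the registry. The endpoint and field names follow ClinicalTrials.gov's current public JSON API - an assumption on my part, since the registry's interface has changed over the years.

```python
import json
import urllib.request

# Fetch one study's structured record from the ClinicalTrials.gov API.
# (Endpoint and field names per the v2 JSON schema - treat both as
#  assumptions, since the registry's interface has changed over time.)
nct_id = "NCT00000000"  # placeholder - substitute a real registry ID
url = f"https://clinicaltrials.gov/api/v2/studies/{nct_id}"

with urllib.request.urlopen(url) as resp:
    study = json.load(resp)

ident = study["protocolSection"]["identificationModule"]
print(ident["briefTitle"])
print("Has posted results:", study.get("hasResults", False))
```

No scraping, no PDF wrangling, no guessing at what the authors meant - which is exactly the contest journals are losing.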

Carolina Riveros, Agnes Dechartres, Elodie Perrodeau, Romana Haneef, Isabelle Boutron, & Philippe Ravaud (2013). Timing and Completeness of Trial Results Posted at ClinicalTrials.gov and Published in Journals. PLoS Medicine. DOI: 10.1371/journal.pmed.1001566