Placebo Control: what would tufte do?

Showing posts with label what would tufte do?. Show all posts

Tuesday, October 2, 2012

Decluttering the Dashboard

It’s Gresham’s Law for clinical trial metrics: Bad data drives out good. Here are 4 steps you can take to fix it.

Many years ago, when I was working in the world of technology startups, one “serial entrepreneur” told me about a technique he had used when raising investor capital for his new database firm: since his company marketed itself as having cutting-edge computing algorithms, he went out and purchased a bunch of small, flashing LED lights and simply glued them onto the company’s servers. When the venture capital folks came out for due diligence meetings, they were provided a dramatic view into the darkened server room, brilliantly lit up by the servers’ energetic flashing. It was the highlight of the visit, and got everyone’s investment enthusiasm ratcheted up a notch.

The clinical trials dashboard is a candy store: bright, vivid,
attractive ... and devoid of nutritional value.

I was reminded of that story at a recent industry conference, when I naively walked into a seminar on “advanced analytics” only to find I was being treated to an extended product demo. In this case, a representative from one of the large CROs was showing off the dashboard for their clinical trials study management system.

And an impressive system it was, chock full of bubble charts and histograms and sliders. For a moment, I felt like a kid in a candy store. So much great stuff ... how to choose?

Then the presenter told a story: on a recent trial, a data manager in Italy, reviewing the analytics dashboard, alerted the study team to the fact that there was an enrollment imbalance in Japan, with one site enrolling all of the patients in that country. This was presented as a success story for the system: it linked up disparate teams across the globe to improve study quality.

But to me, this was a small horror story: the dashboard had gotten so cluttered that key performance issues were being completely missed by the core operations team. The fact that a distant data manager had caught the issue was a lucky break, certainly, but one that should have set off alarm bells about how important signals were being overwhelmed by the noise of charts and dials and “advanced visualizations”.

Swamped with high-precision trivia

I do not need to single out any one system or vendor here: this is a pervasive problem. In our rush to provide “robust analytic solutions”, our industry has massively overengineered its reporting interfaces. Every dashboard I've had a chance to review – and I've seen a lot of them – contain numerous instances of vividly-colored charts crowding out one another, with minimal sense of differentiating the significant from the tangential.

It’s Gresham’s Law for clinical trial metrics: Bad data drives out good. Bad data – samples sliced so thin they’ve lost significance, histograms of marginal utility made “interesting” (and nearly unreadable) by 3-D rendering, performance grades that have never been properly validated. Bad data is plentiful and much, much easier to obtain than good data.

So what can we do? Here are 4 initial steps to decluttering the dashboard:

1. Abandon “Actionable Analytics”
Everybody today sells their analytics as “actionable” [including, to be fair, even one company’s website that the author himself may be guilty of drafting]. The problem though is that any piece of data – no matter how tenuous and insubstantial -- can be made actionable. We can always think of some situation where an action might be influenced by it, so we decide to keep it. As a result, we end up swamped with high-precision trivia (Dr. Smith is enrolling at the 82nd percentile among UK sites!) that do not influence important decisions but compete for our attention. We need to stop reporting data simply because it’s there and we can report it.

2. Identify Key Decisions First
The above process (which seems pretty standard nowadays) is backwards. We look at the data we have, and ask ourselves whether it’s useful. Instead, we need to follow a more disciplined process of first asking ourselves what decisions we need to make, and when we need to make them. For example:

When is the earliest we will consider deactivating a site due to non-enrollment?
On what schedule, and for which reasons, will senior management contact individual sites?
At what threshold will imbalances in safety data trigger more thorough investigation?

Every trial will have different answers to these questions. Therefore, the data collected and displayed will also need to be different. It is important to invest time and effort to identify critical benchmarks and decision points, specific to the needs of the study at hand, before building out the dashboard.

3. Recognize and Respect Context
As some of the questions about make clear, many important decisions are time-dependent. Often, determining when you need to know something is every bit as important as determining what you want to know. Too many dashboards keep data permanently anchored over the course of the entire trial even though it's only useful during a certain window. For example, a chart showing site activation progress compared to benchmarks should no longer be competing for attention on the front of a dashboard after all sites are up and running – it will still be important information for the next trial, but for managing this trial now, it should no longer be something the entire team reviews regularly.

In addition to changing over time, dashboards should be thoughtfully tailored to major audiences. If the protocol manager, medical monitor, CRAs, data managers, and senior executives are all looking at the same dashboard, then it’s a dead certainty that many users are viewing information that is not critical to their job function. While it isn't always necessary to develop a unique topline view for every user, it is worthwhile to identify the 3 or 4 major user types, and provide them with their own dashboards (so the person responsible for tracking enrollment in Japan is in a position to immediately see an imbalance).

4. Give your Data Depth
Many people – myself included – are reluctant to part with any data. We want more information about study performance, not less. While this isn't a bad thing to want, it does contribute to the tendency to cram as much as possible into the dashboard.

The solution is not to get rid of useful data, but to bury it. Many reporting systems have the ability to drill down into multiple layers of information: this capability should be thoughtfully (but aggressively!) used to deprioritize all of your useful-but-not-critical data, moving it off the dashboard and into secondary pages.

Bottom Line
The good news is that access to operational data is becoming easier to aggregate and monitor every day. The bad news is that our current systems are not designed to handle the flood of new information, and instead have become choked with visually-appealing-but-insubstantial chart candy. If we want to have any hope of getting a decent return on our investment from these systems, we need to take a couple steps back and determine: what's our operational strategy, and who needs what data, when, in order to successfully execute against it?

[Photo credit: candy store from flikr user msgolightly.]

Sunday, July 15, 2012

Site Enrollment Performance: A Better View

Pretty much everyone involved in patient recruitment for clinical trials seems to agree that "metrics" are, in some general sense, really really important. The state of the industry, however, is a bit dismal, with very little evidence of effort to communicate data clearly and effectively. Today I’ll focus on the Site Enrollment histogram, a tried-but-not-very-true standby in every trial.

Consider this graphic, showing enrolled patients at each site. It came through on a weekly "Site Newsletter" for a trial I was working on:

I chose this histogram not because it’s particularly bad, but because it’s supremely typical. Don’t get me wrong ... it’s really bad, but the important thing here is that it looks pretty much exactly like every site enrollment histogram in every study I’ve ever worked on.

This is a wasted opportunity. Whether we look at per-site enrollment with internal teams to develop enrollment support plans, or share this data with our sites to inform and motivate them, a good chart is one of the best tools we have. To illustrate this, let’s look at a few examples of better ways to look at the data.

If you really must do a static site histogram, make it as clear and meaningful as possible.

This chart improves on the standard histogram in a few important ways:

It looks better. This is not a minor point when part of our work is to engage sites and makes them feel like they are part of something important. Actually, this graph is made clearer and more appealing mostly by the removal of useless attributes (extraneous whitespace, background colors, and unhelpful labels).
It adds patient disposition information. Many graphs – like the one at the beginning of this post – are vague about who is being counted. Does "enrolled" include patients currently being screened, or just those randomized? Interpretations will vary from reader to reader. Instead, this chart makes patient status an explicit variable, without adding to the complexity of the presentation. It also provides a bit of information about recent performance, by showing patients who have been consented but not yet fully screened.
It ranks sites by their total contribution to the study, not by the letters in the investigator’s name. And that is one of the main reasons we like to share this information with our sites in the first place.

Find Opportunities for Alternate Visualizations

There are many other ways in which essentially the same data can be re-sliced or restructured to underscore particular trends or messages. Here are two that I look at frequently, and often find worth sharing.

Then versus Now

This tornado chart is an excellent way of showing site-level enrollment trajectory, with each sites prior (left) and subsequent (right) contributions separated out. This example spotlights activity over the past month, but for slower trials a larger timescale may be more appropriate. Also, how the data is sorted can be critical in the communication: this could have been ranked by total enrollment, but instead sorts first on most-recent screening, clearly showing who’s picked up, who’s dropped off, and who’s remained constant (both good and bad).

This is especially useful when looking at a major event (e.g., pre/post protocol amendment), or where enrollment is expected to have natural fluctuations (e.g., in seasonal conditions).

Net Patient Contribution

In many trials, site activation occurs in a more or less "rolling" fashion, with many sites not starting until later in the enrollment period. This makes simple enrollment histograms downright misleading, as they fail to differentiate sites by the length of time they’ve actually been able to enroll. Reporting enrollment rates (patients per site per month) is one straightforward way of compensating for this, but it has the unfortunate effect of showing extreme (and, most importantly, non-predictive), variance for sites that have not been enrolling for very long.

As a result, I prefer to measure each site in terms of its net contribution to enrollment, compared to what it was expected to do over the time it was open:

To clarify this, consider an example: A study expects sites to screen 1 patient per month. Both Site A and Site B have failed to screen a single patient so far, but Site A has been active for 6 months, whereas Site B has only been active 1 month.

On an enrollment histogram, both sites would show up as tied at 0. However, Site A’s 0 is a lot more problematic – and predictive of future performance – than Site B’s 0. If I compare them to benchmark, then I show how many total screenings each site is below the study’s expectation: Site A is at -6, and Site B is only -1, a much clearer representation of current performance.

This graphic has the added advantage of showing how the study as a whole is doing. Comparing the total volume of positive to negative bars gives the viewer an immediate visceral sense of whether the study is above or below expectations.

The above are just 3 examples – there is a lot more that can be done with this data. What is most important is that we first stop and think about what we’re trying to communicate, and then design clear, informative, and attractive graphics to help us do that.