For pure pathogen-killing power, it’s hard to beat a surgeon’s hand scrub. Ask any clinician, and she’ll tell you how thoroughly chlorhexidine disinfects skin. If she’s a microbiologist, she’ll even explain the biocide’s mechanism of action, provided you’re still listening. But how would the practice fare, say, as a method of cold and flu prevention on a college campus? Your skepticism here would seem justified. After all, it’s hard to sterilize a cough in the dining hall.
Efficacy and effectiveness. It’s unfortunate the terms sound so alike, because while they do refer to relative locations along a continuum, they’re the furthest thing from synonyms, as the ever-accumulating literature on the topic will attest.
In this post and the one that follows, I’d like to offer some clarity on efficacy vs. effectiveness and illustrate the value that each type of analysis offers. If nothing else, what emerges should provide an introduction to the concepts for those new to clinical research. But I have a more speculative aim, too. I’d like to propose standards for assessing trial technology through each of these lenses. Why? Because while we’ve been asking whether a particular technology does what it’s explicitly designed to do, as we should and must, we may have forgotten to ask a critical follow-up question: Does it improve the pace and reliability of our research?
A Gloss on Effectiveness Versus Efficacy
Before turning to the differences, it’s worth stating what efficacy and effectiveness have in common. Both are measures of intervention results, arrived at through the collection of empirical data. The differences reside in what kinds of results, and which types of data, researchers consider relevant in each case.
Efficacy results derive from data gathered through controlled methods, under ideal circumstances, from participants meeting precise and strictly enforced criteria. We glean them from blood panels and CT scans, from standardized scores and thorough exams, all conducted by clinicians with domain expertise. Highly specific medical questions drive the search for efficacy. Will infusion of this dose, administered to patients with this diagnosis and these relevant histories, reduce serum creatinine by at least this percent over this time span? There are virtues to gathering evidence in this rigorous way. Tightly controlling treatment variables reduces statistical noise, so that signals confirming or disconfirming a hypothesized mechanism of action, an optimal dose, or a set of easily measured clinical outcomes are clear. No basic research and virtually no translational research could proceed without it.
Effectiveness results, on the other hand, while no less data-driven, rely on methods, circumstances, and participants that are as diverse as those found in the wider world. We solicit them through surveys and observation, economic analyses and public health records. Taking this broader view allows researchers to answer a more general but no less urgent question than the creatinine example above: Will the therapy benefit patients? In the attempt to find out, measuring efficacy is just one step in an investigation that journeys far outside the lab or imaging suite. Along the way, researchers ask questions whose answers aren’t expressed in units as concrete as mL/min. How likely are care providers to consider prescribing the therapy? Are their patients, in all their diversity, similar in the ways that matter to the earlier trial participants? How likely are these patients to comply with the regimen, both initially and as side effects arise? Answers to questions like these determine effectiveness. And it’s difficult to see how health policy, or patient and care provider decision-making, could proceed without it.
Just as a single trial rarely settles the case once and for all regarding an intervention’s efficacy, so too is the question of effectiveness answered over time. But it is possible to “front load” a particular study, so that results enable a better prediction of how well the intervention will perform in the real world. As studies seeking to quantify the results of an intervention, these effectiveness studies differ in degree, not in kind, from efficacy studies. But if there is no bright-line test to distinguish the two, there are at least markers that can help us place a study closer to one pole or the other. In their paper prepared for the U.S. Department of Health and Human Services, Drs. Gerald Gartlehner, Richard Hansen, Daniel Nissman, Kathleen Lohr and Timothy Carey propose and validate seven criteria that distinguish an effectiveness study from an efficacy one:
- Populations in Primary Care: “For effectiveness trials, settings should reflect the initial care facilities available to a diverse population with the condition of interest.”
- Less Stringent Eligibility Criteria: “[E]ligibility criteria must allow the source population to reflect the heterogeneity of external populations: the full spectrum of the human population, their comorbidities, variable compliance rates, and use of other medications (or other therapies, such as psychotherapies, or complementary and alternative medications).”
- Health Outcomes: “Efficacy studies, especially phase III clinical trials, commonly use objective or subjective outcomes (e.g., symptom scores, laboratory data, or time to disease recurrence) to determine intermediate (surrogate) outcomes … Health outcomes, relevant to the condition of interest, should be the principal outcome measures in effectiveness studies. Intermediate outcomes are adequate only if empirical evidence verifies that the effect of the intervention on an intermediate endpoint predicts and fully captures the net effect on a health outcome.”
- Long Study Duration, Clinically Relevant Treatment Modalities: “In effectiveness trials, study durations should mimic a minimum length of treatment in a clinical setting to allow the assessment of health outcomes. Treatment modalities should reflect clinical relevance (e.g., no fixed-dose designs; equivalent dosages for head-to-head comparisons). Diagnosis should rely on diagnostic standards that practicing physicians use.”
- Assessment of adverse events: “[U]sing an extensive objective adverse events scale is often not feasible in daily clinical practice because of time constraints and practical considerations. Therefore, adverse events assessments in effectiveness trials could be limited to critical issues based on experiences from prior trials.”
- Adequate Sample Size To Assess a Minimally Important Difference From a Patient Perspective: “The sample size of an effectiveness trial should be sufficient to detect at least a minimally important difference on a health-related quality of life scale.”
- Intention-to-Treat (ITT) Analysis: “[S]tatistical analyses in efficacy trials frequently exclude patients with protocol deviations. In clinical practice, however, factors such as compliance, adverse events, drug regimens, co-morbidities, concomitant treatments, or costs all can alter efficacy. A ‘completers only’ analysis would not take these factors adequately into account.”
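The ITT criterion is easy to see in a toy simulation. The sketch below uses entirely hypothetical numbers (they come from nowhere in the paper): a therapy that helps patients who tolerate it but suffers 30% dropout, compared against a control with 5% dropout. Counting dropouts as non-responders (ITT) versus analyzing completers only yields noticeably different pictures of the treatment effect.

```python
import random

random.seed(0)

def simulate_arm(n, response_rate, dropout_rate):
    """Return a list of (completed, responded) tuples for one trial arm.

    Hypothetical model: a subject who drops out never responds;
    a completer responds with probability response_rate.
    """
    arm = []
    for _ in range(n):
        completed = random.random() > dropout_rate
        responded = completed and random.random() < response_rate
        arm.append((completed, responded))
    return arm

# Assumed, illustrative parameters -- not data from any real trial.
treatment = simulate_arm(500, response_rate=0.60, dropout_rate=0.30)
control = simulate_arm(500, response_rate=0.40, dropout_rate=0.05)

def itt_rate(arm):
    # Intention-to-treat: every randomized subject stays in the denominator.
    return sum(r for _, r in arm) / len(arm)

def completers_rate(arm):
    # "Completers only": restrict the analysis to protocol finishers.
    completers = [r for c, r in arm if c]
    return sum(completers) / len(completers)

print(f"ITT:        treatment {itt_rate(treatment):.2f} "
      f"vs control {itt_rate(control):.2f}")
print(f"Completers: treatment {completers_rate(treatment):.2f} "
      f"vs control {completers_rate(control):.2f}")
```

Because the treatment arm loses far more subjects, the completers-only analysis overstates the benefit a typical patient population would see, which is exactly why effectiveness-oriented trials favor ITT.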
To put it simply (at the expense of precision):
Effectiveness studies in clinical research measure the long-term health changes of large, heterogeneous groups that result from an intervention in their primary, and perhaps their secondary, care. To gather sufficient data for this task, these studies need to assess “all-comers” in their analysis and minimize the burden of participation for researchers and subjects.
With this pseudo-definition, it’s not hard to imagine paradigm examples; e.g. “A Comparison of Gel-Based Sanitizers versus Bar Soap and Water in the Frequency of RSV Infection Among Child Care Center Workers”. What is challenging is crossing domains and seeing how the same approach could be applied to clinical trial technology. Who are the subjects? What are the relevant treatment contexts? What outcomes are key? That’ll be the subject for our next post.
Gartlehner G, Hansen RA, Nissman D, et al. Criteria for Distinguishing Effectiveness From Efficacy Trials in Systematic Reviews. Rockville (MD): Agency for Healthcare Research and Quality (US); 2006 Apr. (Technical Reviews, No. 12.) Available from: https://www.ncbi.nlm.nih.gov/books/NBK44029/