From Missing to Meaningful

Solving EHR Data Gaps

There are many compelling reasons to use electronic health records (EHRs) in clinical research. A 2024 study, Utilization of EHRs for clinical trials: a systematic review, identified 19 use cases, including:

  • EHRs as a potential source for assessing patients’ eligibility for enrollment in clinical trials
  • Improvement of patient recruitment, patient retention, and data collection through using the EHR
  • A novel method using an EHR phenotype plus brief medical record review is effective to identify hospitalized patients with late-stage dementia
  • Recruitment and observable clinical outcomes of COVID-19 clinical trials

EHR-based trials, sometimes called pragmatic trials, use the data already available in EHRs to identify eligible patients, collect outcome data, and assess how effective interventions are in real-world settings.

As compelling as it is to use EHR data in clinical trials, there is a known risk – missing or incomplete data – that can lead to biased results and lower statistical power. Because EHRs are designed primarily for clinical and billing purposes – not for research – data useful for clinical research may be missing or inconsistently recorded. Missing data is a recurring challenge that impacts data quality, efficiency, and decision-making for trial sites and sponsors. Mitigating missing data is one of the many reasons clinical trials turn to OpenClinica.

But the impact goes beyond research teams. When data is missing or delayed, trial progress slows, and patients wait longer for new treatments. For individuals facing serious or rare conditions, every day counts. That’s why OpenClinica is committed to improving data quality: not just to support sponsors, but to help bring life-changing therapies to patients faster.

Why Incomplete EHR Data is a Problem 

Incomplete or missing EHR data commonly stems from inconsistent documentation, EHR system limitations, and provider workflows. Statisticians group missing data into three categories, based on why the values are absent:

  • Missing Completely at Random (MCAR): The absence of data is unrelated to any observed or unobserved variables. For example, a lab result could be missing due to a clerical error.
  • Missing at Random (MAR): The absence of data is related to other observed variables in the data set. For example, missing blood pressure readings or colonoscopies might be related to the patient’s age.
  • Missing Not at Random (MNAR): The absence of data is related to the missing value itself. For example, patients with heavy alcohol or drug use might be less likely to report it.
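
To make the three categories concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than real trial data: the age range, the made-up link between age and systolic blood pressure, and the missingness probabilities. It drops values under each mechanism and compares the mean of what remains with the true mean.

```python
import random
import statistics

def simulate(n=20000, seed=0):
    """Generate hypothetical (age, systolic BP) pairs; the formula is invented."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.uniform(20, 80)
        sbp = 100 + 0.5 * age + rng.gauss(0, 10)  # assumed: BP rises with age
        rows.append((age, sbp))
    return rng, rows

def observed_mean(rows, rng, p_missing):
    """Mean BP after dropping each row with probability p_missing(age, sbp)."""
    kept = [sbp for age, sbp in rows if rng.random() >= p_missing(age, sbp)]
    return statistics.fmean(kept)

rng, rows = simulate()
true_mean = statistics.fmean(sbp for _, sbp in rows)

# MCAR: every reading has the same 30% chance of being lost.
mcar = observed_mean(rows, rng, lambda age, sbp: 0.3)
# MAR: missingness depends on age, which is still observed.
mar = observed_mean(rows, rng, lambda age, sbp: 0.6 if age < 40 else 0.1)
# MNAR: missingness depends on the blood pressure value itself.
mnar = observed_mean(rows, rng, lambda age, sbp: 0.6 if sbp > 130 else 0.1)

for label, value in [("true", true_mean), ("MCAR", mcar), ("MAR", mar), ("MNAR", mnar)]:
    print(f"{label:>4}: {value:.1f}")
```

In this sketch the MCAR mean stays close to the true mean, while the MAR and MNAR means drift away from it. The practical difference is that MAR bias can often be reduced by adjusting for the observed variable (here, age), whereas MNAR bias cannot be corrected from the observed data alone.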

Unstructured data can also lead to missing information in a patient’s EHR. Unstructured data typically includes details about patients’ symptoms, history and other elements not captured by coded, organized data. Given its lack of structure, unstructured data can be challenging to extract and analyze.

Missing data is problematic in clinical research because it can lead to delays in study timelines, increased manual effort, and risks to data integrity. Other possible consequences are:

  • Biased estimates
  • Reduced statistical power
  • Compromised generalizability
  • Invalid conclusions
  • Increased risk of Type I errors (falsely rejecting a true null hypothesis) and Type II errors (failing to reject a false null hypothesis)
  • Exacerbated health inequities
  • Ethical and legal implications

How OpenClinica Mitigates the Challenge of Missing and Incomplete EHR Data

OpenClinica helps clinical trials solve the challenge of missing and incomplete EHR data in three ways:

  • Adaptive, site-aware mapping designed to handle local EHR variations without reengineering
    OpenClinica has a flexible EHR-to-EDC mapping engine that supports site-specific EHR structures and adapts to variability. It also ensures that as much relevant data as possible is captured reliably.

In the words of a Principal Solutions Architect,

“We save an enormous amount of time and eliminate errors because we can pull laboratory and medication data from our EHR at the click of a button.”

  • Clinically aware data quality logic that distinguishes between a null value and a meaningful omission
    OpenClinica’s automated checks flag empty fields and missing values, and configurable logic differentiates between clinically irrelevant nulls and meaningful omissions. Together, these workflows reduce the manual review burden while improving data quality.
  • User feedback and oversight
    Clinical trial sites can review, confirm, and supplement data pulled from EHRs. Sponsors, in turn, gain visibility into the completeness and confidence level of each data point.
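
As a rough illustration of the idea behind the second and third items above, the sketch below separates a null that is clinically irrelevant from an omission that deserves review, and then reports completeness over the fields that were actually expected. It is not OpenClinica's implementation or API: the `FieldRule` class, the field names, the rules, and the status labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class FieldRule:
    """Hypothetical rule: is a value clinically expected for this record?"""
    name: str
    applies: Callable[[dict], bool]

def classify(record: dict, rules: List[FieldRule]) -> Dict[str, str]:
    """Label each field as present, not_applicable, or needs_review."""
    statuses = {}
    for rule in rules:
        if record.get(rule.name) is not None:
            statuses[rule.name] = "present"
        elif rule.applies(record):
            statuses[rule.name] = "needs_review"    # meaningful omission
        else:
            statuses[rule.name] = "not_applicable"  # clinically irrelevant null
    return statuses

def completeness(statuses: Dict[str, str]) -> float:
    """Share of expected fields that have a value."""
    expected = [s for s in statuses.values() if s != "not_applicable"]
    if not expected:
        return 1.0
    return sum(s == "present" for s in expected) / len(expected)

# Illustrative rules and record; none of these come from a real study.
rules = [
    FieldRule("pregnancy_test", lambda r: r.get("sex") == "F"),
    FieldRule("hba1c", lambda r: bool(r.get("diabetic"))),
    FieldRule("creatinine", lambda r: True),
]
record = {"sex": "M", "diabetic": True, "pregnancy_test": None,
          "hba1c": None, "creatinine": 1.1}

statuses = classify(record, rules)
print(statuses)
print(f"completeness: {completeness(statuses):.0%}")
```

Excluding the not-applicable fields from the denominator keeps the completeness figure from being dragged down by nulls that nobody expected to be filled in, so reviewers can focus on the omissions that matter.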

To learn more about OpenClinica’s EHR to EDC eSource solution, click here.