T 23: Patient Reported Outcomes: Comparison of Required Data Cleaning Efforts for ePRO Versus Paper

Poster Presenter

      Jennifer Ross

      • Lead Biostatistician
      • Almac Clinical Technologies
        United States


This poster will illustrate the impact of low-quality PRO data, describe the data cleaning process, compare that process for paper versus ePRO, and provide recommendations on how ePRO can be implemented to reduce the effort required for data cleaning.


ORAL PRESENTATION SCHEDULED: Session 1B at 12:40 - 12:50 PM

As PROs are used as endpoints, high-quality data is critical. Data cleaning is a process that occurs prior to analysis to ensure data integrity and reliable results. This poster will describe the data cleaning process and compare the data cleaning requirements of paper PROs with those of ePRO.


Data cleaning consists of querying, diagnosis, and resolution. The following data queries will be compared: records with missing or out-of-range dates / times, missing responses, missing records (entire record / entry), outliers / out-of-range values, contradictory responses / inconsistencies / strange patterns, and extraneous / excess data.

Diagnosis includes determining an error’s root cause, confirming missing data, verifying corrective actions, and concluding when an issue cannot be diagnosed. For paper, cross-checking the original paper source is needed to determine whether an error arose from patient entry or staff secondary entry, or whether the data is truly missing. For ePRO, many data cleaning steps can be reduced or eliminated. For instance, because ePRO entries are time / date stamped, data cleaning for missing dates is not needed. ePRO can include time windows to prevent out-of-range dates / times. ePRO systems can be programmed either to prevent skipped responses or to confirm that a patient intended to skip a response, so confirming missing responses would not be required. ePRO systems can also be designed to prevent out-of-range values, eliminating this data cleaning diagnostic. With direct patient data entry, the human error introduced by secondary data entry is eliminated. Extraneous / excess data would not exist in ePRO systems, as patients cannot provide two responses when only one is required or write in additional information.

Resolution involves handling identified issues: making corrections and marking data for exclusion from the analysis. When the original paper is lost, data may be excluded since accuracy cannot be confirmed. Because patients enter their responses directly in the ePRO system, the ePRO data is the source data, so verification can be done directly without having to locate an additional source.
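As a rough illustration of the querying step described above, the following minimal Python sketch flags several of the listed issue types in a handful of made-up paper diary records. The record layout, the field names (`patient`, `entry_date`, `pain_score`), the 0 to 10 score range, the one-entry-per-day rule, and the `run_queries` helper are all assumptions chosen for this example; they are not taken from the poster or from any particular ePRO or data management system.

```python
from datetime import date

# Hypothetical paper-diary records after secondary data entry.
# Field names and values are illustrative assumptions only.
RECORDS = [
    {"patient": "P01", "entry_date": date(2024, 3, 1), "pain_score": 4},
    {"patient": "P01", "entry_date": None, "pain_score": 5},
    {"patient": "P02", "entry_date": date(2024, 3, 2), "pain_score": 14},
    {"patient": "P02", "entry_date": date(2024, 5, 9), "pain_score": None},
    {"patient": "P03", "entry_date": date(2024, 3, 3), "pain_score": 2},
    {"patient": "P03", "entry_date": date(2024, 3, 3), "pain_score": 7},
]


def run_queries(records, window_start, window_end, score_min=0, score_max=10):
    """Return (record_index, issue) pairs that would be passed on to diagnosis."""
    findings = []
    seen = set()  # (patient, date) pairs already encountered
    for i, rec in enumerate(records):
        day = rec["entry_date"]
        score = rec["pain_score"]

        # Missing or out-of-range dates.
        if day is None:
            findings.append((i, "missing date"))
        elif not (window_start <= day <= window_end):
            findings.append((i, "out-of-range date"))

        # Missing responses or out-of-range values.
        if score is None:
            findings.append((i, "missing response"))
        elif not (score_min <= score <= score_max):
            findings.append((i, "out-of-range value"))

        # Extraneous data: a second entry for the same patient and day
        # (assumes one entry per patient per day is expected).
        key = (rec["patient"], day)
        if day is not None and key in seen:
            findings.append((i, "extraneous entry"))
        seen.add(key)
    return findings


FINDINGS = run_queries(RECORDS, date(2024, 3, 1), date(2024, 3, 31))
for index, issue in FINDINGS:
    print(index, issue)
```

Each flagged record would still need diagnosis against the original paper source. In an ePRO system configured as described above, most of these conditions would be blocked at the point of entry, so the corresponding queries would return nothing to diagnose.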


As PROs are used as study endpoints, high-quality data is critical. Data cleaning is an important process that must occur prior to analysis to ensure data integrity and reliable results. Low-quality data can negatively impact analysis, results, and costs. End-of-study time is precious; timely results are needed for submissions. When comparing the effort of data cleaning for paper versus ePRO, the difference is substantial. ePRO can ensure high-quality data and optimize timelines by reducing data cleaning time.

Cleaning paper PRO data can be a time-consuming process, as it is often like detective work: searching for missing pages, referring back to the original paper to decipher the correct responses, etc. While implementing a high-quality ePRO may take more time and money at the study’s start compared to paper, ePRO can reduce the time required for data cleaning by preventing errors and producing higher quality data with higher patient compliance. If ePRO systems are well-planned / implemented, a number of data cleaning processes are no longer needed.

The data cleaning process may identify a large amount of missing data. Missing data can also affect the data’s analyzability; if a large percentage of the data is missing, the impact can be significant. A number of data cleaning steps require going back to the original paper form to confirm a finding, which is a problem if the original paper form cannot be found or has been damaged, making confirmation impossible. If a finding from the queries cannot be confirmed by checking the original paper form, the data may have to be set to missing. Well-planned ePRO systems can produce higher quality data and higher patient compliance, so the amount of missing data is substantially reduced. When implementing ePRO, proper planning is needed to realize these advantages.

Additional authors include: Elisa Holzbaur, Tracey Rothrock.
