
W 35: Comparison of Manual Versus Automated Redaction Techniques for Clinical Submission Documents

Poster Presenter

      Rashmi Dodia

      • Regulatory Operations Specialist II
      • MMS Holdings, Inc.
        United States


This study compared the quality and efficiency of a primarily manual redaction process using Adobe Professional software with those of an automated process using advanced commercial redaction tools.


ORAL PRESENTATION SCHEDULED: Session 2B at 12:30-12:40 PM

A clinical study report was redacted using both a manual and an automated process. Quality and efficiency measures were calculated for both processes and compared.


Redacting the clinical study report with the manual process, which combined manual scientific review with only the basic redaction tools and standard functionality of Adobe Professional v.10, took significantly more total process time (more than 50%) than the automated process, which used advanced commercial redaction software followed by data cleanup and manual scientific review. The manual process produced significantly fewer total initial redactions (roughly seven times fewer) and no false hits, largely because Adobe Professional alone offers limited automated search functionality for systematically detecting keywords related to company confidential information, text patterns such as subject numbers, personal contact information such as email addresses, postal addresses and telephone numbers, or dates in the document text. Because the automated process produced a pre-redacted report based on advanced keyword, string and pattern searching, false hits that subsequently required cleanup and removal were prevalent. Quality, measured by the total number of errors and omissions identified during a manual quality review step, was similar regardless of the redaction method used. This indicates that overall quality was not reduced by the automated process, despite the dramatic decrease in total process time. Quality improved with the automated process when values from the SDTM datasets for key variables such as subject identifier, lot/batch numbers, staff names and contact information, treatment dates, and medical history and physical examination terms were used as input to the commercial redaction software tool. However, this approach required the most up-front time to gather study- and product-specific requirements and enter them into the commercial software tool, and it produced the most false hits, particularly for medical history and physical examination terms.
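The kind of pattern searching described above can be illustrated with a minimal Python sketch. It is not the commercial tool used in the study; the pattern set, the subject number format (four-digit site, hyphen, four-digit subject) and the sample text are all assumptions made for illustration.

```python
import re

# Illustrative patterns only; real identifier and date formats vary by study.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "date": re.compile(
        r"\b\d{1,2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{4}\b",
        re.IGNORECASE,
    ),
    # Hypothetical subject number format: 4-digit site, hyphen, 4-digit subject.
    "subject_id": re.compile(r"\b\d{4}-\d{4}\b"),
}


def find_candidates(text):
    """Return (start, end, label, matched_text) tuples sorted by position."""
    hits = []
    for label, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append((m.start(), m.end(), label, m.group()))
    return sorted(hits)


# Made-up sample text; the last sentence contains a deliberate false hit.
sample = (
    "Subject 1001-0042 was enrolled on 12 Mar 2014. "
    "Contact j.smith@example.com or 555-123-4567. "
    "See the 2013-2014 annual report."
)

for start, end, label, matched in find_candidates(sample):
    print(f"{label:<10} [{start}:{end}] {matched}")
```

In this sample the year range "2013-2014" also matches the assumed subject number pattern. That is the sort of false hit a cleanup and review step would need to remove.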


The most effective and efficient process for redacting clinical submission documents appears to be a hybrid approach: an advanced commercial redaction tool that takes input from the SDTM datasets for key variables, followed by full scientific review and quality control. In this hybrid process, powerful automated text searching algorithms locate potential areas for redaction based on the known key inputs and produce a pre-redacted report. A trained scientific reviewer and a quality control analyst then review that report manually to ensure that variations in the presentation of material that are difficult to predict and program into an automated tool are not missed, and that false hits produced by the automated tool are removed or adjusted as needed. While Adobe Professional includes adequate tools for performing redaction, it lacks many of the advanced text and pattern searching features of the commercial redaction tools on the market, which make complicated text, string and pattern searching much more automated and precise. Most commercial redaction tools surveyed for this study include functionality to easily incorporate SDTM output into redaction algorithms, and the results seen in testing were promising. Further exploration of the synergies that can be gained by performing data anonymization and clinical document redaction in parallel is needed to determine the full extent of the efficiencies that can be achieved.
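The hybrid workflow can be sketched in a few lines of Python, under stated assumptions. The study does not describe how the commercial tools ingest SDTM output, so the functions below (`terms_from_records`, `propose_redactions`, `apply_redactions`), the record layout and the reviewer decisions are hypothetical. `USUBJID` and `MHTERM` are standard SDTM variable names, and the values shown are invented.

```python
import re


def terms_from_records(records, variables):
    """Collect distinct non-empty values of the chosen SDTM variables."""
    terms = set()
    for row in records:
        for var in variables:
            value = str(row.get(var, "")).strip()
            if value:
                terms.add(value)
    return terms


def propose_redactions(text, terms):
    """Find whole-term matches; each candidate awaits a reviewer decision."""
    if not terms:
        return []
    # Longest terms first so a longer term wins over a shorter one it contains.
    ordered = sorted(terms, key=len, reverse=True)
    alternatives = "|".join(re.escape(t) for t in ordered)
    pattern = re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)
    return [
        {"start": m.start(), "end": m.end(), "text": m.group(), "accepted": None}
        for m in pattern.finditer(text)
    ]


def apply_redactions(text, candidates):
    """Replace accepted candidates, working backwards so offsets stay valid."""
    for c in sorted(candidates, key=lambda c: c["start"], reverse=True):
        if c["accepted"]:
            text = text[: c["start"]] + "[REDACTED]" + text[c["end"] :]
    return text


# Invented SDTM-like rows; real data would come from the study datasets.
records = [{"USUBJID": "ABC-101-0042", "MHTERM": "headache"}]
report = (
    "Subject ABC-101-0042 reported headache. "
    "Headache is a common adverse event."
)

candidates = propose_redactions(report, terms_from_records(records, ["USUBJID", "MHTERM"]))

# Hypothetical reviewer decisions: keep the first two, reject the general statement.
for candidate, decision in zip(candidates, [True, True, False]):
    candidate["accepted"] = decision

redacted = apply_redactions(report, candidates)
print(redacted)
```

Keeping each candidate's decision separate from the search step mirrors the division of labor described above: the automated search proposes, and the scientific reviewer removes false hits such as a medical history term appearing in a general statement.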
