W-39: Using Synthetic Control Databases to Accelerate Indication-Specific Safety and Efficacy Evidence

Poster Presenter

      Colin Neate

      • Biostatistics Associate Director
      • Roche


We will describe results from an investigation of the potential for synthetic control databases (SCDs) of recent clinical trial data to support development in metastatic breast cancer (mBC) by providing detailed insights into disease characteristics, treatment response, outcomes and safety in this population.


Eligible studies were screened for patients matching selected mBC indications. Data were standardized into Study Data Tabulation Model (SDTM)-like domains across trials. Interactive tabular summaries and graphics, including forest plots and survival curves, were used for analysis.


Selected indications for this pilot were second line and higher (2L+) hormone receptor positive/HER2 negative (HR+/HER2-) and triple negative (TNBC) mBC. Phase II and III open-label studies that had completed their primary analysis were selected from the Medidata Enterprise Data Store (MEDS) of >6500 clinical trials with clinical data rights for de-identified aggregated analyses. The SCD for this pilot study is updated quarterly and currently contains 1201 patients (779 HR+/HER2-, 422 TNBC) enrolled between 2010 and 2017. The findings below reflect the SCD at the time of writing; the final poster will display results from the most recent update.

Patients in both groups had similar ECOG performance status (63% PS0, 36% PS1). All TNBC patients were female, with a median age at diagnosis of 42.5 years; 66.4% were Caucasian, 17.5% Asian and 5.2% African American. The HR+/HER2- group included 17 males and was slightly older (median = 47.6 years); 57.8% were Caucasian, 30.2% Asian and 2.3% African American. At study entry, TNBC patients were more likely than the HR+/HER2- group to have brain (17% vs 5%) or lung (50% vs 33%) metastases and less likely to have bone (33% vs 70%) or liver (23% vs 35%) metastases. Lab data such as baseline albumin and LDH are present for the majority of patients, but cancer antigen data (e.g., CA 15-3, CA 27-29) are limited (present for 26%, 10%).

In the TNBC cohort, median overall survival (OS) across treatment arms was 452 days; 4.3% had a complete response (CR) per RECIST 1.1 and 23.9% a partial response (PR); median progression-free survival (PFS) was 245 days. Median OS for the HR+/HER2- cohort was 678 days; 2.2% had a CR and 25.7% a PR; median PFS was 294 days.

The SCD allows investigation of specific subgroups and outcomes. For example, non-Hispanic/Latino TNBC patients with PS0 had better OS than those with PS1/2 (median 572 vs 359 days), but the opposite trend was seen in Hispanic/Latino patients (median OS for PS0 = 424 vs PS1/2 = 472 days).
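As an illustration of how subgroup medians such as those above can be derived from pooled patient-level records, the following is a minimal sketch of a Kaplan-Meier median estimate by subgroup. It is not the analysis code used for the SCD: the record layout, the field names (`ecog`, `os_days`, `death`) and all values are hypothetical, and the median is taken as the first event time at which the estimated survival probability falls to 0.5 or below.

```python
from collections import defaultdict
from fractions import Fraction


def km_median(durations, events):
    """Kaplan-Meier median survival time, or None if the median is not reached.

    durations: follow-up time in days for each patient
    events:    1 if death was observed, 0 if the patient was censored
    """
    deaths = defaultdict(int)  # observed deaths at each distinct time
    exits = defaultdict(int)   # patients leaving the risk set at each time
    for t, e in zip(durations, events):
        exits[t] += 1
        deaths[t] += e

    at_risk = len(durations)
    survival = Fraction(1)     # exact arithmetic avoids rounding at 0.5
    for t in sorted(exits):
        if deaths[t]:
            survival *= Fraction(at_risk - deaths[t], at_risk)
            if survival <= Fraction(1, 2):
                return t
        at_risk -= exits[t]
    return None


def median_os_by_group(records, group_key):
    """Group records by `group_key` and return the KM median OS per group."""
    groups = defaultdict(lambda: ([], []))
    for r in records:
        times, flags = groups[r[group_key]]
        times.append(r["os_days"])
        flags.append(r["death"])
    return {g: km_median(t, e) for g, (t, e) in groups.items()}


# Hypothetical records for illustration only; not data from the SCD.
records = [
    {"ecog": "PS0", "os_days": 300, "death": 1},
    {"ecog": "PS0", "os_days": 450, "death": 0},
    {"ecog": "PS0", "os_days": 600, "death": 1},
    {"ecog": "PS0", "os_days": 700, "death": 1},
    {"ecog": "PS1/2", "os_days": 150, "death": 1},
    {"ecog": "PS1/2", "os_days": 250, "death": 1},
    {"ecog": "PS1/2", "os_days": 400, "death": 0},
    {"ecog": "PS1/2", "os_days": 500, "death": 1},
]

medians = median_os_by_group(records, "ecog")
print(medians)
```

In practice an established survival analysis package would normally be preferred, since it also provides confidence intervals and handles ties and edge cases consistently.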


This ongoing pilot study in mBC indicates that synthetic control databases built from clinical trial data have the potential to aid research programs and future study designs by providing data-driven benchmarks of clinical response, outcomes and safety, insights into disease subsets, and correlations between short- and long-term endpoints. Contemporary clinical trial data are considered a potentially robust source of information for advancing drug development programs. Arguably, SCDs offer higher data quality than real-world data (RWD) and patients who more closely resemble those enrolled in other clinical trials; if such a data source, once created, is regularly updated as new data become eligible, it will remain relevant for research purposes.

While the data are robust in aggregate for the 2L+ mBC cohort, at this point the number of disease-specific variables available and the size of the SCD limit the insights that can be offered for questions relating to specific mBC subpopulations, such as outcomes in patients who received a particular prior treatment regimen as first-line therapy. Questions addressed by SCDs are also limited to the data collected in the underlying studies, which can vary substantially across sponsors. Similarly, because studies are heterogeneous in the type and format of data collected, significant manual review is required when creating the SCD (e.g., to map metastatic terms or to identify line of prior therapy). Such review may facilitate future supervised machine learning term assignments (a technology solution), while standardization initiatives for clinical trial data collection will facilitate more efficient creation and more accurate usage of multi-trial SCDs (a process solution). The ability to address these challenges will determine the longer-term feasibility of using SCDs to support drug development decisions.