Can endpoint terms be extracted from clinical trial registries and normalized to build visualizations that facilitate trial planning and competitive intelligence?
We searched both US and EU registries for two diseases (Merkel cell carcinoma and progressive multiple sclerosis) and combined the data into a single table for each disease. We used text-mining tools to extract and normalize endpoint terms and then created visualizations to facilitate analysis.
Endpoints or outcomes in the US and EU trial registries are not entered using a controlled terminology. Even for established endpoints, we found variations in wording across trial records and between different registry records for the same trial. Using text-mining software, we were able to identify endpoint term variations. By building a thesaurus, we could then normalize these variations to a single term for each key endpoint. As expected, some endpoints were specific to a particular trial. These are potentially useful for differentiating trials, but were not included in the trial comparison analysis.
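The thesaurus-based normalization can be illustrated with a minimal sketch. It assumes a simple dictionary that maps each normalized term to a handful of wording variants and matches them as whole words, ignoring case. The thesaurus entries and the function name `normalize_endpoints` are hypothetical and chosen for illustration; they are not the thesaurus or the text-mining software used in this work.

```python
import re

# Hypothetical thesaurus: normalized term -> illustrative wording variants.
# A real thesaurus would be built from the variations found in registry records.
THESAURUS = {
    "Overall Survival": ["overall survival", "os"],
    "Progression-Free Survival": [
        "progression-free survival",
        "progression free survival",
        "pfs",
    ],
    "Objective Response Rate": ["objective response rate", "orr"],
}

# One case-insensitive, whole-word pattern per normalized term.
PATTERNS = {
    term: re.compile(
        r"\b(?:" + "|".join(re.escape(v) for v in variants) + r")\b",
        re.IGNORECASE,
    )
    for term, variants in THESAURUS.items()
}


def normalize_endpoints(text):
    """Return the sorted normalized endpoint terms found in free text."""
    return sorted(term for term, pattern in PATTERNS.items() if pattern.search(text))
```

Text that matches no thesaurus entry returns an empty list, which is one simple way to flag trial-specific endpoints for separate review.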
We then used the extracted and normalized endpoint terms to build visualizations that provide an overview of the trial landscape and support comparisons between trials.
Different visualizations are better suited to answering different types of questions. To look for developments over time, we can build a bubble chart or a clustered column chart of endpoints by trial start year. Are regulatory differences between countries reflected in trial endpoints? We can create world maps to compare different endpoints. Is endpoint timing consistent? We can plot endpoint timing by sponsor to check. Using a trial timeline, we can focus on a particular drug or sponsor to see whether endpoint differences reflect a possible change in strategy.
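As a rough sketch of the data preparation behind an endpoints-by-start-year chart, the following counts trials per start year and normalized endpoint. The record layout (`start_year`, `endpoints`), the function name, and the example records are assumptions made for illustration and are not data from this study; the resulting counts could be passed to any charting tool.

```python
from collections import Counter


def endpoint_counts_by_year(trials):
    """Count trials per (start year, normalized endpoint) pair."""
    counts = Counter()
    for trial in trials:
        # set() counts a trial once per endpoint even if the term repeats.
        for endpoint in set(trial["endpoints"]):
            counts[(trial["start_year"], endpoint)] += 1
    return counts


# Hypothetical records, for illustration only.
example_trials = [
    {"id": "trial-A", "start_year": 2015, "endpoints": ["Overall Survival"]},
    {
        "id": "trial-B",
        "start_year": 2015,
        "endpoints": ["Overall Survival", "Objective Response Rate"],
    },
    {"id": "trial-C", "start_year": 2017, "endpoints": ["Objective Response Rate"]},
]

for (year, endpoint), n in sorted(endpoint_counts_by_year(example_trials).items()):
    print(year, endpoint, n)
```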
Trial endpoints are key to both trial planning and competitive intelligence. However, the lack of a controlled vocabulary for endpoints and endpoint timing in both US and EU trial registries presents a challenge for useful analysis. A manual approach requires reviewing trial records from different registries, looking for relevant endpoint text. We successfully used software tools to identify established endpoints from unstructured text and created visualizations to support endpoint analysis for clinical trial planning, regulatory strategy, and competitive intelligence.