T-26: Case Study: Computing Complexity Scores to Identify Patients of Interest from Forums for Safety and Beyond

      Amy L. Curry

      • Safety Evaluation and Risk Management Scientist
      • GlaxoSmithKline
        United States


Share a complexity scoring method used to analyze threaded forum data for two disease areas, in order to identify patients of interest for constructing disease journeys and investigating patient-related insights.


Deidentified data from 01Jan2015 – 01Nov2015 for 2,817 threads (21,313 posts) by 3,601 unique authors were reviewed. Linked records were created to follow author activity across the forum. To distinguish patients of interest, a complexity score composed of 28 indicators was computed for all authors.
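The record linkage described above can be illustrated with a minimal sketch. It assumes each deidentified post carries an author identifier and a thread identifier; the `Post` class, its field names, and both helper functions are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    # Hypothetical field names; the abstract only states that authors
    # were assigned unique identifiers after deidentification.
    author_id: str
    thread_id: str
    text: str


def link_posts_by_author(posts):
    """Group posts by author so activity can be followed across threads."""
    linked = defaultdict(list)
    for post in posts:
        linked[post.author_id].append(post)
    return dict(linked)


def threads_per_author(linked):
    """Return the set of distinct threads each author posted in."""
    return {
        author: {post.thread_id for post in author_posts}
        for author, author_posts in linked.items()
    }
```

Grouping by author rather than by thread is what lets indicators found in separate discussions accumulate into a single author-level record.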


Social media posts from a patient-centric forum on rheumatoid arthritis (RA) and scleroderma (SS) (01Jan2015 – 01Nov2015) were reviewed to investigate the patient-disease journey. Data were deidentified; however, authors were assigned unique identifiers to link their activity across discussion threads. A random sample of 2,817 threads (50.2%) consisting of 21,313 posts (34%) was reviewed, of which 5,621 posts (26.4%) were medically relevant. Of the 21,313 posts, 5,559 (26%) were identified as being authored by the patient, 351 (1.6%) by family members, and 15,342 (72%) by authors of unknown designation, while the remaining posts (< 1%) were made by healthcare providers, caregivers and friends. Patient diagnosis was classified by the following hierarchy: “yes, both”, “yes, RA”, “yes, SS”, “probable, RA”, “probable, SS”, “yes, other”. Since a disease diagnosis might not be ascertainable from a single post, we reviewed the entire history of an author’s activity in forum discussions and selected the highest diagnosis identified during review for categorizing patients of interest. To distinguish patients of interest, a complexity score composed of 28 indicators was computed. Each indicator (e.g. disease burden, disability status, adherence concerns and socio-economic status) contributed a weight of 1 to the complexity score. Two rare indicators, disease duration and participation in a clinical trial, were instead assigned a weight of 2. Fifteen patients of interest were identified using the following criteria: a score greater than or equal to 14, and a minimum of four posts across discussion threads. These criteria could be adjusted to capture a wider or narrower range of patients. Complexity scores ranged as follows: “yes, both” (1-20), “yes, RA” (1-16), “yes, SS” (1-15), “probable, RA” (1-8), “probable, SS” (1-8), “yes, other” (1-8). The highest complexity score we encountered was 20, and belonged to a patient with both RA and SS.
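The scoring and selection rules above can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the full list of 28 indicators is not given in the abstract, so the indicator names, the `AuthorRecord` class, and the function names are hypothetical. Only the weights (1, or 2 for the two rare indicators), the diagnosis hierarchy, and the thresholds (score of at least 14, at least four posts) come from the text.

```python
from dataclasses import dataclass

# Diagnosis hierarchy as listed in the abstract, highest first.
DIAGNOSIS_HIERARCHY = [
    "yes, both", "yes, RA", "yes, SS",
    "probable, RA", "probable, SS", "yes, other",
]
RANK = {label: i for i, label in enumerate(DIAGNOSIS_HIERARCHY)}

# The two rare indicators carry a weight of 2 (names are hypothetical).
RARE_INDICATORS = {"disease_duration", "clinical_trial_participation"}


@dataclass
class AuthorRecord:
    author_id: str
    post_count: int   # posts across discussion threads
    diagnoses: list   # diagnosis labels assigned during review
    indicators: set   # names of indicators observed for this author


def highest_diagnosis(labels):
    """Pick the highest-ranked diagnosis seen across an author's history."""
    ranked = [label for label in labels if label in RANK]
    return min(ranked, key=RANK.__getitem__) if ranked else None


def complexity_score(indicators):
    """Sum indicator weights: 2 for rare indicators, 1 for all others."""
    return sum(2 if name in RARE_INDICATORS else 1 for name in indicators)


def patients_of_interest(records, min_score=14, min_posts=4):
    """Select authors meeting both the score and post-count thresholds."""
    return [
        record for record in records
        if complexity_score(record.indicators) >= min_score
        and record.post_count >= min_posts
    ]
```

Keeping `min_score` and `min_posts` as parameters mirrors the remark that the criteria can be adjusted to capture a wider or narrower range of patients.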


Threaded forum data can be leveraged to investigate patient insights including author designation (e.g. patient, family member, etc.), diagnosis, disease duration, clinical trial participation, disease burden and disability status as they relate to two autoimmune diseases of interest. With threaded data, it was possible to ascertain the highest level of disease diagnosis (e.g. awaiting diagnosis, diagnosed, etc.) and to follow a patient’s voice through the forum as he/she progressed along a disease journey. We were able to identify patients of interest by computing and applying a weighted complexity score. This score correlated with the richness of an author’s cumulative posting record in the online discussion forum. By examining more closely the posts within threads for the identified patients of interest (i.e. those authors with the highest complexity scores and at least four posts across threads), we concluded that there is sufficient content to create patient disease journeys, which would be helpful not only to Global Clinical Safety and Pharmacovigilance but also to other groups in GlaxoSmithKline. Additional research is necessary to construct and leverage disease journeys more efficiently for identified patients of interest.

Co-authors: Thomas M (1), Curry A (2), Painter J (3), Akhtar A (4), Schifano L (2), Powell GE (2)
(1) GlaxoSmithKline, Collegeville, PA, USA; (2) GlaxoSmithKline, Research Triangle Park, NC, USA; (3) JiveCast, Raleigh, NC, USA; (4) ZeroChaos, Orlando, FL, USA