How pharma companies are leveraging natural language processing (NLP) for rapid and effective handling of unstructured text.
Safety issues around drugs can appear at any stage, from initial lead candidate through preclinical and clinical development to post-marketing surveillance. At every stage, critical data is both generated in and sought from unstructured text: internal safety reports, scientific literature, individual case safety reports, clinical investigator brochures, patient forums, social media, and conference abstracts. Intelligent search across these hundreds of thousands of pages can provide the information needed for key decision support.
This is where artificial intelligence (AI) technologies come in. One of the key tools for rapid and effective handling of unstructured text is natural language processing (NLP). NLP is an AI technology that interprets written text and rapidly transforms the key content of text documents into quantitative, actionable insights. The resulting structured data can be visualized and analyzed at every stage of a drug's safety lifecycle.
Merck MSD for non-clinical safety
At Merck MSD, the Safety Assessment and Laboratory Animals Resources (SALAR) division helps advance high-quality drug candidates into development by defining the non-clinical safety and selectivity of lead compounds. Merck uses NLP technologies in an automated workflow to extract unstructured conclusions and interpretations from final study reports, antemortem reports, post-mortem reports and protocols stored in a Documentum-based electronic official file repository. NLP rules are able to identify, extract, and normalize study annotation metadata and organ pathology findings. The results are combined with structured output, loaded into a SALAR knowledgebase, and visualized via dashboards for the safety assessment teams. A recent paper on the work concluded that this text mining pipeline enabled the authors to more effectively identify toxicities that were not seen in three-month studies, but became apparent only in chronic testing.
Agios Pharmaceuticals for clinical safety
At Agios, Director of Informatics Stuart Murray has been using NLP technologies to get decision support as fast and as comprehensively as possible. “We’ve used NLP from very early exploratory research to discover targets for our pipeline through to pre-clinical development looking for safety signals, and now most recently for pharmacovigilance to understand what is going on in our clinical trials.” In clinical safety workflows, NLP is being used to mine adverse event reports, extract case data from call center records, and assist with initial coding of reported events and WHO drugs. A recent use case explored the risk of a rare (and potentially life-threatening) adverse event, Differentiation Syndrome, in patients in trials of Agios’s IDH1 inhibitor AG120. The extracted data was visualized as networks in Cytoscape, which enabled clinicians to explore the patterns of symptoms across patients and identify those at risk.
Pharmacovigilance at GlaxoSmithKline
The increasing prevalence of real-world data for pharmacovigilance and post-market surveillance provides pharma companies and healthcare organizations with a rich seam of data to monitor and mine. Regulations require all pharma companies to monitor the scientific literature regularly for, among other things, possible adverse events that aren’t reported on drug labels.
At GlaxoSmithKline, for example, clinical safety team members regularly review medical literature to identify relevant safety signals. They calculated that daily monitoring of the literature for just 20 marketed products typically reveals an average of 60 new references per day, and that each abstract takes 1.2 to 1.6 minutes to review manually, for a total of roughly 72 to 96 minutes each day. A typical pharma company may have 200 marketed products in its portfolio, so the review process for pharmacovigilance can be hugely time consuming for most organizations.
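The scale of that workload can be estimated with a back-of-the-envelope calculation. The sketch below uses only the figures quoted above and assumes, purely for illustration, that the number of new references grows in proportion to the number of products monitored. That proportionality is an assumption, not a figure reported by GlaxoSmithKline, and the function name is hypothetical.

```python
# Figures quoted in the article.
REFS_PER_DAY = 60                  # new references per day for 20 products
PRODUCTS_MONITORED = 20
MINUTES_PER_ABSTRACT = (1.2, 1.6)  # low and high manual review times


def daily_review_minutes(products, minutes_per_abstract):
    """Estimate daily manual review time in minutes.

    Assumption (illustrative only): the number of new references
    scales linearly with the number of products monitored.
    """
    refs = REFS_PER_DAY * products / PRODUCTS_MONITORED
    return refs * minutes_per_abstract


for products in (20, 200):
    low, high = (daily_review_minutes(products, m) for m in MINUTES_PER_ABSTRACT)
    print(f"{products} products: {low / 60:.1f} to {high / 60:.1f} hours per day")
```

Under that linear assumption, a 200-product portfolio would imply on the order of 12 to 16 hours of manual abstract review per day, which is why automation becomes attractive.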
GlaxoSmithKline researchers are leveraging NLP to boost search efficiency. Instead of manually searching documents for adverse events, the company uses NLP tools to find events in minutes rather than hours. In addition, GlaxoSmithKline uses linguistic processing to improve search specificity by identifying genuine relationships between a drug and an adverse event.
For example, in a single manual search to find adverse events associated with the selective androgen receptor modulator Enobosarm (an investigational drug also known as MK-2866 or Ostarine), GlaxoSmithKline researchers pulled 132 abstracts. After a three-hour manual review, researchers found that only about 30% of the abstracts were relevant and actually described an association with an adverse event. A similar search using NLP tools took just minutes and provided a structured results table for rapid final review.
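To make the idea of relationship-aware search concrete, here is a minimal sketch of one simple technique: keeping only sentences in which a drug name and an adverse event term occur together, and emitting the hits as rows of a results table. This is not the method or tooling used by GlaxoSmithKline; the term lists, the placeholder drug "DrugX", and the sample abstracts are all hypothetical and invented for illustration. Real systems rely on curated vocabularies and deeper linguistic analysis, for instance to handle negation.

```python
import re

# Hypothetical term lists; a real system would use curated vocabularies.
DRUG_SYNONYMS = {"drugx": "DrugX", "dx-001": "DrugX"}
ADVERSE_EVENT_TERMS = {"headache", "nausea", "liver injury"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_associations(abstract_id, text):
    """Return (abstract_id, drug, event, sentence) rows for every sentence
    that mentions both a drug synonym and an adverse event term.

    Uses plain substring matching, so it ignores negation and word boundaries.
    """
    rows = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        drugs = {name for syn, name in DRUG_SYNONYMS.items() if syn in lowered}
        events = {term for term in ADVERSE_EVENT_TERMS if term in lowered}
        for drug in sorted(drugs):
            for event in sorted(events):
                rows.append((abstract_id, drug, event, sentence))
    return rows


# Made-up abstracts, for illustration only.
abstracts = {
    "A1": "DrugX was given to 40 volunteers. "
          "Nausea was reported with DX-001 at the highest dose.",
    "A2": "DrugX increased lean body mass. "
          "Headache is common in the general population.",
}

table = [row for aid, text in abstracts.items()
         for row in find_associations(aid, text)]
for row in table:
    print(row)
```

A plain keyword search would return both abstracts, since each mentions the drug and an adverse event term somewhere. Requiring the two to appear in the same sentence keeps only A1, which is the kind of specificity gain, in a very simplified form, that the structured results table is meant to deliver.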
Safety is assessed at all stages in the life history of a drug. The ultimate test comes only after the drug has been marketed and used in a clinical setting in many thousands of patients, across broader indications, and in combinations with other drugs. There will always be risks; but the more data that can be gathered, analyzed, and transformed into actionable information, the greater the chance of lowering that risk. Terminating an unsafe drug project as early as possible in the discovery pipeline can save time, money, and possibly lives. Who can argue with those benefits?
Jane Z. Reed, PhD, is Head of Life Science Strategy at Linguamatics, a provider of NLP services.