Pharm Exec speaks to Mark Lambrecht, Director of the Health and Life Sciences Global Practice at SAS, about the growing role of big data analytics in pharma and how the industry is evolving to keep pace with it.
Mark Lambrecht: Many companies now have strategies in place to deal with digital health data, and those that are most successful center around the patient and the science itself. They are following where the data is coming from and how it can help them move forward with clinical development or gain an understanding of what value their therapy brings to the market.
The biggest pitfall is where companies develop analytical platforms on an “island”, with data lakes or warehouses being set up without any real purpose established beforehand. When that happens, it’s difficult to ensure that the people who really need those insights from the data can access them. What works better are smaller, iterative and experimental projects, which run with a very specific purpose and respond to specific problems, such as predicting the enrollment of patients in a clinical trial using certain algorithms. This is where the data analysis becomes valuable, because it's going to be used quickly and in a production context. You work your way back from the problem and then ask, “What data do I need and how can I collect that?” Those companies where the scientists and the technologists are working together on this are the most successful.
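As a purely illustrative sketch of the kind of enrollment-prediction problem mentioned above, the snippet below fits a simple count model to invented site-level data. Every column name and figure is hypothetical; a real project would work from the sponsor's own historical trial data.

```python
# Hypothetical sketch: predicting how many patients each trial site will enroll
# from a few site-level features. All data and column names are invented.
import pandas as pd
from sklearn.linear_model import PoissonRegressor

# Invented historical site data of the sort a sponsor might have on hand.
history = pd.DataFrame({
    "site_patient_pool": [1200, 800, 3000, 500, 2200, 950],
    "investigators":     [3, 2, 6, 1, 5, 2],
    "months_active":     [12, 9, 18, 6, 15, 8],
    "patients_enrolled": [34, 18, 95, 7, 61, 21],   # outcome to predict
})

X = history[["site_patient_pool", "investigators", "months_active"]]
y = history["patients_enrolled"]

# Poisson regression is a natural baseline for count outcomes like enrollment.
model = PoissonRegressor(max_iter=300).fit(X, y)

# Forecast enrollment for two new candidate sites.
candidates = pd.DataFrame({
    "site_patient_pool": [1500, 700],
    "investigators":     [4, 2],
    "months_active":     [12, 12],
})
print(model.predict(candidates))
```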
A couple of years ago, companies began establishing in-house capabilities and warehouses and started bringing the data in. This is now maturing to the point where organizations understand which techniques they want to use and are putting in place different architectures and technologies for different purposes. But it’s about more than just a platform; it's also about the ability to govern the lifecycle of the data, to make sure the information is always up to date and that you can trace the insights back to the original data and validate them. In that sense, I see a lot more realism coming along.
Smaller biotech or pharma companies, with smaller budgets to license commercial real-world data sets, can be more innovative, because they have to be more creative and look for more automation. Automation is definitely a big driver in making the best use of the data and the best use of AI, whether that's natural language processing, image analysis, or newer techniques. A company may want to use text analytics and natural language processing, for example, to help it detect serious adverse events in pharmacovigilance reports. While big pharma companies have armies of people to manually filter out the meaningful signals, small companies don't have that resource, so they want to deploy AI techniques on those big data sets. That leads to more efficient ways of looking at information and helps ensure that their scientists are working on high-priority problems.
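For illustration only, here is a minimal sketch of the text-analytics approach described above: a simple classifier that flags pharmacovigilance narratives that may describe serious adverse events. The narratives and labels are invented, and any production system would need far more data and formal validation.

```python
# Hypothetical baseline: TF-IDF features plus a linear classifier to flag
# case narratives that may describe serious adverse events. Data is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

narratives = [
    "Patient hospitalized with severe hepatic failure after third dose.",
    "Mild headache reported, resolved without intervention.",
    "Anaphylactic reaction requiring emergency treatment.",
    "Transient nausea on day two, no action taken.",
    "Life-threatening arrhythmia observed during infusion.",
    "Slight injection-site redness, self-limiting.",
]
serious = [1, 0, 1, 0, 1, 0]  # 1 = serious adverse event

# A deliberately simple pipeline that a small team could stand up quickly
# before moving on to heavier NLP models.
pipeline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
pipeline.fit(narratives, serious)

new_report = ["Patient admitted to ICU with respiratory failure."]
print(pipeline.predict_proba(new_report))  # probability the case is serious
```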
For us, nothing happening around big data is new, because we have always dealt with these types of data. What has changed is the size and variety of the data. We have expanded our platform in recent years, so we can now analyze video, image, and text data and, more importantly, combine these information silos with different types of clinical and patient data. And as we’ve seen more organizations using and deploying different types of analytics, we understand that we need to work in a hybrid environment, where we embrace different technologies, cloud, and open-source programming languages, offering control and choice when a user needs it.
But it’s not just about the different types of data; it's also about the ability to compare different datasets. As more data becomes available that requires analysis, we keep investing in and evolving our capabilities. There's a lot of data sitting in isolation that is not standardized; this can be the biggest hindrance to deploying AI and analytics broadly. Pulling that data into a unifying platform where it can be controlled and governed is where we feel we make a tangible difference for the industry.
I think there are two conflicting trends. There is more policy and more data than ever before, and if you look at the different healthcare systems that pharma companies are interested in, there's a lot of divergence. Some countries are way ahead in standardizing their healthcare data; others have more fragmented digital healthcare systems. Variation in how data is structured and how datasets relate to one another is a big problem, as is the absence of mature healthcare standards. There are mitigating techniques, such as knowledge graphs and linked data, but, unfortunately, these do not solve everything and have their own inherent limitations. The likely solution here will come from regulatory, technical, and domain standards bodies like HL7, NCI, WHO, and CDISC working together more closely.
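To make the knowledge-graph idea concrete, here is a minimal, hypothetical sketch using the rdflib Python library: records from two differently coded sources are expressed as triples, and mapping triples link the two coding systems so a single query can span both. All URIs and codes are invented for illustration.

```python
# Hypothetical linked-data sketch: two sources use different condition codes,
# and mapping triples make them queryable as one graph. URIs are invented.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")
g = Graph()

# Source A uses an ICD-10-style code; source B uses a local term.
g.add((EX.patient_A1, EX.hasCondition, EX.code_E11))
g.add((EX.patient_B7, EX.hasCondition, EX.code_T2DM))

# Mapping triples recording that both codes denote the same concept.
g.add((EX.code_E11, EX.denotes, EX.Type2Diabetes))
g.add((EX.code_T2DM, EX.denotes, EX.Type2Diabetes))

# One SPARQL query now finds every patient with type 2 diabetes,
# whichever coding system their source used.
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?patient WHERE {
        ?patient ex:hasCondition ?code .
        ?code ex:denotes ex:Type2Diabetes .
    }
""")
for row in results:
    print(row.patient)
```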
But the other trend is a positive one: globalizing policies are also driving harmonization. One example is the EU’s GDPR, which is having a global effect on the way people think about patient privacy. One of its downstream effects will be data harmonization and standardization. The FAIR (Findability, Accessibility, Interoperability, and Reusability) data principles that originated in academia are also becoming more important. Pharma organizations and industry representative bodies like EFPIA are doing a lot to make their clinical trial data available to external academic researchers. But there's a lot of work to be done in making data available across the different stakeholders in healthcare ecosystems and increasing collaboration.
In a couple of years, no clinical trial will be run without first consulting real-world data. It will be part of the whole clinical development effort, from modeling and simulation and predicting where you need to go with your trial in terms of geography and therapeutic area, to really understanding what the medicine under investigation does for patients. Using that kind of real-world data, generated from an abundance of sources (video data, demographic data, claims data, financial data), will help companies understand more about the average patient and create therapies that improve patient outcomes and are impactful for society.
Another trend I anticipate will involve more data from n-of-1 trials, or single-patient trials, which are randomized controlled crossover trials in a single patient. They investigate the efficacy or side-effect profiles of different interventions, with the goal of determining the optimal intervention for an individual patient using objective, data-driven criteria. The aim is for every patient to have a therapy geared towards their genetic background and their disease, and for patients to be matched with the right trial and the right therapies. These n-of-1 trials offer the opportunity to gather a lot of data, such as genomics and proteomics data, and do a lot of real-time analytics.
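As a hypothetical illustration of the data-driven comparison an n-of-1 crossover design supports, the sketch below compares paired outcome measurements from alternating treatment periods in a single patient; the scores are invented.

```python
# Hypothetical n-of-1 analysis: one patient alternates between two interventions
# across several crossover blocks, and the paired outcomes are compared directly.
import numpy as np
from scipy.stats import wilcoxon

# Weekly symptom scores (lower is better) for one patient over six blocks.
score_on_drug_A = np.array([6.1, 5.8, 6.4, 5.9, 6.2, 6.0])
score_on_drug_B = np.array([4.9, 5.1, 4.6, 5.0, 4.8, 5.2])

statistic, p_value = wilcoxon(score_on_drug_A, score_on_drug_B)
print(f"median difference (A - B): {np.median(score_on_drug_A - score_on_drug_B):.2f}")
print(f"Wilcoxon signed-rank p-value: {p_value:.3f}")
# A consistent, clinically meaningful difference would favor drug B for this patient.
```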
One more thing to watch is the application of AI and analytics to streaming data, such as video. Streaming analytics will become more pervasive. Take a hospital that is using a robot for a surgical procedure, for example. A lot of video data comes from that procedure. Streaming analytics can be applied to that video data to help support the physician as he or she performs the surgery. In a similar manner, streaming analytics can be used for clinical development. It will help companies to trim down and keep just the data and information that is relevant. Not every data point is valuable and needs to be stored for eternity.
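As a simplified, hypothetical sketch of the "keep only what is relevant" idea: events arrive as a stream, each is scored in flight, and only those above a relevance threshold are retained. The scoring function here is a stand-in for whatever trained model would actually be applied to the video or sensor data.

```python
# Hypothetical streaming filter: score each event as it arrives and persist
# only the relevant ones, instead of storing every data point forever.
from typing import Iterable, Iterator

def relevance_score(event: dict) -> float:
    # Placeholder scorer; a real system would run a trained model here.
    return event.get("motion", 0.0)

def filter_stream(events: Iterable[dict], threshold: float = 0.8) -> Iterator[dict]:
    """Yield only the events worth storing; everything else is dropped in flight."""
    for event in events:
        if relevance_score(event) >= threshold:
            yield event

# Simulated stream: most frames are uneventful and never get persisted.
incoming = ({"frame": i, "motion": 0.95 if i % 50 == 0 else 0.1} for i in range(200))
kept = list(filter_stream(incoming))
print(f"retained {len(kept)} of 200 events")
```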