How Life Sciences is Leveraging Machine Learning: 5 Use Cases in 5 Minutes

By Mike Munsell, PhD

It’s no secret that machine learning is establishing itself as a powerful tool in the healthcare analytics industry. However, there is still some uncertainty around how to employ it. In each edition of Insights & Innovators: 5 Machine Learning Use Cases in 5 Minutes, we take a look at five recent studies that used machine learning (ML) to analyze real-world data.

In this edition of 5 in 5: 5 Machine Learning Use Cases in 5 Minutes, I’m bringing you a variety of use cases, including identifying risk factors for HIV in underrepresented populations, determining time to treatment failure in heart failure, identifying lung cancer risk among younger adults, predicting in-hospital delirium, and evaluating a patient’s risk for stroke recurrence based on routinely collected data. The use cases differ, but all leverage the power of ML to examine real-world data and generate predictive insights. Read on.

  1. Duke University and Harvard Medical School used machine learning to develop a predictive model for HIV risk that identifies important risk factors and performs well among populations that are typically underrepresented in HIV data, including female patients. Risk scores from the model may help to better identify patients who could benefit from HIV pre-exposure prophylaxis (PrEP) in areas with a high incidence of infection, particularly among females, such as the U.S. South.
  2. Washington University in St. Louis School of Medicine used EHR data to train a deep learning model that was able to accurately identify time to treatment non-response (death or severe decompensation) among patients with heart failure. The algorithm makes use of vital signs, laboratory results, and medication data over time and could help clinicians identify candidates for surgical intervention earlier and more precisely.
  3. Johnson & Johnson and the University of Pennsylvania developed a machine learning model that can identify younger patients (45 to 65 years of age) at high risk of being diagnosed with lung cancer within 3 years. The model, which was trained on a large real-world dataset and validated with EHR data from a Midwestern healthcare system, could improve the ability of clinicians to identify and screen individuals at early stages of the disease.
  4. Vanderbilt University generated machine learning models that were able to predict delirium within 6 hours of onset among hospitalized adult patients with higher accuracy than traditional statistical methods. The algorithm, which was trained using real-world EHR data, could aid in predicting delirium earlier than current confusion assessment methods, improving outcomes among hospitalized patients.
  5. Novartis used machine learning and data from the Erlangen Stroke Registry to develop a predictive model for evaluating a patient’s risk of recurrent stroke within one year of discharge. In addition to providing improved predictive performance relative to standard risk score assessments, such as the Essen Stroke Risk Score, the model includes only factors that can be obtained during routine home-based care. This is in contrast to other predictive models, which rely on detailed clinical or imaging data and may be difficult to implement in practice.

As you can see, there is no shortage of use cases for machine learning, especially when it comes to healthcare analytics. Panalgo’s IHD Data Science module is a powerful machine learning analytics tool built on our self-service, point-and-click platform. It allows users to uncover new insights and produce more accurate prediction and segmentation models, similar to the analyses outlined above. If you would like to learn more about how you can leverage machine learning analytics with the IHD Data Science module, contact us today.


Mike Munsell, PhD, is Director of Research at Panalgo, where he manages the research agenda for scientific dissemination and software development in a variety of fields including health economics, data science/machine learning, and epidemiology. Before becoming Director of Research, Mike was a data scientist at Panalgo, working with engineering teams to prototype, code, and validate new machine learning models and features for Panalgo’s IHD Data Science platform. He has over 10 years of experience as a health economist and data scientist and is a published thought leader in the space.