Predictive analytics is a branch of advanced analytics used to make predictions about unknown future events. It draws on techniques from data mining, statistics, modeling, machine learning, and artificial intelligence to analyze current data. The predictions are based on patterns found in historical and transactional data and are used to identify possible future opportunities, trends, and risks. Through predictive analytics, organizations work to increase their revenue, create or extend a competitive advantage, and develop insights that improve economic performance and differentiate the organization.
Energy load forecasting, which predicts energy demand, illustrates the workflow for developing a predictive analytics model. The workflow draws on vast amounts of data from energy producers, grid operators, and energy traders, along with historical load data. The data is first cleaned to remove outliers, data spikes, missing data, and anomalous points, producing a unified data set of energy load, temperature, and dew point. The cleaned data is then used to build and train a predictive model, for example a neural network, using statistics, curve fitting tools, or machine learning. New data can then be used to test how well the model performs. Finally, the model can be integrated into a load forecasting system to assess its accuracy in a production environment and, once tuned, used in production forecasting systems.
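The clean, train, and test steps of this workflow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production forecasting method: the records are synthetic, only temperature is used as a predictor (dew point is omitted), a straight-line least squares fit stands in for the statistical or machine learning model, and the spike rule (five median absolute deviations) and all function names are hypothetical choices made for this example.

```python
import statistics

def clean(records, mad_factor=5.0):
    """Drop records with missing values, then drop load spikes.

    A spike is a load further than `mad_factor` median absolute
    deviations from the median load (an illustrative rule).
    """
    complete = [(t, y) for t, y in records if t is not None and y is not None]
    loads = [y for _, y in complete]
    median = statistics.median(loads)
    mad = statistics.median([abs(y - median) for y in loads])
    return [(t, y) for t, y in complete if abs(y - median) <= mad_factor * mad]

def fit_line(points):
    """Ordinary least squares fit of load = intercept + slope * temperature."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def mape(points, slope, intercept):
    """Mean absolute percentage error of the fitted line on `points`."""
    errors = [abs(y - (intercept + slope * x)) / y for x, y in points]
    return 100 * statistics.fmean(errors)

# Synthetic records: (temperature in deg C, load in MW). Assumed data.
records = []
for i in range(60):
    temperature = 10 + i % 25
    noise = ((i * 37) % 11 - 5) * 0.3    # deterministic stand-in for noise
    records.append((temperature, 50 + 2 * temperature + noise))
records[7] = (records[7][0], None)       # a missing reading
records[20] = (records[20][0], 900.0)    # a data spike

cleaned = clean(records)
split = int(0.8 * len(cleaned))          # earlier data trains, later data tests
train, holdout = cleaned[:split], cleaned[split:]
slope, intercept = fit_line(train)
test_error = mape(holdout, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} test MAPE={test_error:.2f}%")
```

Holding out the most recent records, rather than a random sample, mirrors how a forecasting model is used in practice: it is always asked about data that arrives after training.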
Predictive analytics is not a monolith; different models are developed for specific functions. These include forecast models, classification models, outlier models, time series models, and clustering models.
Predictive analytic model types
Predictive models are used for all kinds of applications, including weather forecasting, creating challenging and engaging video games, and translating voice to text for mobile phone messaging. These applications use statistical models of existing data to make predictions about future data. Descriptive models determine relationships, patterns, and structures in data, which can be used to draw conclusions about how changes in the underlying processes that generate the data will change the results. Predictive models build on these descriptive models to determine the likelihood of certain future outcomes given current conditions or a set of expected future conditions. Predictive models have been used in industries such as finance, healthcare, pharmaceuticals, automotive, aerospace, and manufacturing.
Predictive analytics use cases
Predictive analytics and its applications have in some cases been criticized and legally restricted, largely where their use has resulted in perceived inequities in outcomes. This has most commonly occurred when predictive models have resulted in statistical discrimination against racial or ethnic groups in areas such as credit scoring, home lending, employment, or assessing the risk of criminal behavior. A famous, and illegal, example is the practice of redlining in home lending by banks. Because of incidents and practices such as redlining, information such as a person's race is now often excluded from predictive analytics models.
Much of medicine and healthcare is about anticipating and reducing risk based on current and historical patient data. Clinicians must make decisions without absolute certainty, but advances in predictive analytics in healthcare promise to make those decisions better informed than before. Built upon the growing sophistication of big data analytics, predictive analytics can use patterns in historical data to predict outcomes, allowing clinicians to alert patients to possible health concerns in the near future. This can be important in settings such as intensive care, surgery, or emergency care, where quick reactions and sensitivity to signs that something is wrong can save lives.
Predictive analytics can help answer questions about the best treatment for a patient, the likelihood that a patient will experience adverse events following a given procedure, and the likelihood that the patient has a given disease. This can be used at various points in the patient journey:
- Diagnosis—predictive analytics have been used to predict malignant mesothelioma diagnosis in a patient cohort. Patients diagnosed early can start treatment sooner and improve their overall chances for survival.
- Prognosis—researchers have used predictive analytics on physiological data from patients with congestive heart failure to predict which patients were at greatest risk of readmission following a hospital stay. Using that information, physicians could implement interventions early to prevent the predicted readmissions.
- Treatment—clinicians have used machine learning-based predictive analytical models to determine the most effective course of treatment for chronic pain patients.
Using artificial intelligence and machine learning, predictive models can take in huge amounts of diverse data for a patient and forecast the patient's response to certain treatments or devices, their risk of developing a specific disease, and their prognosis for a given condition. Predictive analytics also offers a chance to develop personalized healthcare, in which a patient's treatment is developed from the individual's medical history, environment, social risk factors, genetics, and unique biochemistry, among other characteristics. The key to personalized healthcare is treating patients based on their specific attributes instead of relying on population averages that do not serve all patients, which can improve overall patient care.
Furthermore, once a patient is being treated, predictive analytics can alert clinicians and caregivers to the likelihood of events and outcomes before they occur, helping healthcare professionals prevent health issues as well as cure them. Driven by artificial intelligence and by data from Internet of Things (IoT) devices that monitor patients, algorithms fed with historical and real-time data can make meaningful predictions. Such predictive algorithms can be used to support clinical decision making for an individual patient and to inform interventions at a cohort or population level. They can also be applied to hospitals' operational and administrative challenges.
Ways predictive analytics is being used in healthcare
As AI and machine learning have been developed for predictive analytics in healthcare, studies have used predictive analytics to explore possible applications. Some of these were done during the COVID-19 pandemic, including:
- COViage, a software prediction system, assessing whether hospitalized COVID-19 patients are at a high risk of needing intubation.
- CLEWICU System, a prediction software that works to identify which ICU COVID-19 patients are at risk for respiratory failure or low blood pressure.
- Mount Sinai Health System's AI model that analyzes computed tomography (CT) scans of the chest with patient data to rapidly detect COVID-19.
- Researchers at the University of Minnesota, along with Epic Systems and M Health Fairview, developed an AI tool capable of evaluating chest x-rays to diagnose a possible case of COVID-19.
Before COVID-19, other uses of predictive analytics in healthcare included:
- The University of Pennsylvania has developed a predictive analytics tool that uses machine learning and EHR data to identify patients on track for severe sepsis or septic shock twelve hours before the onset of the condition.
- A predictive model developed in a study by Duke University found that clinic-level data could capture an additional 4,800 patient no-shows per year with higher accuracy than previous attempts to forecast patient patterns.
- UnityPoint Health, a network of healthcare facilities, aggregated answers to a questionnaire about why patients were being readmitted to develop a predictive model that was able to assign a readmission risk to every visiting patient.
- Diabetes Care published a study demonstrating that predictive analytics models for healthcare can determine five- to ten-year life expectancy for older adults with diabetes, allowing doctors to craft treatment plans for individual patients.
- A research team at Vanderbilt University Medical Center (VUMC) developed a predictive analytics model using patients' EHR data to forecast the likelihood of suicide attempts by particular patients. Over an eleven-month testing period, the patients were classified into eight groups based on their risk, of which the highest-risk group accounted for over 33 percent of suicide attempts.
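Several of the examples above rank patients by a predicted risk and then act on the highest-risk group. The following sketch shows one way such a stratification could be computed. It is an illustration only and does not reproduce the method of any study mentioned here: the scores and outcomes are made-up values, and the function names and the equal-sized grouping by score rank are assumptions.

```python
def stratify(scores, n_groups=8):
    """Assign each patient to a risk group by score rank (0 = lowest risk)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    groups = [0] * len(scores)
    for rank, index in enumerate(order):
        groups[index] = rank * n_groups // len(scores)
    return groups

def event_share_by_group(groups, events, n_groups=8):
    """Fraction of all observed events that fall in each risk group."""
    counts = [0] * n_groups
    for group, event in zip(groups, events):
        counts[group] += event
    total = sum(events)
    return [count / total for count in counts]

# Hypothetical predicted risk scores and observed outcomes (1 = event).
scores = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20, 0.22,
          0.30, 0.35, 0.40, 0.45, 0.55, 0.60, 0.80, 0.90]
events = [0, 0, 0, 0, 0, 0, 0, 1,
          0, 0, 1, 0, 1, 0, 1, 1]

groups = stratify(scores)
shares = event_share_by_group(groups, events)
print(f"highest-risk group holds {shares[-1]:.0%} of events")
```

A model is useful for targeting interventions when events concentrate in the top groups, since a small group of patients then accounts for a large share of outcomes.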
The FDA pathway for medical devices that use artificial intelligence (AI) and machine learning (ML) for medical decision making and data analysis is stringent, with difficult regulatory requirements for medical device licensing. The process is rigorous, time-consuming, and resource-intensive, and has been considered a pivotal barrier to the introduction of AI and ML in medicine. Before any medical hardware or software is made legally available, the parent company has to submit it to the FDA for evaluation. For medically oriented AI and ML-based algorithms, the regulatory body has three levels of clearance: 510(k), premarket approval, and the de novo pathway, each of which has specific required criteria.
Types of FDA approvals for AI/ML-based medical technology
When a company updates the algorithm in a product, the FDA considers the update a new product and requires it to receive approval in the same way as the original product. The FDA has recognized that this process might be impossible to maintain, so it has begun to consider a total product lifecycle-based regulatory framework. This framework would allow modifications to be made from real-world learning and adaptation, while still ensuring that the safety and effectiveness of the software as a medical device are maintained.
While it considers changing the process for approving medical devices that use predictive analytics, the FDA has cleared or approved several medical devices using "locked" algorithms. A "locked" algorithm is defined as one that provides the same result each time the same input is applied to it and does not change. The promise of medical devices using algorithms, however, is that they will adapt over time to improve in accuracy and expand in potential applications. The FDA describes these as adaptive algorithms, and there is no regulatory framework designed for them.
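The difference between a locked and an adaptive algorithm can be shown with a toy example. The sketch below is only an illustration of the distinction described above: the class names, the single-threshold rule, and the update rule (moving the threshold to the midpoint of the class means) are hypothetical and do not represent any cleared device or FDA definition.

```python
class LockedThresholdModel:
    """Fixed threshold: the same input always yields the same output."""

    def __init__(self, threshold):
        self._threshold = threshold

    def predict(self, value):
        return value >= self._threshold


class AdaptiveThresholdModel:
    """Threshold is re-estimated as labeled examples arrive."""

    def __init__(self, threshold):
        self.threshold = threshold
        self._sums = {True: 0.0, False: 0.0}
        self._counts = {True: 0, False: 0}

    def predict(self, value):
        return value >= self.threshold

    def learn(self, value, label):
        """Move the threshold to the midpoint between the class means."""
        self._sums[label] += value
        self._counts[label] += 1
        if self._counts[True] and self._counts[False]:
            mean_positive = self._sums[True] / self._counts[True]
            mean_negative = self._sums[False] / self._counts[False]
            self.threshold = (mean_positive + mean_negative) / 2


locked = LockedThresholdModel(5.0)
adaptive = AdaptiveThresholdModel(5.0)

locked_before = locked.predict(4.5)
adaptive_before = adaptive.predict(4.5)

# Hypothetical labeled readings observed after deployment.
for value, label in [(4.0, True), (2.0, False), (4.5, True), (1.5, False)]:
    adaptive.learn(value, label)

locked_after = locked.predict(4.5)
adaptive_after = adaptive.predict(4.5)
print(locked_before, locked_after, adaptive_before, adaptive_after)
```

The locked model gives the same answer for the input 4.5 before and after the new readings, while the adaptive model's answer changes because its threshold has moved. That change in behavior after release is what a review of a single fixed version cannot cover.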
One attempt to address the lack of a framework for adaptive algorithms is the FDA's proposed regulatory framework from 2019, which elaborates on potential approaches to premarket review for AI and ML-based software modifications. The FDA has also recognized that adaptive algorithms require a total product lifecycle (TPLC) regulatory approach, enabling a rapid cycle of product improvement while maintaining effective safeguards. This TPLC approach is based on the Digital Health Software Precertification (Pre-Cert) Program, allowing for the evaluation of software as a medical device (SaMD) products throughout their lifecycle.