Introducing the Large Medical Model: State of the art healthcare cost and risk prediction with transformers trained on patient event sequences
- URL: http://arxiv.org/abs/2409.13000v1
- Date: Thu, 19 Sep 2024 15:38:21 GMT
- Title: Introducing the Large Medical Model: State of the art healthcare cost and risk prediction with transformers trained on patient event sequences
- Authors: Ricky Sahu, Eric Marriott, Ethan Siegel, David Wagner, Flore Uzan, Troy Yang, Asim Javed,
- Abstract summary: The Large Medical Model (LMM) is a generative pre-trained transformer (GPT) designed to guide and predict the broad facets of patient care and healthcare administration.
The model is trained on medical event sequences from over 140M longitudinal patient claims records with a specialized vocabulary built from medical terminology systems.
The LMM is able to improve both cost prediction by 14.1% over the best commercial models and chronic conditions prediction by 1.9% over the best transformer models in research predicting a broad set of conditions.
- Score: 0.47901560316389713
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With U.S. healthcare spending approaching $5T (NHE Fact Sheet 2024), and 25% of it estimated to be wasteful (Waste in the US health care system: estimated costs and potential for savings, n.d.), the need to better predict risk and optimal patient care is ever more important. This paper introduces the Large Medical Model (LMM), a generative pre-trained transformer (GPT) designed to guide and predict the broad facets of patient care and healthcare administration. The model is trained on medical event sequences from over 140M longitudinal patient claims records with a specialized vocabulary built from medical terminology systems and demonstrates a superior capability to forecast healthcare costs and identify potential risk factors. Through experimentation and validation, we showcase the LMM's proficiency not only in cost and risk predictions, but also in discerning intricate patterns within complex medical conditions and an ability to identify novel relationships in patient care. The LMM is able to improve both cost prediction by 14.1% over the best commercial models and chronic conditions prediction by 1.9% over the best transformer models in research predicting a broad set of conditions. The LMM is a substantial advancement in healthcare analytics, offering the potential to significantly enhance risk assessment, cost management, and personalized medicine.
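The abstract describes training a GPT-style model on medical event sequences with a vocabulary built from medical terminology systems. The following is a minimal sketch of that general idea only, not the authors' pipeline: the `(system, code)` event format, the special tokens, and the helper names `build_vocab`, `encode`, and `next_token_pairs` are all assumptions made for illustration, and the codes in the usage note are illustrative examples.

```python
from typing import Dict, List, Tuple

# Hypothetical event format: (terminology system, code).
# Real claims records carry far more detail (dates, costs, providers).
Event = Tuple[str, str]

# Assumed special tokens; the paper's actual vocabulary is not given here.
SPECIAL_TOKENS = ["<pad>", "<unk>", "<bos>", "<eos>"]


def build_vocab(patients: List[List[Event]]) -> Dict[str, int]:
    """Assign one integer id per distinct 'SYSTEM:CODE' token."""
    vocab = {tok: i for i, tok in enumerate(SPECIAL_TOKENS)}
    for events in patients:
        for system, code in events:
            token = f"{system}:{code}"
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def encode(events: List[Event], vocab: Dict[str, int]) -> List[int]:
    """Turn one patient's ordered events into a token id sequence."""
    unk = vocab["<unk>"]
    ids = [vocab["<bos>"]]
    ids += [vocab.get(f"{system}:{code}", unk) for system, code in events]
    ids.append(vocab["<eos>"])
    return ids


def next_token_pairs(ids: List[int]) -> List[Tuple[List[int], int]]:
    """Build (context, target) pairs for a next-event prediction objective."""
    return [(ids[:i], ids[i]) for i in range(1, len(ids))]
```

For example, a patient with events `[("ICD10", "E11.9"), ("CPT", "99213")]` would be encoded as a begin token, two code tokens, and an end token; a generative model would then be trained to predict each token from the ones before it.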
Related papers
- Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language models (LLMs) reasoning.
Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z)
- MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
- Building predictive models of healthcare costs with open healthcare data [0.0]
We present an approach to developing a predictive model using machine-learning techniques.
We analyzed de-identified patient data from New York State, consisting of 2.3 million records in 2016.
We built models to predict costs from patient diagnoses and demographics.
arXiv Detail & Related papers (2023-04-05T02:12:58Z)
- Foresight -- Deep Generative Modelling of Patient Timelines using Electronic Health Records [46.024501445093755]
Temporal modelling of medical history can be used to forecast and simulate future events, estimate risk, suggest alternative diagnoses or forecast complications.
We present Foresight, a novel GPT3-based pipeline that uses NER+L tools (i.e. MedCAT) to convert document text into structured, coded concepts.
arXiv Detail & Related papers (2022-12-13T19:06:00Z)
- Predicting Treatment Adherence of Tuberculosis Patients at Scale [0.6873562466909032]
Non-adherence to TB medication is a significant cause of mortality and morbidity.
We formulate and solve the machine learning problem of early prediction of non-adherence based on a custom rank-based metric.
Our findings indicate that risk stratification of non-adherent patients is a viable, deployable-at-scale ML solution.
arXiv Detail & Related papers (2022-11-05T17:00:21Z)
- Advances in Prediction of Readmission Rates Using Long Term Short Term Memory Networks on Healthcare Insurance Data [1.454498931674109]
30-day hospital readmission is a long standing medical problem that affects patients' morbidity and mortality and costs billions of dollars annually.
We developed a bi-directional Long Short Term Memory (LSTM) Network that is able to use readily available insurance data.
Our results demonstrate that a machine learning model is able to predict risk of inpatient readmission with reasonable accuracy for all patients.
arXiv Detail & Related papers (2022-06-30T19:07:10Z)
- Predicting Patient Readmission Risk from Medical Text via Knowledge Graph Enhanced Multiview Graph Convolution [67.72545656557858]
We propose a new method that uses medical text of Electronic Health Records for prediction.
We represent discharge summaries of patients with multiview graphs enhanced by an external knowledge graph.
Experimental results prove the effectiveness of our method, yielding state-of-the-art performance.
arXiv Detail & Related papers (2021-12-19T01:45:57Z)
- AttDMM: An Attentive Deep Markov Model for Risk Scoring in Intensive Care Units [20.96242356493069]
We propose a novel generative deep probabilistic model for real-time risk scoring in ICUs.
To the best of our knowledge, AttDMM is the first ICU prediction model that jointly learns both long-term disease dynamics (via attention) and different disease states in health trajectory.
Our model shows a path towards identifying patients at risk so that health practitioners can intervene early and save patient lives.
arXiv Detail & Related papers (2021-02-09T08:44:31Z)
- Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, and outperforms various state-of-the-art baselines by up to 19% over the best baseline.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- Deep learning for prediction of population health costs [0.0]
We developed a deep neural network to predict future cost from health insurance claims records.
We applied the deep network and a ridge regression model to a sample of 1.4 million German insurants to predict total one-year health care costs.
arXiv Detail & Related papers (2020-03-06T23:33:39Z)