Mixed-Integer Projections for Automated Data Correction of EMRs Improve
Predictions of Sepsis among Hospitalized Patients
- URL: http://arxiv.org/abs/2308.10781v1
- Date: Mon, 21 Aug 2023 15:14:49 GMT
- Title: Mixed-Integer Projections for Automated Data Correction of EMRs Improve
Predictions of Sepsis among Hospitalized Patients
- Authors: Mehak Arora, Hassan Mortagy, Nathan Dwarshius, Swati Gupta, Andre L.
Holder, Rishikesan Kamaleswaran
- Abstract summary: We introduce an innovative projections-based method that seamlessly integrates clinical expertise as domain constraints.
We measure the distance of corrected data from the constraints defining a healthy range of patient data, resulting in a unique predictive metric we term as "trust-scores"
We show an AUROC of 0.865 and a precision of 0.922, that surpasses conventional ML models without such projections.
- Score: 7.639610349097473
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) models are increasingly pivotal in automating clinical
decisions. Yet, a glaring oversight in prior research has been the lack of
proper processing of Electronic Medical Record (EMR) data in the clinical
context for errors and outliers. Addressing this oversight, we introduce an
innovative projections-based method that seamlessly integrates clinical
expertise as domain constraints, generating important meta-data that can be
used in ML workflows. In particular, by using high-dimensional mixed-integer
programs that capture physiological and biological constraints on patient
vitals and lab values, we can harness the power of mathematical "projections"
for the EMR data to correct patient data. Consequently, we measure the distance
of corrected data from the constraints defining a healthy range of patient
data, resulting in a unique predictive metric we term as "trust-scores". These
scores provide insight into the patient's health status and significantly boost
the performance of ML classifiers in real-life clinical settings. We validate
the impact of our framework in the context of early detection of sepsis using
ML. We show an AUROC of 0.865 and a precision of 0.922, that surpasses
conventional ML models without such projections.
Related papers
- Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options.
The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z) - Multimodal Pretraining of Medical Time Series and Notes [45.89025874396911]
Deep learning models show promise in extracting meaningful patterns, but they require extensive labeled data.
We propose a novel approach employing self-supervised pretraining, focusing on the alignment of clinical measurements and notes.
In downstream tasks, including in-hospital mortality prediction and phenotyping, our model outperforms baselines in settings where only a fraction of the data is labeled.
arXiv Detail & Related papers (2023-12-11T21:53:40Z) - MedLens: Improve Mortality Prediction Via Medical Signs Selecting and
Regression [4.43322868663347]
Data-quality problem of original clinical signs is less discussed in the literature.
We designed MEDLENS, with an automatic vital medical signs selection approach via statistics and a flexible approach for high missing rate time series.
It achieves a very high accuracy performance of 0.96 AUC-ROC and 0.81 AUC-PR, which exceeds the previous benchmark.
arXiv Detail & Related papers (2023-05-19T15:28:02Z) - FineEHR: Refine Clinical Note Representations to Improve Mortality
Prediction [3.9026461169566673]
Large-scale electronic health records provide machine learning models with an abundance of clinical text and vital sign data.
Despite the emergence of advanced Natural Language Processing (NLP) algorithms for clinical note analysis, the complex textual structure and noise present in raw clinical data have posed significant challenges.
We propose FINEEHR, a system that utilizes two representation learning techniques, namely metric learning and fine-tuning, to refine clinical note embeddings.
arXiv Detail & Related papers (2023-04-24T02:42:52Z) - Large Language Models for Healthcare Data Augmentation: An Example on
Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM)
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z) - Clinical Deterioration Prediction in Brazilian Hospitals Based on
Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD)
The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z) - Benchmarking Heterogeneous Treatment Effect Models through the Lens of
Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z) - Modeling Disagreement in Automatic Data Labelling for Semi-Supervised
Learning in Clinical Natural Language Processing [2.016042047576802]
We investigate the quality of uncertainty estimates from a range of current state-of-the-art predictive models applied to the problem of observation detection in radiology reports.
arXiv Detail & Related papers (2022-05-29T20:20:49Z) - Bootstrapping Your Own Positive Sample: Contrastive Learning With
Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z) - UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced
Data [81.00385374948125]
We present UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD)
UNITE achieves up to 0.841 in F1 score for AD detection, up to 0.609 in PR-AUC for NASH detection, and outperforms various state-of-the-art baselines by up to $19%$ over the best baseline.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.