Access to care improves EHR reliability and clinical risk prediction model performance
- URL: http://arxiv.org/abs/2412.07712v2
- Date: Fri, 13 Dec 2024 22:46:36 GMT
- Title: Access to care improves EHR reliability and clinical risk prediction model performance
- Authors: Anna Zink, Hongzhou Luan, Irene Y. Chen
- Abstract summary: Using an All of Us dataset of 134,513 participants, we investigate the effects of access to care on the medical machine learning pipeline. Our findings reveal that patients with cost-constrained or delayed care have worse EHR reliability as measured by patient self-reported conditions. These findings provide the first large-scale evidence that healthcare access systematically affects both data reliability and clinical prediction performance.
- Score: 1.5020330976600735
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Disparities in access to healthcare have been well-documented in the United States, but their effects on electronic health record (EHR) data reliability and resulting clinical models are poorly understood. Using an All of Us dataset of 134,513 participants, we investigate the effects of access to care on the medical machine learning pipeline, including medical condition rates, data quality, outcome label accuracy, and prediction performance. Our findings reveal that patients with cost-constrained or delayed care have worse EHR reliability as measured by patient self-reported conditions for 78% of examined medical conditions. We demonstrate in a prediction task of Type II diabetes incidence that clinical risk predictive performance can be worse for patients without standard care, with balanced accuracy gaps of 3.6 and sensitivity gaps of 9.4 percentage points for those with cost-constrained or delayed care. We evaluate solutions to mitigate these disparities and find that including patient self-reported conditions improved performance for patients with lower access to care, with 11.2 percentage points higher sensitivity, effectively decreasing the performance gap between standard versus delayed or cost-constrained care. These findings provide the first large-scale evidence that healthcare access systematically affects both data reliability and clinical prediction performance. By revealing how access barriers propagate through the medical machine learning pipeline, our work suggests that improving model equity requires addressing both data collection biases and algorithmic limitations. More broadly, this analysis provides an empirical foundation for developing clinical prediction systems that work effectively for all patients, regardless of their access to care.
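The abstract reports performance gaps between access-to-care groups in percentage points of sensitivity and balanced accuracy. As a minimal sketch of how such a gap could be computed, the example below uses the standard definitions (sensitivity = TP / (TP + FN), balanced accuracy = mean of sensitivity and specificity). The function names, group labels, and toy labels and predictions are hypothetical illustrations, not the paper's data, code, or exact evaluation protocol.

```python
from typing import Sequence


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True positive rate: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no positive examples in y_true")
    return tp / (tp + fn)


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True negative rate: TN / (TN + FP)."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tn + fp == 0:
        raise ValueError("no negative examples in y_true")
    return tn / (tn + fp)


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity."""
    return 0.5 * (sensitivity(y_true, y_pred) + specificity(y_true, y_pred))


def gap_pp(reference: float, comparison: float) -> float:
    """Gap between two rates, expressed in percentage points."""
    return 100.0 * (reference - comparison)


# Toy data (hypothetical): 1 = incident diabetes, 0 = no diabetes.
standard_true = [1, 1, 1, 1, 0, 0, 0, 0]
standard_pred = [1, 1, 1, 0, 0, 0, 0, 1]
limited_true = [1, 1, 1, 1, 0, 0, 0, 0]
limited_pred = [1, 1, 0, 0, 0, 0, 0, 1]

sens_gap = gap_pp(
    sensitivity(standard_true, standard_pred),
    sensitivity(limited_true, limited_pred),
)
bacc_gap = gap_pp(
    balanced_accuracy(standard_true, standard_pred),
    balanced_accuracy(limited_true, limited_pred),
)
print(f"sensitivity gap: {sens_gap:.1f} pp")
print(f"balanced accuracy gap: {bacc_gap:.1f} pp")
```

A positive gap here means the model performs better for the reference group (standard care) than for the comparison group (cost-constrained or delayed care).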
Related papers
- Revealing Treatment Non-Adherence Bias in Clinical Machine Learning Using Large Language Models [3.452725432283546]
We investigate how treatment non-adherence introduces implicit bias that can distort both causal inference and predictive modeling.
Our findings demonstrate that this implicit bias can not only reverse estimated treatment effects, but also degrade model performance by up to 5%.
This highlights the importance of accounting for treatment non-adherence in developing responsible and equitable clinical machine learning systems.
arXiv Detail & Related papers (2025-02-26T23:30:55Z) - Healthcare cost prediction for heterogeneous patient profiles using deep learning models with administrative claims data [0.0]
This study is grounded in socio-technical considerations that emphasize the interplay between technical systems and humanistic outcomes.
We propose a channel-wise deep learning framework that mitigates data heterogeneity by segmenting AC data into separate channels.
The proposed channel-wise models reduce prediction errors by 23% compared to single-channel models, leading to 16.4% and 19.3% reductions in overpayments and underpayments.
arXiv Detail & Related papers (2025-02-17T19:20:41Z) - Primary Care Diagnoses as a Reliable Predictor for Orthopedic Surgical Interventions [0.10624941710159722]
Referral workflow inefficiencies contribute to suboptimal patient outcomes and higher healthcare costs.
In this study, we investigated the possibility of predicting procedural needs based on primary care diagnostic entries.
arXiv Detail & Related papers (2025-02-06T17:15:12Z) - MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z) - Forecasting Treatment Response with Deep Pharmacokinetic Encoders [14.900236106367167]
We propose a novel hybrid global-local architecture and a PK encoder that informs deep learning models of patient-specific treatment effects.
We showcase the efficacy of our approach in achieving significant accuracy gains for a blood glucose forecasting task.
arXiv Detail & Related papers (2023-09-22T18:43:41Z) - PRISM: Leveraging Prototype Patient Representations with Feature-Missing-Aware Calibration for EHR Data Sparsity Mitigation [7.075420686441701]
PRISM is a framework that indirectly imputes data by leveraging prototype representations of similar patients.
PRISM also includes a feature confidence module, which evaluates the reliability of each feature considering missing statuses.
arXiv Detail & Related papers (2023-09-08T07:01:38Z) - FineEHR: Refine Clinical Note Representations to Improve Mortality Prediction [3.9026461169566673]
Large-scale electronic health records provide machine learning models with an abundance of clinical text and vital sign data.
Despite the emergence of advanced Natural Language Processing (NLP) algorithms for clinical note analysis, the complex textual structure and noise present in raw clinical data have posed significant challenges.
We propose FINEEHR, a system that utilizes two representation learning techniques, namely metric learning and fine-tuning, to refine clinical note embeddings.
arXiv Detail & Related papers (2023-04-24T02:42:52Z) - Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM).
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z) - Towards Reliable Medical Image Segmentation by utilizing Evidential Calibrated Uncertainty [52.03490691733464]
We introduce DEviS, an easily implementable foundational model that seamlessly integrates into various medical image segmentation networks.
By leveraging subjective logic theory, we explicitly model probability and uncertainty for the problem of medical image segmentation.
DEviS incorporates an uncertainty-aware filtering module, which utilizes the metric of uncertainty-calibrated error to filter reliable data.
arXiv Detail & Related papers (2023-01-01T05:02:46Z) - Predicting Patient Readmission Risk from Medical Text via Knowledge Graph Enhanced Multiview Graph Convolution [67.72545656557858]
We propose a new method that uses medical text of Electronic Health Records for prediction.
We represent discharge summaries of patients with multiview graphs enhanced by an external knowledge graph.
Experimental results prove the effectiveness of our method, yielding state-of-the-art performance.
arXiv Detail & Related papers (2021-12-19T01:45:57Z) - Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z) - UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, and outperforms various state-of-the-art baselines, by up to 19% over the best baseline.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.