Related papers: Healthcare cost prediction for heterogeneous patient profiles using deep learning models with administrative claims data

URL: http://arxiv.org/abs/2502.12277v1
Date: Mon, 17 Feb 2025 19:20:41 GMT
Title: Healthcare cost prediction for heterogeneous patient profiles using deep learning models with administrative claims data
Authors: Mohammad Amin Morid, Olivia R. Liu Sheng
Abstract summary: This study is grounded in socio-technical considerations that emphasize the interplay between technical systems and humanistic outcomes. We propose a channel-wise deep learning framework that mitigates data heterogeneity by segmenting AC data into separate channels. The proposed channel-wise models reduce prediction errors by 23% compared to single-channel models, leading to 16.4% and 19.3% reductions in overpayments and underpayments, respectively.
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Problem: How can we design patient cost prediction models that effectively address the challenges of heterogeneity in administrative claims (AC) data to ensure accurate, fair, and generalizable predictions, especially for high-need (HN) patients with complex chronic conditions? Relevance: Accurate and equitable patient cost predictions are vital for developing health management policies and optimizing resource allocation, which can lead to significant cost savings for healthcare payers, including government agencies and private insurers. Addressing disparities in prediction outcomes for HN patients ensures better economic and clinical decision-making, benefiting both patients and payers. Methodology: This study is grounded in socio-technical considerations that emphasize the interplay between technical systems (e.g., deep learning models) and humanistic outcomes (e.g., fairness in healthcare decisions). It incorporates representation learning and entropy measurement to address heterogeneity and complexity in data and patient profiles, particularly for HN patients. We propose a channel-wise deep learning framework that mitigates data heterogeneity by segmenting AC data into separate channels based on types of codes (e.g., diagnosis, procedures) and costs. This approach is paired with a flexible evaluation design that uses multi-channel entropy measurement to assess patient heterogeneity. Results: The proposed channel-wise models reduce prediction errors by 23% compared to single-channel models, leading to 16.4% and 19.3% reductions in overpayments and underpayments, respectively. Notably, the reduction in prediction bias is significantly higher for HN patients, demonstrating effectiveness in handling heterogeneity and complexity in data and patient profiles. This demonstrates the potential for applying channel-wise modeling to domains with similar heterogeneity challenges.
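
The abstract mentions segmenting claims into channels by code type and using a multi-channel entropy measurement to assess patient heterogeneity, but it does not give the exact definitions. The following is only a minimal sketch under assumptions: it treats each channel as a list of codes and uses plain Shannon entropy of the empirical code distribution. The function names, the `(code_type, code)` record layout, and the toy codes are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(codes):
    """Shannon entropy (in bits) of the empirical distribution of codes."""
    counts = Counter(codes)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # sum of p * log2(1/p) over the distinct codes
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def split_into_channels(claims):
    """Group (code_type, code) records into one list of codes per channel.

    The record layout is an assumption made for this illustration.
    """
    channels = defaultdict(list)
    for code_type, code in claims:
        channels[code_type].append(code)
    return dict(channels)


def channel_entropies(claims):
    """Entropy of each channel, computed independently."""
    return {
        name: shannon_entropy(codes)
        for name, codes in split_into_channels(claims).items()
    }


# Hypothetical toy patient: two diagnosis codes seen equally often,
# and one procedure code repeated.
claims = [
    ("diagnosis", "E11"), ("diagnosis", "I10"),
    ("diagnosis", "E11"), ("diagnosis", "I10"),
    ("procedure", "99213"), ("procedure", "99213"),
]
entropies = channel_entropies(claims)
```

In this toy case the diagnosis channel has higher entropy than the procedure channel, which is the kind of per-channel signal a heterogeneity measure could use. How the paper combines channels or bins patients by entropy is not stated in the abstract, so it is left out here.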

Related papers

Semi-supervised Clustering Through Representation Learning of Large-scale EHR Data [5.591260685112265]
SCORE is a semi-supervised representation learning framework that captures multi-domain disease profiles through patient embeddings. To handle the computational challenges of large-scale data, it introduces a hybrid Expectation-Maximization (EM) and Gaussian Variational Approximation (GVA) algorithm. Our analysis shows that incorporating unlabeled data enhances accuracy and reduces sensitivity to label scarcity.
arXiv Detail & Related papers (2025-05-27T05:20:17Z)
Predictive and Prescriptive Analytics for Multi-Site Modeling of Frail and Elderly Patient Services [0.0]
The aim of this research is to assess how various predictive and prescriptive analytical methods contribute to addressing the operational challenges within an area of healthcare that is growing in demand. On the prescriptive side, deterministic and two-stage programs are developed to determine how to optimally plan for beds and ward staff. Our research reveals that healthcare managers should consider using predictive and prescriptive models to make more informed decisions.
arXiv Detail & Related papers (2023-11-13T12:25:45Z)
MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion. It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space. It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z)
Hypergraph Convolutional Networks for Fine-grained ICU Patient Similarity Analysis and Risk Prediction [15.06049250330114]
The Intensive Care Unit (ICU) is one of the most important parts of a hospital, which admits critically ill patients and provides continuous monitoring and treatment. Various patient outcome prediction methods have been attempted to assist healthcare professionals in clinical decision-making.
arXiv Detail & Related papers (2023-08-24T05:26:56Z)
TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment. In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials. We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
Building predictive models of healthcare costs with open healthcare data [0.0]
We present an approach to developing a predictive model using machine-learning techniques. We analyzed de-identified patient data from New York State, consisting of 2.3 million records in 2016. We built models to predict costs from patient diagnoses and demographics.
arXiv Detail & Related papers (2023-04-05T02:12:58Z)
Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM). Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
Predicting Visit Cost of Obstructive Sleep Apnea using Electronic Healthcare Records with Transformer [0.0]
Obstructive sleep apnea (OSA) is growing increasingly prevalent in many countries as obesity rises. For treatment purposes, predicting OSA patients' visit expenses for the coming year is crucial. Just a third of those data from OSA patients can be used to train analytics models.
arXiv Detail & Related papers (2023-01-28T20:08:00Z)
SANSformers: Self-Supervised Forecasting in Electronic Health Records with Attention-Free Models [48.07469930813923]
This work aims to forecast the demand for healthcare services, by predicting the number of patient visits to healthcare facilities. We introduce SANSformer, an attention-free sequential model designed with specific inductive biases to cater for the unique characteristics of EHR data. Our results illuminate the promising potential of tailored attention-free models and self-supervised pretraining in refining healthcare utilization predictions across various patient demographics.
arXiv Detail & Related papers (2021-08-31T08:23:56Z)
Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model. We introduce two unique positive sampling strategies specifically tailored for EHR data. Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks. Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets. We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)
Improving healthcare access management by predicting patient no-show behaviour [0.0]
This work develops a Decision Support System (DSS) to support the implementation of strategies to encourage attendance. We assess the effectiveness of different machine learning approaches to improve the accuracy of regression models. In addition to quantifying relationships reported in previous studies, we find that income and neighbourhood crime statistics affect no-show probabilities.
arXiv Detail & Related papers (2020-12-10T14:57:25Z)
UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model. UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data. We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD). UNITE achieves up to 0.841 in F1 score for AD detection, up to 0.609 in PR-AUC for NASH detection, and outperforms various state-of-the-art baselines by up to 19% over the best baseline.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.