P-CAFE: Personalized Cost-Aware Incremental Feature Selection For Electronic Health Records
- URL: http://arxiv.org/abs/2508.08646v1
- Date: Tue, 12 Aug 2025 05:23:46 GMT
- Title: P-CAFE: Personalized Cost-Aware Incremental Feature Selection For Electronic Health Records
- Authors: Naama Kashani, Mira Cohen, Uri Shaham
- Abstract summary: We propose a novel personalized, online and cost-aware feature selection framework tailored specifically for EHR datasets. The framework is designed to effectively manage sparse and multimodal data, ensuring robust and scalable performance. A primary application of our proposed method is to support physicians' decision making in patient screening scenarios.
- Score: 3.870455775654713
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Electronic Health Records (EHR) have revolutionized healthcare by digitizing patient data, improving accessibility, and streamlining clinical workflows. However, extracting meaningful insights from these complex and multimodal datasets remains a significant challenge for researchers. Traditional feature selection methods often struggle with the inherent sparsity and heterogeneity of EHR data, especially when accounting for patient-specific variations and feature costs in clinical applications. To address these challenges, we propose a novel personalized, online and cost-aware feature selection framework tailored specifically for EHR datasets. The features are acquired in an online fashion for individual patients, incorporating budgetary constraints and feature variability costs. The framework is designed to effectively manage sparse and multimodal data, ensuring robust and scalable performance in diverse healthcare contexts. A primary application of our proposed method is to support physicians' decision making in patient screening scenarios. By guiding physicians toward incremental acquisition of the most informative features within budget constraints, our approach aims to increase diagnostic confidence while optimizing resource utilization.
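The abstract describes acquiring features one at a time for an individual patient while staying within a budget. The following is a minimal sketch of that general idea only, not the authors' method: it uses a simple greedy gain-per-cost rule, and the function `acquire_features`, the `gain` callback, the feature names, and all costs and gain values are hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple


def acquire_features(
    costs: Dict[str, float],
    budget: float,
    gain: Callable[[str, Set[str]], float],
) -> Tuple[List[str], float]:
    """Greedily acquire features by estimated gain per unit cost.

    Assumes every cost is strictly positive. `gain(f, have)` is a
    hypothetical estimate of how useful feature `f` is given the
    features already acquired for this patient.
    """
    acquired: List[str] = []
    spent = 0.0
    remaining = set(costs)
    while remaining:
        # Only consider features that still fit in the remaining budget.
        affordable = [f for f in remaining if spent + costs[f] <= budget]
        if not affordable:
            break
        have = set(acquired)
        # Pick the best gain/cost ratio; the name breaks ties deterministically.
        best = max(affordable, key=lambda f: (gain(f, have) / costs[f], f))
        if gain(best, have) <= 0:
            break  # nothing left is expected to help
        acquired.append(best)
        spent += costs[best]
        remaining.remove(best)
    return acquired, spent


# Toy, made-up numbers: cost of each feature and a base usefulness score.
costs = {"age": 1.0, "blood_test": 5.0, "mri": 50.0}
base_gain = {"age": 0.5, "blood_test": 2.0, "mri": 10.0}


def toy_gain(feature: str, have: Set[str]) -> float:
    # Assumed diminishing returns: each acquired feature lowers later gains.
    return base_gain[feature] / (1 + len(have))


selected, spent = acquire_features(costs, budget=10.0, gain=toy_gain)
print(selected, spent)
```

In a personalized setting, the gain estimate would depend on the values already observed for the specific patient, so two patients with the same budget could be routed to different features.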
Related papers
- Integrating Genomics into Multimodal EHR Foundation Models [56.31910745104141]
This paper introduces an innovative EHR foundation model that integrates Polygenic Risk Scores (PRS) as a foundational data modality. The framework aims to learn complex relationships between clinical data and genetic predispositions. This approach is pivotal for unlocking new insights into disease prediction, proactive health management, risk stratification, and personalized treatment strategies.
arXiv Detail & Related papers (2025-10-24T15:56:40Z) - Extracting OPQRST in Electronic Health Records using Large Language Models with Reasoning [3.486461799078777]
This paper introduces a novel approach to extracting the OPQRST assessment from EHRs by leveraging the capabilities of Large Language Models (LLMs). We propose to reframe the task from sequence labeling to text generation, enabling the models to provide reasoning steps that mimic a physician's cognitive processes. Our contributions demonstrate a significant advancement in the use of AI in healthcare, offering a scalable solution that improves the accuracy and usability of information extraction from EHRs.
arXiv Detail & Related papers (2025-09-02T02:21:02Z) - MoE-Health: A Mixture of Experts Framework for Robust Multimodal Healthcare Prediction [9.073167371102386]
MoE-Health is a novel Mixture of Experts framework designed for robust multimodal fusion in healthcare prediction. We evaluate MoE-Health on the MIMIC-IV dataset across three critical clinical prediction tasks: in-hospital mortality prediction, long length of stay, and hospital readmission prediction.
arXiv Detail & Related papers (2025-08-29T17:17:11Z) - Mitigating Clinician Information Overload: Generative AI for Integrated EHR and RPM Data Analysis [0.523377539745706]
We present a comprehensive overview of the capabilities, requirements and applications of Generative Artificial Intelligence (GenAI). We first provide some background on the forms and sources of patient data, namely real-time Remote Patient Monitoring (RPM) streams and traditional Electronic Health Records (EHRs). These applications can enhance navigation of longitudinal patient data and provide actionable clinical decision support through natural language dialogue.
arXiv Detail & Related papers (2025-08-26T17:10:21Z) - DeepSelective: Interpretable Prognosis Prediction via Feature Selection and Compression in EHR Data [26.378114734793492]
We propose DeepSelective, a novel end-to-end deep learning framework for predicting patient prognosis using EHR data. DeepSelective combines data compression techniques with an innovative feature selection approach, integrating custom-designed modules. Our experiments demonstrate that DeepSelective not only enhances predictive accuracy but also significantly improves interpretability, making it a valuable tool for clinical decision-making.
arXiv Detail & Related papers (2025-04-15T15:04:39Z) - Healthcare cost prediction for heterogeneous patient profiles using deep learning models with administrative claims data [0.0]
This study is grounded in socio-technical considerations that emphasize the interplay between technical systems and humanistic outcomes. We propose a channel-wise deep learning framework that mitigates data heterogeneity by segmenting AC data into separate channels. The proposed channel-wise models reduce prediction errors by 23% compared to single-channel models, leading to 16.4% and 19.3% reductions in overpayments and underpayments.
arXiv Detail & Related papers (2025-02-17T19:20:41Z) - Speaking the Same Language: Leveraging LLMs in Standardizing Clinical Data for AI [0.0]
This study delves into the adoption of large language models to address specific challenges, specifically, the standardization of healthcare data.
Our results illustrate that employing large language models significantly diminishes the necessity for manual data curation.
The proposed methodology has the propensity to expedite the integration of AI in healthcare, ameliorate the quality of patient care, whilst minimizing the time and financial resources necessary for the preparation of data for AI.
arXiv Detail & Related papers (2024-08-16T20:51:21Z) - Investigation Toward The Economic Feasibility of Personalized Medicine For Healthcare Service Providers: The Case of Bladder Cancer [0.0]
We investigate the economic feasibility of implementing personalized medicine.
Unlike conventional binary approaches to personalized treatment, we propose a more nuanced perspective by treating personalization as a spectrum.
Our results show that while it is feasible to introduce personalized medicine, a highly efficient but highly expensive one would be short-lived.
arXiv Detail & Related papers (2023-08-15T17:59:46Z) - Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM).
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z) - SPeC: A Soft Prompt-Based Calibration on Performance Variability of Large Language Model in Clinical Notes Summarization [50.01382938451978]
We introduce a model-agnostic pipeline that employs soft prompts to diminish variance while preserving the advantages of prompt-based summarization.
Experimental findings indicate that our method not only bolsters performance but also effectively curbs variance for various language models.
arXiv Detail & Related papers (2023-03-23T04:47:46Z) - Optimal discharge of patients from intensive care via a data-driven policy learning framework [58.720142291102135]
It is important that the patient discharge task addresses the nuanced trade-off between decreasing a patient's length of stay and the risk of readmission or even death following the discharge decision.
This work introduces an end-to-end general framework for capturing this trade-off to recommend optimal discharge timing decisions.
A data-driven approach is used to derive a parsimonious, discrete state space representation that captures a patient's physiological condition.
arXiv Detail & Related papers (2021-12-17T04:39:33Z) - SANSformers: Self-Supervised Forecasting in Electronic Health Records with Attention-Free Models [48.07469930813923]
This work aims to forecast the demand for healthcare services, by predicting the number of patient visits to healthcare facilities.
We introduce SANSformer, an attention-free sequential model designed with specific inductive biases to cater for the unique characteristics of EHR data.
Our results illuminate the promising potential of tailored attention-free models and self-supervised pretraining in refining healthcare utilization predictions across various patient demographics.
arXiv Detail & Related papers (2021-08-31T08:23:56Z) - The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation [81.72197368690031]
We present a new benchmarking suite designed specifically for medical sequential decision making.
The Medkit-Learn(ing) Environment is a publicly available Python package providing simple and easy access to high-fidelity synthetic medical data.
arXiv Detail & Related papers (2021-06-08T10:38:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.