Deep Learning Approach for Clinical Risk Identification Using Transformer Modeling of Heterogeneous EHR Data
- URL: http://arxiv.org/abs/2511.04158v1
- Date: Thu, 06 Nov 2025 08:02:21 GMT
- Title: Deep Learning Approach for Clinical Risk Identification Using Transformer Modeling of Heterogeneous EHR Data
- Authors: Anzhuo Xie, Wei-Chen Chang,
- Abstract summary: This study proposes a Transformer-based longitudinal modeling method to address challenges in clinical risk classification with heterogeneous EHR data.<n>The method takes multi-source medical features as input and employs a feature embedding layer to achieve a unified representation of structured and unstructured data.<n> Experimental results show that the proposed model outperforms traditional machine learning and temporal deep learning models in accuracy, recall, precision, and F1-Score.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study proposes a Transformer-based longitudinal modeling method to address challenges in clinical risk classification with heterogeneous Electronic Health Record (EHR) data, including irregular temporal patterns, large modality differences, and complex semantic structures. The method takes multi-source medical features as input and employs a feature embedding layer to achieve a unified representation of structured and unstructured data. A learnable temporal encoding mechanism is introduced to capture dynamic evolution under uneven sampling intervals. The core model adopts a multi-head self-attention structure to perform global dependency modeling on longitudinal sequences, enabling the aggregation of long-term trends and short-term fluctuations across different temporal scales. To enhance semantic representation, a semantic-weighted pooling module is designed to assign adaptive importance to key medical events, improving the discriminative ability of risk-related features. Finally, a linear mapping layer generates individual-level risk scores. Experimental results show that the proposed model outperforms traditional machine learning and temporal deep learning models in accuracy, recall, precision, and F1-Score, achieving stable and precise risk identification in multi-source heterogeneous EHR environments and providing an efficient and reliable framework for clinical intelligent decision-making.
Related papers
- A Semantically Enhanced Generative Foundation Model Improves Pathological Image Synthesis [82.01597026329158]
We introduce a Correlation-Regulated Alignment Framework for Tissue Synthesis (CRAFTS) for pathology-specific text-to-image synthesis.<n>CRAFTS incorporates a novel alignment mechanism that suppresses semantic drift to ensure biological accuracy.<n>This model generates diverse pathological images spanning 30 cancer types, with quality rigorously validated by objective metrics and pathologist evaluations.
arXiv Detail & Related papers (2025-12-15T10:22:43Z) - Machine Learning Approaches to Clinical Risk Prediction: Multi-Scale Temporal Alignment in Electronic Health Records [2.9576397177561087]
This study proposes a risk prediction method based on a Multi-Scale Temporal Alignment Network (MSTAN)<n>It addresses the challenges of temporal irregularity, sampling interval differences, and multi-scale dynamic dependencies in Electronic Health Records (EHR)<n> Experiments conducted on publicly available EHR datasets show that the proposed model outperforms mainstream baselines in accuracy, recall, precision, and F1-Score.
arXiv Detail & Related papers (2025-11-26T16:33:59Z) - Integrating Genomics into Multimodal EHR Foundation Models [56.31910745104141]
This paper introduces an innovative EHR foundation model that integrates Polygenic Risk Scores (PRS) as a foundational data modality.<n>The framework aims to learn complex relationships between clinical data and genetic predispositions.<n>This approach is pivotal for unlocking new insights into disease prediction, proactive health management, risk stratification, and personalized treatment strategies.
arXiv Detail & Related papers (2025-10-24T15:56:40Z) - Counterfactual Probabilistic Diffusion with Expert Models [44.96279296893773]
We propose a time series diffusion-based framework that incorporates guidance from imperfect expert models.<n>Our method, ODE-Diff, bridges mechanistic and data-driven approaches, enabling more reliable and interpretable causal inference.
arXiv Detail & Related papers (2025-08-18T20:44:32Z) - Deep Survival Analysis in Multimodal Medical Data: A Parametric and Probabilistic Approach with Competing Risks [47.19194118883552]
We introduce a multimodal deep learning framework for survival analysis capable of modeling both single and competing risks scenarios.<n>We propose SAMVAE (Survival Analysis Multimodal Variational Autoencoder), a novel deep learning architecture designed for survival prediction.
arXiv Detail & Related papers (2025-07-10T14:29:48Z) - Clinical NLP with Attention-Based Deep Learning for Multi-Disease Prediction [44.0876796031468]
This paper addresses the challenges posed by the unstructured nature and high-dimensional semantic complexity of electronic health record texts.<n>A deep learning method based on attention mechanisms is proposed to achieve unified modeling for information extraction and multi-label disease prediction.
arXiv Detail & Related papers (2025-07-02T07:45:22Z) - Multi-View Contrastive Learning for Robust Domain Adaptation in Medical Time Series Analysis [6.116163085721452]
Adapting machine learning models to medical time series remains a challenge due to complex temporal dependencies and dynamic distribution shifts.<n>We propose a novel framework leveraging multi-view contrastive learning to integrate temporal patterns, derivative-based dynamics, and frequency-domain features.<n>Our method employs independent encoders and a hierarchical fusion mechanism to learn feature-invariant representations that are transferable across domains.
arXiv Detail & Related papers (2025-06-27T17:06:16Z) - High-Fidelity Scientific Simulation Surrogates via Adaptive Implicit Neural Representations [51.90920900332569]
Implicit neural representations (INRs) offer a compact and continuous framework for modeling spatially structured data.<n>Recent approaches address this by introducing additional features along rigid geometric structures.<n>We propose a simple yet effective alternative: Feature-Adaptive INR (FA-INR)
arXiv Detail & Related papers (2025-06-07T16:45:17Z) - Back to Bayesics: Uncovering Human Mobility Distributions and Anomalies with an Integrated Statistical and Neural Framework [14.899157568336731]
DeepBayesic is a novel framework that integrates Bayesian principles with deep neural networks to model the underlying distributions.
We evaluate our approach on several mobility datasets, demonstrating significant improvements over state-of-the-art anomaly detection methods.
arXiv Detail & Related papers (2024-10-01T19:02:06Z) - Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in few seconds on commodity hardware, integrate with deep neural networks and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z) - Tree-Guided Rare Feature Selection and Logic Aggregation with Electronic
Health Records Data [7.422597776308963]
We propose a tree-guided feature selection and logic aggregation approach for large-scale regression with rare binary features.
In a suicide risk study with EHR data, our approach is able to select and aggregate prior mental health diagnoses.
arXiv Detail & Related papers (2022-06-18T03:52:43Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the ( aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.