Bi-Axial Transformers: Addressing the Increasing Complexity of EHR Classification
- URL: http://arxiv.org/abs/2508.12418v1
- Date: Sun, 17 Aug 2025 16:04:56 GMT
- Title: Bi-Axial Transformers: Addressing the Increasing Complexity of EHR Classification
- Authors: Rachael DeVries, Casper Christensen, Marie Lisandra Zepeda Mendoza, Ole Winther
- Abstract summary: We present the Bi-Axial Transformer (BAT), which attends to both the clinical variable and time point axes of EHR data to learn richer data relationships. BAT achieves state-of-the-art performance on sepsis prediction and is competitive with top methods for mortality classification.
- Score: 6.345037597566314
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Electronic Health Records (EHRs), the digital representation of a patient's medical history, are a valuable resource for epidemiological and clinical research. They are also becoming increasingly complex, with recent trends indicating larger datasets, longer time series, and multi-modal integrations. Transformers, which have rapidly gained popularity due to their success in natural language processing and other domains, are well-suited to address these challenges due to their ability to model long-range dependencies and process data in parallel. However, their application to EHR classification remains limited by data representations, which can reduce performance or fail to capture informative missingness. In this paper, we present the Bi-Axial Transformer (BAT), which attends to both the clinical variable and time point axes of EHR data to learn richer data relationships and address the difficulties of data sparsity. BAT achieves state-of-the-art performance on sepsis prediction and is competitive with top methods for mortality classification. In comparison to other transformers, BAT demonstrates increased robustness to data missingness, and learns unique sensor embeddings which can be used in transfer learning. Baseline models, which were previously scattered across multiple repositories or relied on deprecated libraries, were re-implemented in PyTorch and made available for reproduction and future benchmarking.
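The abstract describes attention applied along two axes of an EHR grid: once over time points for each clinical variable, and once over variables at each time point. The paper's actual architecture is not reproduced here; the following is a minimal, hypothetical numpy sketch of the bi-axial idea, with single-head self-attention and no learned projections, masking, or missingness handling:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # scaled dot-product self-attention over the second-to-last axis;
    # Q = K = V = x, no learned projections (illustration only)
    scores = x @ x.swapaxes(-1, -2) / np.sqrt(x.shape[-1])
    return softmax(scores, axis=-1) @ x

def bi_axial_attention(x):
    """x: (V, T, D) grid of V clinical variables, T time points, D features.
    Attend along the time axis for each variable, then along the
    variable axis for each time point."""
    time_mixed = self_attention(x)            # (V, T, D): attend over T
    xt = time_mixed.swapaxes(0, 1)            # (T, V, D)
    var_mixed = self_attention(xt)            # attend over V
    return var_mixed.swapaxes(0, 1)           # back to (V, T, D)

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 24, 8))  # 6 variables, 24 time points, 8 features
y = bi_axial_attention(x)
print(y.shape)  # (6, 24, 8)
```

The output keeps the input's shape, so per-variable embeddings (e.g. the sensor embeddings mentioned above) can be read off one axis of the mixed representation. A real implementation would add learned Q/K/V projections, multiple heads, residual connections, and a strategy for encoding missing measurements.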
Related papers
- impuTMAE: Multi-modal Transformer with Masked Pre-training for Missing Modalities Imputation in Cancer Survival Prediction [75.43342771863837]
We introduce impuTMAE, a novel transformer-based end-to-end approach with an efficient multimodal pre-training strategy. It learns inter- and intra-modal interactions while simultaneously imputing missing modalities by reconstructing masked patches. Our model is pre-trained on heterogeneous, incomplete data and fine-tuned for glioma survival prediction using TCGA-GBM/LGG and BraTS datasets.
arXiv Detail & Related papers (2025-08-08T10:01:16Z) - CAAT-EHR: Cross-Attentional Autoregressive Transformer for Multimodal Electronic Health Record Embeddings [0.0]
We introduce CAAT-EHR, a novel architecture designed to generate task-agnostic longitudinal embeddings from raw EHR data. An autoregressive decoder complements the encoder by predicting future time points' data during pre-training, ensuring that the resulting embeddings maintain temporal consistency and alignment.
arXiv Detail & Related papers (2025-01-31T05:00:02Z) - Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD.
It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation.
We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
arXiv Detail & Related papers (2024-06-20T02:20:23Z) - Amplifying Pathological Detection in EEG Signaling Pathways through Cross-Dataset Transfer Learning [10.212217551908525]
We study the effectiveness of data and model scaling and cross-dataset knowledge transfer in a real-world pathology classification task.
We identify the challenges of possible negative transfer and emphasize the significance of some key components.
Our findings indicate that a small, generic model (e.g. ShallowNet) performs well on a single dataset, whereas a larger model (e.g. TCN) performs better when transferring and learning from a larger, more diverse dataset.
arXiv Detail & Related papers (2023-09-19T20:09:15Z) - Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z) - Label scarcity in biomedicine: Data-rich latent factor discovery enhances phenotype prediction [102.23901690661916]
Low-dimensional embedding spaces can be derived from the UK Biobank population dataset to enhance data-scarce prediction of health indicators, lifestyle and demographic characteristics.
Performance gains from semi-supervised approaches will probably become an important ingredient for various medical data science applications.
arXiv Detail & Related papers (2021-10-12T16:25:50Z) - Sequential Diagnosis Prediction with Transformer and Ontological Representation [35.88195694025553]
We propose SETOR, an end-to-end robust transformer-based model that handles irregular intervals between a patient's visits using admission timestamps and the length of stay in each visit.
Experiments conducted on two real-world healthcare datasets show that our sequential diagnosis prediction model SETOR achieves better predictive results than previous state-of-the-art approaches.
arXiv Detail & Related papers (2021-09-07T13:09:55Z) - Transformer Networks for Data Augmentation of Human Physical Activity Recognition [61.303828551910634]
State-of-the-art models such as Recurrent Generative Adversarial Networks (RGAN) are used to generate realistic synthetic data.
In this paper, transformer-based generative adversarial networks, which have global attention over the data, are compared with RGAN on the PAMAP2 and Real World Human Activity Recognition datasets.
arXiv Detail & Related papers (2021-09-02T16:47:29Z) - Categorical EHR Imputation with Generative Adversarial Nets [11.171712535005357]
We propose a simple and yet effective approach that is based on previous work on GANs for data imputation.
We show that our imputation approach largely improves the prediction accuracy, compared to more traditional data imputation approaches.
arXiv Detail & Related papers (2021-08-03T18:50:26Z) - Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z) - Handling Non-ignorably Missing Features in Electronic Health Records Data Using Importance-Weighted Autoencoders [8.518166245293703]
We propose a novel extension of VAEs called Importance-Weighted Autoencoders (IWAEs) to flexibly handle Missing Not At Random patterns in the Physionet data.
Our proposed method models the missingness mechanism using an embedded neural network, eliminating the need to specify the exact form of the missingness mechanism a priori.
arXiv Detail & Related papers (2021-01-18T22:53:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.