Transformer representation learning is necessary for dynamic multi-modal physiological data on small-cohort patients
- URL: http://arxiv.org/abs/2504.04120v2
- Date: Fri, 11 Apr 2025 03:05:17 GMT
- Title: Transformer representation learning is necessary for dynamic multi-modal physiological data on small-cohort patients
- Authors: Bingxu Wang, Yapeng Wang, Kunzhi Cai, Yuqi Zhang, Zeyi Zhou, Yachong Guo, Wei Wang, Qing Zhou
- Abstract summary: Postoperative delirium (POD) is a severe neuropsychiatric complication affecting nearly 50% of high-risk surgical patients. We propose a POD prediction framework comprising a Transformer representation model followed by traditional machine learning algorithms.
- Score: 13.200153129546983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Postoperative delirium (POD), a severe neuropsychiatric complication affecting nearly 50% of high-risk surgical patients, is defined as an acute disorder of attention and cognition. It remains significantly underdiagnosed in intensive care units (ICUs) due to subjective monitoring methods. Early and accurate diagnosis of POD is critical and achievable. Here, we propose a POD prediction framework comprising a Transformer representation model followed by traditional machine learning algorithms. Our approach utilizes multi-modal physiological data, including amplitude-integrated electroencephalography (aEEG), vital signs, electrocardiographic monitor data, and hemodynamic parameters. We curated the first multi-modal POD dataset encompassing two patient types and evaluated various Transformer architectures for representation learning. Empirical results indicate consistent improvements in sensitivity and Youden index for patient TYPE I when using Transformer representations, particularly our fusion adaptation of Pathformer. By enabling effective delirium diagnosis from postoperative day 1 to 3, our extensive experimental findings emphasize the potential of multi-modal physiological data and highlight the necessity of representation learning via multi-modal Transformer architectures in clinical diagnosis.
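The abstract describes a two-stage pipeline: a Transformer-style model first learns per-patient representations from multi-modal physiological time series, and a traditional machine learning classifier then predicts POD from those representations, with performance reported via sensitivity and Youden index. A minimal sketch of that pipeline is shown below; all names, dimensions, the synthetic data, the single random-weight attention layer, and the nearest-centroid classifier are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def self_attention_pool(X, Wq, Wk, Wv):
    """Pool a (T, d) multi-modal time series into one d-dim patient representation."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = Q @ K.T / np.sqrt(K.shape[1])              # scaled dot-product scores
    A = np.exp(A - A.max(axis=1, keepdims=True))   # row-wise softmax
    A /= A.sum(axis=1, keepdims=True)
    H = A @ V                                      # contextualized time steps
    return H.mean(axis=0)                          # mean-pooled representation

T, d, n = 24, 8, 40                                # hypothetical: time steps, features, patients per class
Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))

# Synthetic stand-ins for multi-modal signals (e.g., aEEG, vitals, ECG, hemodynamics
# stacked as feature channels); POD-positive patients get a shifted mean.
X_pos = [rng.standard_normal((T, d)) + 0.5 for _ in range(n)]
X_neg = [rng.standard_normal((T, d)) - 0.5 for _ in range(n)]
reps = np.array([self_attention_pool(x, Wq, Wk, Wv) for x in X_pos + X_neg])
labels = np.array([1] * n + [0] * n)

# Stage 2, "traditional ML": a nearest-centroid classifier on the representations.
c1, c0 = reps[labels == 1].mean(axis=0), reps[labels == 0].mean(axis=0)
pred = (np.linalg.norm(reps - c1, axis=1) < np.linalg.norm(reps - c0, axis=1)).astype(int)

sens = (pred[labels == 1] == 1).mean()             # sensitivity (true positive rate)
spec = (pred[labels == 0] == 0).mean()             # specificity (true negative rate)
youden = sens + spec - 1                           # Youden index, the paper's metric
print(sens, spec, round(youden, 2))
```

The point of the sketch is the separation of concerns: the attention stage turns variable, multi-channel physiology into a fixed-length vector, after which any conventional classifier can be swapped in for the centroid rule.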
Related papers
- Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates. Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information. Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals. Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
arXiv Detail & Related papers (2025-01-30T06:49:57Z) - RISE-iEEG: Robust to Inter-Subject Electrodes Implantation Variability iEEG Classifier [0.0]
RISE-iEEG stands for Robust Inter-Subject Electrode Implantation Variability iEEG.
We developed an iEEG decoder model that can be applied across multiple patients' data without requiring the electrode coordinates of each patient.
Our analysis shows that the performance of RISE-iEEG is 10% higher than that of HTNet and EEGNet in terms of F1 score.
arXiv Detail & Related papers (2024-08-12T18:33:19Z) - Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports [51.45762396192655]
Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecting the medical field. Notably, Gemini-Vision-series (Gemini) and GPT-4-series (GPT-4) models have epitomized a paradigm shift in Artificial General Intelligence for computer vision.
This study evaluated the performance of the Gemini and GPT-4 models, along with four other popular large models, in an exhaustive evaluation across 14 medical imaging datasets.
arXiv Detail & Related papers (2024-07-08T09:08:42Z) - MPRE: Multi-perspective Patient Representation Extractor for Disease
Prediction [3.914545513460964]
We propose the Multi-perspective Patient Representation Extractor (MPRE) for disease prediction.
Specifically, we propose Frequency Transformation Module (FTM) to extract the trend and variation information of dynamic features.
In the 2D Multi-Extraction Network (2D MEN), we form the 2D temporal tensor based on trend and variation.
We also propose the First-Order Difference Attention Mechanism (FODAM) to calculate the contributions of differences in adjacent variations to the disease diagnosis.
arXiv Detail & Related papers (2024-01-01T13:52:05Z) - XAI for In-hospital Mortality Prediction via Multimodal ICU Data [57.73357047856416]
We propose an efficient, explainable AI solution for predicting in-hospital mortality via multimodal ICU data.
We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions.
Our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.
arXiv Detail & Related papers (2023-12-29T14:28:04Z) - Radiology Report Generation Using Transformers Conditioned with
Non-imaging Data [55.17268696112258]
This paper proposes a novel multi-modal transformer network that integrates chest x-ray (CXR) images and associated patient demographic information.
The proposed network uses a convolutional neural network to extract visual features from CXRs and a transformer-based encoder-decoder network that combines the visual features with semantic text embeddings of patient demographic information.
arXiv Detail & Related papers (2023-11-18T14:52:26Z) - fMRI-PTE: A Large-scale fMRI Pretrained Transformer Encoder for
Multi-Subject Brain Activity Decoding [54.17776744076334]
We propose fMRI-PTE, an innovative auto-encoder approach for fMRI pre-training.
Our approach involves transforming fMRI signals into unified 2D representations, ensuring consistency in dimensions and preserving brain activity patterns.
Our contributions encompass introducing fMRI-PTE, innovative data transformation, efficient training, a novel learning strategy, and the universal applicability of our approach.
arXiv Detail & Related papers (2023-11-01T07:24:22Z) - A Transformer-based representation-learning model with unified
processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner.
The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z) - MVMTnet: A Multi-variate Multi-modal Transformer for Multi-class
Classification of Cardiac Irregularities Using ECG Waveforms and Clinical
Notes [4.648677931378919]
Deep learning can be used to optimize diagnosis and patient monitoring for clinical-based applications.
For cardiovascular disease, a condition where the rising number of patients increasingly outweighs the availability of medical resources in many parts of the world, a core challenge is the automated classification of various cardiac abnormalities.
The proposed novel multi-modal Transformer architecture can accurately perform this task while demonstrating the cross-domain effectiveness of Transformers.
arXiv Detail & Related papers (2023-02-21T21:38:41Z) - EEG-Based Epileptic Seizure Prediction Using Temporal Multi-Channel
Transformers [1.0970480513577103]
Epilepsy is one of the most common neurological diseases, characterized by transient and unprovoked events called epileptic seizures.
EEG is an auxiliary method used to perform both the diagnosis and the monitoring of epilepsy.
Given the unexpected nature of an epileptic seizure, its prediction would improve patient care, optimizing the quality of life and the treatment of epilepsy.
arXiv Detail & Related papers (2022-09-18T03:03:47Z) - A Novel TSK Fuzzy System Incorporating Multi-view Collaborative Transfer
Learning for Personalized Epileptic EEG Detection [20.11589208667256]
We propose a TSK fuzzy system-based epilepsy detection algorithm that integrates multi-view collaborative transfer learning.
The proposed method has the potential to detect epileptic EEG signals effectively.
arXiv Detail & Related papers (2021-11-11T12:15:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.