Bidirectional Generative Pre-training for Improving Healthcare Time-series Representation Learning
- URL: http://arxiv.org/abs/2402.09558v3
- Date: Fri, 23 Aug 2024 18:25:37 GMT
- Title: Bidirectional Generative Pre-training for Improving Healthcare Time-series Representation Learning
- Authors: Ziyang Song, Qincheng Lu, He Zhu, David Buckeridge, Yue Li
- Abstract summary: We propose a novel architecture called the Bidirectional Timely Generative Pre-trained Transformer (BiTimelyGPT).
BiTimelyGPT pre-trains on biosignals and longitudinal clinical records by both next-token and previous-token prediction in alternating transformer layers.
Using biosignals and longitudinal clinical records, BiTimelyGPT demonstrates superior performance in predicting neurological functionality, disease diagnosis, and physiological signs.
- Score: 9.621781933666844
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning time-series representations for discriminative tasks, such as classification and regression, has been a long-standing challenge in the healthcare domain. Current pre-training methods are limited to either unidirectional next-token prediction or randomly masked token prediction. We propose a novel architecture called the Bidirectional Timely Generative Pre-trained Transformer (BiTimelyGPT), which pre-trains on biosignals and longitudinal clinical records by both next-token and previous-token prediction in alternating transformer layers. This pre-training task preserves the original distribution and data shapes of the time series. Additionally, the full-rank forward and backward attention matrices exhibit more expressive representation capabilities. Using biosignals and longitudinal clinical records, BiTimelyGPT demonstrates superior performance in predicting neurological functionality, disease diagnosis, and physiological signs. By visualizing the attention heatmap, we observe that the pre-trained BiTimelyGPT can identify discriminative segments from biosignal time-series sequences, even more so after fine-tuning on the task.
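As a minimal sketch of the alternating-direction idea, the snippet below builds transformer layers whose causal attention mask points forward (next-token prediction) in even layers and backward (previous-token prediction) in odd layers. It illustrates the concept only; the pre-norm layout, dimensions, and head count are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DirectionalCausalLayer(nn.Module):
    """Transformer layer with a one-directional causal attention mask.
    forward_direction=True trains next-token prediction (attend to the past);
    False trains previous-token prediction (attend to the future)."""
    def __init__(self, d_model: int, n_heads: int, forward_direction: bool):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))
        self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
        self.forward_direction = forward_direction

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        T = x.size(1)
        ones = torch.ones(T, T, dtype=torch.bool, device=x.device)
        # True entries are masked out by nn.MultiheadAttention.
        mask = torch.triu(ones, 1) if self.forward_direction else torch.tril(ones, -1)
        h = self.norm1(x)
        h, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + h
        return x + self.ff(self.norm2(x))

# Alternate directions layer by layer, as the abstract describes.
layers = nn.ModuleList(DirectionalCausalLayer(64, 4, forward_direction=(i % 2 == 0))
                       for i in range(4))
x = torch.randn(2, 128, 64)  # (batch, time steps, features)
for layer in layers:
    x = layer(x)
```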
Related papers
- L-MAE: Longitudinal masked auto-encoder with time and severity-aware encoding for diabetic retinopathy progression prediction [2.663690023739801]
Pre-training strategies based on self-supervised learning (SSL) have proven effective as pretext tasks for many downstream computer-vision tasks.
We developed a longitudinal masked auto-encoder (MAE) based on the well-known Transformer-based MAE.
Using OPHDIAT, a large follow-up screening dataset targeting diabetic retinopathy (DR), we evaluated the pre-trained weights on a longitudinal task.
arXiv Detail & Related papers (2024-03-24T19:34:33Z)
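A time-aware encoding of the kind named in L-MAE's title can be sketched as a sinusoidal positional encoding evaluated at real elapsed time instead of integer position, so irregular follow-up intervals are reflected. This is a generic illustration under my own assumptions (dimension, frequency base), not the paper's exact encoding.

```python
import torch

def time_aware_encoding(elapsed_days: torch.Tensor, d_model: int) -> torch.Tensor:
    """Sinusoidal encoding evaluated at continuous elapsed time, so visits
    180 days apart get farther-apart codes than visits 7 days apart."""
    half = d_model // 2
    freqs = torch.exp(-torch.arange(half, dtype=torch.float)
                      * (torch.log(torch.tensor(10000.0)) / half))
    angles = elapsed_days.unsqueeze(-1) * freqs          # (..., d_model/2)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)

# e.g. three screening exams at day 0, 180, and 420 since the first visit
codes = time_aware_encoding(torch.tensor([0.0, 180.0, 420.0]), d_model=64)
```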
- Temporal Cross-Attention for Dynamic Embedding and Tokenization of Multimodal Electronic Health Records [1.6609516435725236]
We introduce a dynamic embedding and tokenization framework for precise representation of multimodal clinical time series.
Our framework outperformed baseline approaches on the task of predicting the occurrence of nine postoperative complications.
arXiv Detail & Related papers (2024-03-06T19:46:44Z)
- TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling [67.02157180089573]
Time series pre-training has recently garnered wide attention for its potential to reduce labeling expenses and benefit various downstream tasks.
This paper proposes TimeSiam, a simple but effective self-supervised pre-training framework for time series based on Siamese networks.
arXiv Detail & Related papers (2024-02-04T13:10:51Z)
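As a rough sketch of the Siamese pairing idea behind TimeSiam, the snippet below samples a past and a current subseries from one sequence, encodes both with a single shared encoder, and pulls the two representations together. The GRU encoder, window length, and cosine objective are placeholders of mine; TimeSiam's actual pre-training task is more elaborate.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.GRU(input_size=1, hidden_size=64, batch_first=True)  # stand-in shared encoder

def siamese_pair_loss(series: torch.Tensor, win: int = 48) -> torch.Tensor:
    """Sample a past and a current subseries from the same series, encode
    both with one shared (Siamese) encoder, and pull the two representations
    together with a cosine objective."""
    B, T, _ = series.shape
    i = torch.randint(0, T - 2 * win, (1,)).item()        # start of the past window
    j = torch.randint(i + win, T - win + 1, (1,)).item()  # start of the current window
    _, h_past = encoder(series[:, i:i + win])
    _, h_curr = encoder(series[:, j:j + win])
    return 1 - F.cosine_similarity(h_past[-1], h_curr[-1], dim=-1).mean()

loss = siamese_pair_loss(torch.randn(8, 256, 1))  # (batch, time steps, channels)
```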
- TimelyGPT: Extrapolatable Transformer Pre-training for Long-term Time-Series Forecasting in Healthcare [14.14872125241069]
We present the Timely Generative Pre-trained Transformer (TimelyGPT).
TimelyGPT employs an extrapolatable position (xPos) embedding to encode trend and periodic patterns into time-series representations.
It also integrates recurrent attention and temporal convolution modules to effectively capture global-local temporal dependencies.
arXiv Detail & Related papers (2023-11-29T19:09:28Z)
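TimelyGPT's xPos embedding can be illustrated by the per-dimension exponential scaling it adds on top of rotary position embeddings: queries are scaled by s**n and keys by s**(-n), so attention logits decay with relative distance while staying translation-invariant. The sketch follows the published xPos formula; the decay constant gamma and head size are illustrative, not TimelyGPT's settings.

```python
import torch

def xpos_scale(positions: torch.Tensor, d_head: int, gamma: float = 0.4) -> torch.Tensor:
    """Per-dimension scale s_i ** n with s_i = (i/(d/2) + gamma) / (1 + gamma),
    applied to rotary-embedded queries (and its reciprocal to keys)."""
    idx = torch.arange(0, d_head, 2, dtype=torch.float) / d_head  # i/(d/2), paired dims
    s = (idx + gamma) / (1 + gamma)                               # bases in (0, 1]
    scale = s.unsqueeze(0) ** positions.unsqueeze(-1)             # (T, d_head/2)
    return torch.repeat_interleave(scale, 2, dim=-1)              # (T, d_head)

T, d_head = 16, 64
pos = torch.arange(T, dtype=torch.float)
q_scale = xpos_scale(pos, d_head)        # multiply rotary queries elementwise
k_scale = 1.0 / xpos_scale(pos, d_head)  # multiply rotary keys elementwise
```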
- Time Associated Meta Learning for Clinical Prediction [78.99422473394029]
We propose a novel time-associated meta-learning (TAML) method to make effective predictions at multiple future time points.
To address the sparsity problem after task splitting, TAML employs a temporal information sharing strategy to augment the number of positive samples.
We demonstrate the effectiveness of TAML on multiple clinical datasets, where it consistently outperforms a range of strong baselines.
arXiv Detail & Related papers (2023-03-05T03:54:54Z)
- T-Phenotype: Discovering Phenotypes of Predictive Temporal Patterns in Disease Progression [82.85825388788567]
We develop a novel temporal clustering method, T-Phenotype, to discover phenotypes of predictive temporal patterns from labeled time-series data.
We show that T-Phenotype achieves the best phenotype discovery performance over all the evaluated baselines.
arXiv Detail & Related papers (2023-02-24T13:30:35Z)
- Prior Knowledge-Guided Attention in Self-Supervised Vision Transformers [79.60022233109397]
We present spatial prior attention (SPAN), a framework that takes advantage of consistent spatial and semantic structure in unlabeled image datasets.
SPAN operates by regularizing attention masks from separate transformer heads to follow various priors over semantic regions.
We find that the resulting attention masks are more interpretable than those derived from domain-agnostic pretraining.
arXiv Detail & Related papers (2022-09-07T02:30:36Z)
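One way to picture SPAN's attention-mask regularization is a penalty pushing a head's attention distribution over image patches toward a prior over a semantic region. The squared-error form, the single prior, and the 14x14 patch grid below are simplifications of mine; SPAN assigns different priors to different heads.

```python
import torch

def attention_prior_loss(attn: torch.Tensor, prior: torch.Tensor) -> torch.Tensor:
    """Penalize divergence between an attention distribution over patches
    and a prior distribution over a semantic region."""
    attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-8)
    prior = prior / prior.sum(dim=-1, keepdim=True).clamp_min(1e-8)
    return ((attn - prior) ** 2).sum(dim=-1).mean()

# [CLS]-token attention over a 14x14 patch grid for one head
attn = torch.rand(8, 196)    # (batch, patches)
prior = torch.zeros(196)
prior[60:136] = 1.0          # toy prior: a central band of patches
loss = attention_prior_loss(attn, prior.expand(8, -1))
```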
- CEHR-BERT: Incorporating temporal information from structured EHR data to improve prediction tasks [0.0]
We develop a new BERT adaptation, CEHR-BERT, to incorporate temporal information using a hybrid approach.
CEHR-BERT was trained on a subset of Columbia University Irving Medical Center-New York Presbyterian Hospital's clinical data.
arXiv Detail & Related papers (2021-11-10T16:53:32Z)
- STELAR: Spatio-temporal Tensor Factorization with Latent Epidemiological Regularization [76.57716281104938]
We develop a tensor method to predict the evolution of epidemic trends for many regions simultaneously.
STELAR enables long-term prediction by incorporating latent temporal regularization through a system of discrete-time difference equations.
We conduct experiments using both county- and state-level COVID-19 data and show that our model can identify interesting latent patterns of the epidemic.
arXiv Detail & Related papers (2020-12-08T21:21:47Z)
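The discrete-time difference equations behind STELAR's latent regularization are epidemiological; a plain SIR system, sketched below as a worked example, is the canonical instance. The SIR form and rate constants here are stand-ins, not STELAR's exact constraint.

```python
import numpy as np

def sir_step(S: float, I: float, R: float, beta: float, gamma: float):
    """One step of the discrete-time SIR difference equations:
    S[t+1] = S[t] - beta*S[t]*I[t]
    I[t+1] = I[t] + beta*S[t]*I[t] - gamma*I[t]
    R[t+1] = R[t] + gamma*I[t]"""
    new_infections = beta * S * I
    recoveries = gamma * I
    return S - new_infections, I + new_infections - recoveries, R + recoveries

S, I, R = 0.99, 0.01, 0.0   # population fractions
infected = []
for _ in range(120):
    S, I, R = sir_step(S, I, R, beta=0.3, gamma=0.1)
    infected.append(I)
peak_day = int(np.argmax(infected))  # day the infected fraction peaks
```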
- BiteNet: Bidirectional Temporal Encoder Network to Predict Medical Outcomes [53.163089893876645]
We propose a novel self-attention mechanism that captures the contextual dependency and temporal relationships within a patient's healthcare journey.
An end-to-end bidirectional temporal encoder network (BiteNet) then learns representations of the patient's journeys.
We have evaluated the effectiveness of our methods on two supervised prediction and two unsupervised clustering tasks with a real-world EHR dataset.
arXiv Detail & Related papers (2020-09-24T00:42:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.