A Vector-Quantized Foundation Model for Patient Behavior Monitoring
- URL: http://arxiv.org/abs/2503.15221v2
- Date: Mon, 14 Jul 2025 08:43:18 GMT
- Title: A Vector-Quantized Foundation Model for Patient Behavior Monitoring
- Authors: Rodrigo Oliver, Josué Pérez-Sabater, Leire Paz-Arbaizar, Diego Herrero-Quevedo, Antonio Artés-Rodríguez, Alejandro Lancho, Pablo M. Olmos
- Abstract summary: This paper introduces a novel foundation model based on a modified vector quantized variational autoencoder, specifically designed to process real-world data from smartphones and wearable devices. We leveraged the discrete latent representation of this model to effectively perform two downstream tasks, suicide risk assessment and emotional state prediction, on different held-out clinical cohorts without the need of fine-tuning.
- Score: 43.02353546717171
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Foundation models have achieved remarkable success across various domains, yet their adoption in healthcare remains limited. While significant advances have been made in medical imaging, genetic biomarkers, and time series from electronic health records, the potential of foundation models for patient behavior monitoring through personal digital devices remains underexplored. The data generated by these devices are inherently heterogeneous, multisource, and often exhibit high rates of missing data, posing unique challenges. This paper introduces a novel foundation model based on a modified vector quantized variational autoencoder, specifically designed to process real-world data from smartphones and wearable devices. We leveraged the discrete latent representation of this model to effectively perform two downstream tasks, suicide risk assessment and emotional state prediction, on different held-out clinical cohorts without the need of fine-tuning. We also highlight the existence of a trade-off between discrete and continuous latent structures, suggesting that hybrid models may be optimal for balancing accuracy across various supervised and unsupervised tasks.
Related papers
- Continually Evolved Multimodal Foundation Models for Cancer Prognosis [50.43145292874533]
Cancer prognosis is a critical task that involves predicting patient outcomes and survival rates. Previous studies have integrated diverse data modalities, such as clinical notes, medical images, and genomic data, leveraging their complementary information. Existing approaches face two major limitations. First, they struggle to incorporate newly arrived data with varying distributions into training, such as patient records from different hospitals. Second, most multimodal integration methods rely on simplistic concatenation or task-specific pipelines, which fail to capture the complex interdependencies across modalities.
arXiv Detail & Related papers (2025-01-30T06:49:57Z) - LV-CadeNet: Long View Feature Convolution-Attention Fusion Encoder-Decoder Network for Clinical MEG Spike Detection [5.140340328388902]
We introduce LV-CadeNet, designed for automatic MEG spike detection in real-world clinical scenarios. Our approach also mimics human specialists by constructing long view morphological input data. LV-CadeNet significantly improves the accuracy of MEG spike detection, boosting it from 42.31% to 54.88% on a novel clinical dataset.
arXiv Detail & Related papers (2024-12-12T03:19:44Z) - Bed-Attached Vibration Sensor System: A Machine Learning Approach for Fall Detection in Nursing Homes [33.45861095003339]
This study presents the development of an automated fall detection system integrated into care beds, aimed at enhancing patient safety without compromising privacy through wearables or video monitoring. Mechanical vibrations transmitted through the bed frame are processed using a short-time Fourier transform, enabling robust classification of distinct human fall patterns with a convolutional neural network. Despite limited available data, the proposed system shows the potential for an accurate and rapid response to falls, mitigating health implications, and addressing the needs of an aging population.
arXiv Detail & Related papers (2024-12-06T11:08:47Z) - Back to Bayesics: Uncovering Human Mobility Distributions and Anomalies with an Integrated Statistical and Neural Framework [14.899157568336731]
DeepBayesic is a novel framework that integrates Bayesian principles with deep neural networks to model the underlying distributions.
We evaluate our approach on several mobility datasets, demonstrating significant improvements over state-of-the-art anomaly detection methods.
arXiv Detail & Related papers (2024-10-01T19:02:06Z) - Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD.
It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation.
We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
arXiv Detail & Related papers (2024-06-20T02:20:23Z) - A Reliable Framework for Human-in-the-Loop Anomaly Detection in Time Series [17.08674819906415]
We introduce HILAD, a novel framework designed to foster a dynamic and bidirectional collaboration between humans and AI. Through our visual interface, HILAD empowers domain experts to detect, interpret, and correct unexpected model behaviors at scale.
arXiv Detail & Related papers (2024-05-06T07:44:07Z) - IGNITE: Individualized GeNeration of Imputations in Time-series Electronic health records [6.630372114304835]
We propose a novel deep-learning model that learns the underlying patient dynamics to generate personalized values conditioning on an individual's demographic characteristics and treatments. Our proposed model, IGNITE, utilises a conditional dual-variational autoencoder augmented with dual-stage attention to generate missing values for an individual. We show that IGNITE outperforms state-of-the-art approaches in missing data reconstruction and task prediction.
arXiv Detail & Related papers (2024-01-09T07:57:21Z) - MPRE: Multi-perspective Patient Representation Extractor for Disease Prediction [3.914545513460964]
We propose the Multi-perspective Patient Representation Extractor (MPRE) for disease prediction.
Specifically, we propose Frequency Transformation Module (FTM) to extract the trend and variation information of dynamic features.
In the 2D Multi-Extraction Network (2D MEN), we form the 2D temporal tensor based on trend and variation.
We also propose the First-Order Difference Attention Mechanism (FODAM) to calculate the contributions of differences in adjacent variations to the disease diagnosis.
arXiv Detail & Related papers (2024-01-01T13:52:05Z) - MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z) - RARE: Robust Masked Graph Autoencoder [45.485891794905946]
Masked graph autoencoder (MGAE) has emerged as a promising self-supervised graph pre-training (SGP) paradigm.
We propose a novel SGP method termed Robust mAsked gRaph autoEncoder (RARE) to improve the certainty in inferring masked data.
arXiv Detail & Related papers (2023-04-04T03:35:29Z) - Safe AI for health and beyond -- Monitoring to transform a health service [51.8524501805308]
We will assess the infrastructure required to monitor the outputs of a machine learning algorithm.
We will present two scenarios with examples of monitoring and updates of models.
arXiv Detail & Related papers (2023-03-02T17:27:45Z) - Unsupervised Pre-Training on Patient Population Graphs for Patient-Level Predictions [48.02011627390706]
Pre-training has shown success in different areas of machine learning, such as Computer Vision (CV), Natural Language Processing (NLP) and medical imaging.
In this paper, we apply unsupervised pre-training to heterogeneous, multi-modal EHR data for patient outcome prediction.
We find that our proposed graph based pre-training method helps in modeling the data at a population level.
arXiv Detail & Related papers (2022-03-23T17:59:45Z) - In-Bed Human Pose Estimation from Unseen and Privacy-Preserving Image Domains [22.92165116962952]
In-bed human posture estimation provides important health-related metrics with potential value in medical condition assessments.
We propose a multi-modal conditional variational autoencoder (MC-VAE) capable of reconstructing features from missing modalities seen during training.
We demonstrate that body positions can be effectively recognized from the available modality, achieving on par results with baseline models.
arXiv Detail & Related papers (2021-11-30T04:56:16Z) - SANSformers: Self-Supervised Forecasting in Electronic Health Records with Attention-Free Models [48.07469930813923]
This work aims to forecast the demand for healthcare services, by predicting the number of patient visits to healthcare facilities.
We introduce SANSformer, an attention-free sequential model designed with specific inductive biases to cater for the unique characteristics of EHR data.
Our results illuminate the promising potential of tailored attention-free models and self-supervised pretraining in refining healthcare utilization predictions across various patient demographics.
arXiv Detail & Related papers (2021-08-31T08:23:56Z) - Multiple Organ Failure Prediction with Classifier-Guided Generative Adversarial Imputation Networks [4.040013871160853]
Multiple organ failure (MOF) is a severe syndrome with a high mortality rate among Intensive Care Unit (ICU) patients.
Applying machine learning models to electronic health records is a challenge due to the pervasiveness of missing values.
arXiv Detail & Related papers (2021-06-22T15:49:01Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - Real-time Prediction for Mechanical Ventilation in COVID-19 Patients using A Multi-task Gaussian Process Multi-objective Self-attention Network [9.287068570192057]
We propose a robust in-time predictor for in-hospital COVID-19 patients' probability of requiring mechanical ventilation.
A challenge in the risk prediction for COVID-19 patients lies in the great variability and irregular sampling of patients' vitals and labs observed in the clinical setting.
We frame the prediction task into a multi-objective learning framework, and the risk scores at all time points are optimized altogether.
arXiv Detail & Related papers (2021-02-01T20:35:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.