Related papers: PaPaGei: Open Foundation Models for Optical Physiological Signals

PaPaGei: Open Foundation Models for Optical Physiological Signals

URL: http://arxiv.org/abs/2410.20542v2
Date: Wed, 05 Feb 2025 16:14:27 GMT
Title: PaPaGei: Open Foundation Models for Optical Physiological Signals
Authors: Arvind Pillai, Dimitris Spathis, Fahim Kawsar, Mohammad Malekzadeh,
Abstract summary: Photoplethysmography is the leading non-invasive technique for monitoring biosignals and cardiovascular health.<n>Machine learning models trained on PPG signals tend to be task-specific and struggle with generalization.<n>We present PaPaGei, the first open foundation model for PPG signals.
Score: 8.78925327256804
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Photoplethysmography (PPG) is the leading non-invasive technique for monitoring biosignals and cardiovascular health, with widespread adoption in both clinical settings and consumer wearable devices. While machine learning models trained on PPG signals have shown promise, they tend to be task-specific and struggle with generalization. Current research is limited by the use of single-device datasets, insufficient exploration of out-of-domain generalization, and a lack of publicly available models, which hampers reproducibility. To address these limitations, we present PaPaGei, the first open foundation model for PPG signals. The model is pre-trained on over 57,000 hours of data, comprising 20 million unlabeled PPG segments from publicly available datasets. We introduce a novel representation learning approach that leverages domain knowledge of PPG signal morphology across individuals, enabling the capture of richer representations compared to traditional contrastive learning methods. We evaluate PaPaGei against state-of-the-art time-series foundation models and self-supervised learning benchmarks across 20 tasks from 10 diverse datasets, spanning cardiovascular health, sleep disorders, pregnancy monitoring, and wellbeing assessment. Our model demonstrates superior performance, improving classification and regression metrics by 6.3% and 2.9% respectively in at least 14 tasks. Notably, PaPaGei achieves these results while being more data- and parameter-efficient, outperforming models that are 70x larger. Beyond accuracy, we examine model robustness across different skin tones, establishing a benchmark for bias evaluation in future models. PaPaGei can serve as both a feature extractor and an encoder for multimodal models, opening up new opportunities for multimodal health monitoring.

Related papers

Patient-specific vs Multi-Patient Vision Transformer for Markerless Tumor Motion Forecasting [0.0]
This work introduces a markerless forecasting approach for lung tumor motion using Vision Transformers (ViT)<n>Two training strategies are evaluated under clinically realistic constraints: a patient-specific (PS) approach that learns individualized motion patterns, and a multi-patient (MP) model designed for generalization.
arXiv Detail & Related papers (2025-07-10T14:40:52Z)
WorldPM: Scaling Human Preference Modeling [130.23230492612214]
We propose World Preference Modeling$ (WorldPM) to emphasize this scaling potential.<n>We collect preference data from public forums covering diverse user communities.<n>We conduct extensive training using 15M-scale data across models ranging from 1.5B to 72B parameters.
arXiv Detail & Related papers (2025-05-15T17:38:37Z)
AI-Assisted Colonoscopy: Polyp Detection and Segmentation using Foundation Models [0.10037949839020764]
In colonoscopy, 80% of the missed polyps could be detected with the help of Deep Learning models. In the search for algorithms capable of addressing this challenge, foundation models emerge as promising candidates. Their zero-shot or few-shot learning capabilities, facilitate generalization to new data or tasks without extensive fine-tuning. A comprehensive evaluation of foundation models for polyp segmentation was conducted, assessing both detection and delimitation.
arXiv Detail & Related papers (2025-03-31T14:20:53Z)
Continuous Cardiac Arrest Prediction in ICU using PPG Foundation Model [6.469423282286416]
Non-invasive patient monitoring for tracking and predicting acute health events is an emerging area of research. We present IHCA prediction results in ICU patients using only unimodal (signal waveform) deep representations. We also provide comprehensive analysis of our model through architectural tuning and PaCMAP visualization of patient health trajectory in latent space.
arXiv Detail & Related papers (2025-02-12T18:01:04Z)
Finetuning and Quantization of EEG-Based Foundational BioSignal Models on ECG and PPG Data for Blood Pressure Estimation [53.2981100111204]
Photoplethysmography and electrocardiography can potentially enable continuous blood pressure (BP) monitoring. Yet accurate and robust machine learning (ML) models remains challenging due to variability in data quality and patient-specific factors. In this work, we investigate whether a model pre-trained on one modality can effectively be exploited to improve the accuracy of a different signal type. Our approach achieves near state-of-the-art accuracy for diastolic BP and surpasses by 1.5x the accuracy of prior works for systolic BP.
arXiv Detail & Related papers (2025-02-10T13:33:12Z)
Pulse-PPG: An Open-Source Field-Trained PPG Foundation Model for Wearable Applications Across Lab and Field Settings [15.077834063212704]
Photoplethysmography-based foundation models are gaining traction due to their potential to generalize across diverse health applications. Existing PPG foundation models are either open-source but trained on clinical data or closed-source, limiting their applicability in real-world settings. We introduce Pulse, the first open-source PPG foundation model trained exclusively on raw PPG data collected over a 100-day field study with 120 participants.
arXiv Detail & Related papers (2025-02-03T06:56:40Z)
SMPLest-X: Ultimate Scaling for Expressive Human Pose and Shape Estimation [81.36747103102459]
Expressive human pose and shape estimation (EHPS) unifies body, hands, and face motion capture with numerous applications. Current state-of-the-art methods focus on training innovative architectural designs on confined datasets. We investigate the impact of scaling up EHPS towards a family of generalist foundation models.
arXiv Detail & Related papers (2025-01-16T18:59:46Z)
EnhancePPG: Improving PPG-based Heart Rate Estimation with Self-Supervision and Augmentation [17.617241860357407]
We present Enhance, a method that enhances state-of-the-art models by integrating self-supervised learning with data augmentation. Inspired by a U-Net-like autoencoder architecture, we utilize unsupervised PPG signal reconstruction, taking advantage of large amounts of unlabeled data. We improve the best HR estimation by 12.2%, lowering from 4.03 Beats-Per-Minute (BPM) to 3.54 BPM error on PPG-DaLiA.
arXiv Detail & Related papers (2024-12-20T13:25:50Z)
SemiHVision: Enhancing Medical Multimodal Models with a Semi-Human Annotated Dataset and Fine-Tuned Instruction Generation [13.672776832197918]
Multimodal large language models (MLLMs) have made significant strides, yet they face challenges in the medical domain due to limited specialized knowledge. We seek to address this gap at various stages of the end-to-end learning pipeline, including data collection, model fine-tuning, and evaluation.
arXiv Detail & Related papers (2024-10-19T02:35:35Z)
Evaluating Computational Pathology Foundation Models for Prostate Cancer Grading under Distribution Shifts [3.2995359570845912]
We evaluate two computational pathology foundation models: UNI (trained on more than 100,000 whole-slide images) and CONCH (trained on more than 1.1 million image-caption pairs) We find that while UNI and CONCH perform well relative to baselines, the absolute performance can still be far from satisfactory in certain settings.
arXiv Detail & Related papers (2024-10-09T09:45:53Z)
Phikon-v2, A large and public feature extractor for biomarker prediction [42.52549987351643]
We train a vision transformer using DINOv2 and publicly release one iteration of this model for further experimentation, coined Phikon-v2. While trained on publicly available histology slides, Phikon-v2 surpasses our previously released model (Phikon) and performs on par with other histopathology foundation models (FM) trained on proprietary data.
arXiv Detail & Related papers (2024-09-13T20:12:29Z)
CanvOI, an Oncology Intelligence Foundation Model: Scaling FLOPS Differently [0.0]
We present CanvOI, a ViT-g/10-based foundation model designed to enhance the capabilities of digital pathology. By introducing larger tile sizes (380 x 380 pixels) and smaller patch sizes (10 x 10 pixels), we were able to optimize the model's performance.
arXiv Detail & Related papers (2024-09-04T17:15:44Z)
Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models [69.06149482021071]
We propose a novel EHR data generation model called EHRPD. It is a diffusion-based model designed to predict the next visit based on the current one while also incorporating time interval estimation. We conduct experiments on two public datasets and evaluate EHRPD from fidelity, privacy, and utility perspectives.
arXiv Detail & Related papers (2024-06-20T02:20:23Z)
Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals. Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
Training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology. For training, we assemble a large dataset of over 697 thousand radiology image-text pairs. For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation. The inference of LlaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
A Meta-GNN approach to personalized seizure detection and classification [53.906130332172324]
We propose a personalized seizure detection and classification framework that quickly adapts to a specific patient from limited seizure samples. We train a Meta-GNN based classifier that learns a global model from a set of training patients. We show that our method outperforms the baselines by reaching 82.7% on accuracy and 82.08% on F1 score after only 20 iterations on new unseen patients.
arXiv Detail & Related papers (2022-11-01T14:12:58Z)
Generalizing electrocardiogram delineation: training convolutional neural networks with synthetic data augmentation [63.51064808536065]
Existing databases for ECG delineation are small, being insufficient in size and in the array of pathological conditions they represent. This article delves has two main contributions. First, a pseudo-synthetic data generation algorithm was developed, based in probabilistically composing ECG traces given "pools" of fundamental segments, as cropped from the original databases, and a set of rules for their arrangement into coherent synthetic traces. Second, two novel segmentation-based loss functions have been developed, which attempt at enforcing the prediction of an exact number of independent structures and at producing closer segmentation boundaries by focusing on a reduced number of samples.
arXiv Detail & Related papers (2021-11-25T10:11:41Z)
A multi-stage machine learning model on diagnosis of esophageal manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage. This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z)
Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in medical community. We present a modification of Bidirectional Representations from Transformers (BERT) model for classification sequence. We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.