Stanford Sleep Bench: Evaluating Polysomnography Pre-training Methods for Sleep Foundation Models
- URL: http://arxiv.org/abs/2512.09591v1
- Date: Wed, 10 Dec 2025 12:37:29 GMT
- Title: Stanford Sleep Bench: Evaluating Polysomnography Pre-training Methods for Sleep Foundation Models
- Authors: Magnus Ruud Kjaer, Rahul Thapa, Gauri Ganjoo, Hyatt Moore, Poul Joergen Jennum, Brandon M. Westover, James Zou, Emmanuel Mignot, Bryan He, Andreas Brink-Kjaer
- Abstract summary: We release Stanford Sleep Bench, a large-scale PSG dataset comprising 17,467 recordings totaling over 163,000 hours from a major sleep clinic. Our results show that multiple pretraining methods achieve comparable performance for sleep staging, apnea diagnosis, and age estimation. To facilitate and advance sleep research, we will release Stanford Sleep Bench along with pretrained model weights, training pipelines, and evaluation code.
- Score: 18.499236999143474
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Polysomnography (PSG), the gold standard test for sleep analysis, generates vast amounts of multimodal clinical data, presenting an opportunity to leverage self-supervised representation learning (SSRL) for pre-training foundation models to enhance sleep analysis. However, progress in sleep foundation models is hindered by two key limitations: (1) the lack of a shared dataset and benchmark with diverse tasks for training and evaluation, and (2) the absence of a systematic evaluation of SSRL approaches across sleep-related tasks. To address these gaps, we introduce Stanford Sleep Bench, a large-scale PSG dataset comprising 17,467 recordings totaling over 163,000 hours from a major sleep clinic, including 13 clinical disease prediction tasks alongside canonical sleep-related tasks such as sleep staging, apnea diagnosis, and age estimation. We systematically evaluate SSRL pre-training methods on Stanford Sleep Bench, assessing downstream performance across four tasks: sleep staging, apnea diagnosis, age estimation, and disease and mortality prediction. Our results show that multiple pretraining methods achieve comparable performance for sleep staging, apnea diagnosis, and age estimation. However, for mortality and disease prediction, contrastive learning significantly outperforms other approaches while also converging faster during pretraining. To facilitate reproducibility and advance sleep research, we will release Stanford Sleep Bench along with pretrained model weights, training pipelines, and evaluation code.
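The abstract does not say which contrastive objective was used. As an illustration only, a widely used choice is the InfoNCE loss, sketched below in plain Python. The function names, the temperature value, and the idea of pairing two embeddings of the same PSG segment are assumptions made for this sketch, not details taken from the paper.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (a small floor avoids division by zero)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, 1e-12) for x in vector]


def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of embedding pairs.

    anchors[i] and positives[i] are assumed to be two embeddings of the same
    segment (hypothetically, two views or two signal modalities); every other
    positives[j] in the batch serves as a negative for anchors[i].
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    losses = []
    for i, anchor in enumerate(a):
        # Cosine similarity of this anchor to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature for cand in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching pair as the correct class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy check: matched pairs should score a much lower loss than mismatched pairs.
aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss falls as each anchor becomes more similar to its own pair than to the other items in the batch, which is the general mechanism by which contrastive pre-training shapes the representations.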
Related papers
- KindSleep: Knowledge-Informed Diagnosis of Obstructive Sleep Apnea from Oximetry [5.901247752047518]
We introduce KindSleep, a deep learning framework that integrates clinical knowledge with single-channel patient-specific oximetry signals and clinical data for precise OSA diagnosis. KindSleep first learns to identify clinically interpretable concepts, such as desaturation indices and respiratory disturbance events, directly from raw oximetry signals. It then fuses these AI-derived concepts with multimodal clinical data to estimate the Apnea-Hypopnea Index (AHI).
arXiv Detail & Related papers (2026-03-05T03:00:34Z)
- Sleep Position Classification using Transfer Learning for Bed-based Pressure Sensors [0.06282171844772422]
Bed-based pressure-sensitive mats (PSMs) offer a non-intrusive way of monitoring patients during sleep. We focus on four-way sleep position classification using data collected from a PSM placed under a mattress in a sleep clinic.
arXiv Detail & Related papers (2025-05-12T22:54:03Z)
- Multimodal Sleep Stage and Sleep Apnea Classification Using Vision Transformer: A Multitask Explainable Learning Approach [1.7765306045990206]
We propose a 1D-Vision Transformer for simultaneous classification of sleep stages and sleep disorders. The proposed method shows an overall accuracy (Cohen's kappa) of 78% (0.66) for five-stage sleep classification and 74% (0.58) for sleep apnea classification.
arXiv Detail & Related papers (2025-02-18T15:48:06Z)
- Day-Night Adaptation: An Innovative Source-free Adaptation Framework for Medical Image Segmentation [51.520294290813865]
We propose a novel adaptation framework called Day-Night Adaptation (DyNA) with insights. During the day, a low-frequency prompt is trained to adapt the frozen model to each test sample. During the night, we reuse test data collected from the day and introduce a global student model to bridge the knowledge between teacher and student models.
arXiv Detail & Related papers (2024-10-17T12:02:29Z)
- What Radio Waves Tell Us about Sleep [34.690382091650314]
We develop an advanced machine learning algorithm for passively monitoring sleep and nocturnal breathing from radio waves reflected off people while asleep.
We show that the model captures the sleep hypnogram (with an accuracy of 81% for 30-second epochs categorized into Wake, Light Sleep, Deep Sleep, or REM) and detects sleep apnea (AUROC = 0.88).
The model uncovers informative interactions between sleep stages and a range of diseases including neurological, psychiatric, cardiovascular, and immunological disorders.
arXiv Detail & Related papers (2024-05-20T02:41:21Z)
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but show an under-representation of non-European descent individuals.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- Clustering and Data Augmentation to Improve Accuracy of Sleep Assessment and Sleep Individuality Analysis [1.9662978733004597]
This study aims to construct a machine learning-based sleep assessment model providing evidence-based assessments, such as poor sleep due to frequent movement during sleep onset.
Extracting sleep sound events, deriving latent representations using VAE, clustering with GMM, and training LSTM for subjective sleep assessment achieved a high accuracy of 94.8% in distinguishing sleep satisfaction.
arXiv Detail & Related papers (2024-04-16T05:56:41Z)
- Sleep Stage Classification Using a Pre-trained Deep Learning Model [0.0]
"EEGMobile" is a machine-learning model that learns from electroencephalogram (EEG) spectrograms of brain signals.
The model achieved an accuracy of 86.97% on a publicly available dataset named "Sleep-EDF20", outperforming other models proposed by different researchers.
arXiv Detail & Related papers (2023-09-12T23:02:19Z)
- Sleep Activity Recognition and Characterization from Multi-Source Passively Sensed Data [67.60224656603823]
Sleep Activity Recognition methods can provide indicators to assess, monitor, and characterize subjects' sleep-wake cycles and detect behavioral changes.
We propose a general method that continuously operates on passively sensed data from smartphones to characterize sleep and identify significant sleep episodes.
Thanks to their ubiquity, these devices constitute an excellent alternative data source to profile subjects' biorhythms in a continuous, objective, and non-invasive manner.
arXiv Detail & Related papers (2023-01-17T15:18:45Z)
- Convolutional Neural Networks for Sleep Stage Scoring on a Two-Channel EEG Signal [63.18666008322476]
Sleep problems are among the major health problems worldwide.
The basic tool used by specialists is the polysomnogram, a collection of different signals recorded during sleep.
Specialists have to score the different signals according to one of the standard guidelines.
arXiv Detail & Related papers (2021-03-30T09:59:56Z)
- MSED: a multi-modal sleep event detection model for clinical sleep analysis [62.997667081978825]
We designed a single deep neural network architecture to jointly detect sleep events in a polysomnogram.
The performance of the model was quantified by F1, precision, and recall scores, and by correlating index values to clinical values.
arXiv Detail & Related papers (2021-01-07T13:08:44Z)
- Automatic detection of microsleep episodes with deep learning [55.41644538483948]
Brief fragments of sleep shorter than 15 s are defined as microsleep episodes (MSEs).
The maintenance of wakefulness test (MWT) is often used in a clinical setting to assess vigilance.
MSEs are mostly not considered in the absence of established scoring criteria defining MSEs.
We aimed for automatic detection of MSEs with machine learning based on raw EEG and EOG data as input.
arXiv Detail & Related papers (2020-09-07T11:38:40Z)