Wearable Accelerometer Foundation Models for Health via Knowledge Distillation
- URL: http://arxiv.org/abs/2412.11276v2
- Date: Fri, 31 Jan 2025 17:35:20 GMT
- Title: Wearable Accelerometer Foundation Models for Health via Knowledge Distillation
- Authors: Salar Abbaspourazad, Anshuman Mishra, Joseph Futoma, Andrew C. Miller, Ian Shapiro,
- Abstract summary: We show that an accelerometry foundation model can predict a wide variety of health targets.
We distill representational knowledge from PPG encoders to accelerometry encoders using 20 million minutes of unlabeled data.
We observe strong cross-modal alignment on unseen data, e.g., 99.2% top-1 accuracy for retrieving PPG embeddings from accelerometry embeddings.
- Score: 2.0472158451829827
- Abstract: Modern wearable devices can conveniently record various biosignals in the many different environments of daily living, enabling a rich view of individual health. However, not all biosignals are the same: high-fidelity biosignals, such as photoplethysmogram (PPG), contain more physiological information, but require optical sensors with a high power footprint. Alternatively, a lower-fidelity biosignal such as accelerometry has a significantly smaller power footprint and is available in almost any wearable device. While accelerometry is widely used for activity recognition and fitness, it is less explored for health biomarkers and diagnosis. Here, we show that an accelerometry foundation model can predict a wide variety of health targets. To achieve improved performance, we distill representational knowledge from PPG encoders to accelerometry encoders using 20 million minutes of unlabeled data, collected from ~172K participants in the Apple Heart and Movement Study under informed consent. We observe strong cross-modal alignment on unseen data, e.g., 99.2% top-1 accuracy for retrieving PPG embeddings from accelerometry embeddings. We show that distilled accelerometry encoders have significantly more informative representations compared to self-supervised or supervised encoders trained directly on accelerometry data, observed by at least 23%-49% improved performance for predicting heart rate and heart rate variability. We also show that distilled accelerometry encoders are readily predictive of a wide array of downstream health targets, i.e., they are generalist foundation models. We believe accelerometry foundation models for health may unlock new opportunities for developing digital biomarkers from any wearable device.
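The cross-modal retrieval metric quoted in the abstract can be illustrated with a minimal sketch. It assumes paired embeddings (row i of each matrix comes from the same signal segment) and cosine similarity as the matching score. The abstract does not state the distillation objective, so the InfoNCE-style contrastive loss below is an assumption shown for illustration only, and the function names, the temperature value, and the synthetic data are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def top1_retrieval_accuracy(query_emb, target_emb):
    """Fraction of queries whose most similar target is their paired row."""
    sim = l2_normalize(query_emb) @ l2_normalize(target_emb).T
    return float(np.mean(np.argmax(sim, axis=1) == np.arange(len(sim))))


def info_nce_loss(student_emb, teacher_emb, temperature=0.1):
    """Assumed contrastive objective: each student row should match its teacher row."""
    logits = (l2_normalize(student_emb) @ l2_normalize(teacher_emb).T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


# Synthetic stand-ins: "teacher" (PPG) embeddings and noisy "student" (accelerometry) ones.
rng = np.random.default_rng(0)
ppg = rng.normal(size=(256, 64))
accel = ppg + 0.1 * rng.normal(size=ppg.shape)

print("top-1 retrieval accuracy:", top1_retrieval_accuracy(accel, ppg))
print("contrastive loss:", info_nce_loss(accel, ppg))
```

In this setup, a well-aligned student gives a low loss and a high retrieval accuracy, while mismatched pairs give the opposite; real embeddings would come from the trained encoders rather than from random draws.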
Related papers
- emg2qwerty: A Large Dataset with Baselines for Touch Typing using Surface Electromyography [47.160223334501126]
emg2qwerty is a large-scale dataset of non-invasive electromyographic signals recorded at the wrists while touch typing on a QWERTY keyboard.
With 1,135 sessions spanning 108 users and 346 hours of recording, this is the largest such public dataset to date.
We show strong baseline performance on predicting key-presses using sEMG signals alone.
arXiv Detail & Related papers (2024-10-26T05:18:48Z)
- Scaling Wearable Foundation Models [54.93979158708164]
We investigate the scaling properties of sensor foundation models across compute, data, and model size.
Using a dataset of up to 40 million hours of in-situ heart rate, heart rate variability, electrodermal activity, accelerometer, skin temperature, and altimeter per-minute data from over 165,000 people, we create LSM.
Our results establish the scaling laws of LSM for tasks such as imputation and extrapolation, both across time and sensor modalities.
arXiv Detail & Related papers (2024-10-17T15:08:21Z)
- Large-scale Training of Foundation Models for Wearable Biosignals [1.8291790356553643]
Tracking biosignals is crucial for monitoring wellness and preempting the development of severe medical conditions.
Despite wearable devices and existing digital biomarkers, the absence of data with labels hinders the development of new biomarkers.
We train foundation models for two common biosignals: photoplethysmography (PPG) and electrocardiogram (ECG).
arXiv Detail & Related papers (2023-12-08T23:44:34Z)
- Remote Bio-Sensing: Open Source Benchmark Framework for Fair Evaluation of rPPG [2.82697733014759]
rPPG (remote photoplethysmography) is a technology that measures and analyzes BVP (Blood Volume Pulse) by using the light absorption characteristics of hemoglobin captured through a camera.
This study provides a framework to evaluate various rPPG benchmarking techniques across a wide range of datasets for fair evaluation and comparison.
arXiv Detail & Related papers (2023-07-24T09:35:47Z)
- A marker-less human motion analysis system for motion-based biomarker discovery in knee disorders [60.99112047564336]
The NHS has had increasing difficulty seeing all low-risk patients, including but not limited to suspected osteoarthritis (OA) patients.
We propose a novel method of automated biomarker identification for diagnosis of knee disorders and the monitoring of treatment progression.
arXiv Detail & Related papers (2023-04-26T16:47:42Z)
- PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation [54.839600943189915]
Mobile Health (mHealth) is the ability to use wearable sensors to monitor participant physiology at high frequencies during daily life to enable temporally-precise health interventions.
Despite a rich imputation literature, existing techniques are ineffective for the pulsative signals which comprise many mHealth applications.
We address this gap with PulseImpute, the first large-scale pulsative signal imputation challenge which includes realistic mHealth missingness models, an extensive set of baselines, and clinically-relevant downstream tasks.
arXiv Detail & Related papers (2022-12-14T21:39:15Z)
- SCAMPS: Synthetics for Camera Measurement of Physiological Signals [17.023803380199492]
We present SCAMPS, a dataset of synthetics containing 2,800 videos (1.68M frames) with aligned cardiac and respiratory signals and facial action intensities.
We provide descriptive statistics about the underlying waveforms, including inter-beat interval, heart rate variability, and pulse arrival time.
arXiv Detail & Related papers (2022-06-08T23:48:41Z)
- Label scarcity in biomedicine: Data-rich latent factor discovery enhances phenotype prediction [102.23901690661916]
Low-dimensional embedding spaces can be derived from the UK Biobank population dataset to enhance data-scarce prediction of health indicators, lifestyle and demographic characteristics.
Performance gains from semi-supervised approaches will probably become an important ingredient for various medical data science applications.
arXiv Detail & Related papers (2021-10-12T16:25:50Z)
- An Accurate Non-accelerometer-based PPG Motion Artifact Removal Technique using CycleGAN [2.6353710888820308]
This paper proposes a low-power, non-accelerometer-based method for removing PPG motion artifacts.
We use a Cycle Generative Adversarial Network (CycleGAN) to reconstruct clean PPG signals from noisy PPG signals.
arXiv Detail & Related papers (2021-06-22T03:00:11Z)
- Learning Generalizable Physiological Representations from Large-scale Wearable Data [12.863826659440026]
We present a novel self-supervised representation learning method using activity and heart rate (HR) signals without semantic labels.
We show that the resulting embeddings can generalize in various downstream tasks through transfer learning with linear classifiers.
Overall, we propose the first multimodal self-supervised method for behavioral and physiological data with implications for large-scale health and lifestyle monitoring.
arXiv Detail & Related papers (2020-11-09T17:56:03Z)
- Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling [121.50704279659253]
We propose a cross-verified feature disentangling strategy to disentangle the physiological features with non-physiological representations.
We then use the distilled physiological features for robust multi-task physiological measurements.
The disentangled features are finally used for the joint prediction of multiple physiological signals like average HR values and rPPG signals.
arXiv Detail & Related papers (2020-07-16T09:39:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.