Related papers: Weighted Temporal Decay Loss for Learning Wearable PPG Data with Sparse Clinical Labels

URL: http://arxiv.org/abs/2602.02917v1
Date: Mon, 02 Feb 2026 23:43:40 GMT
Title: Weighted Temporal Decay Loss for Learning Wearable PPG Data with Sparse Clinical Labels
Authors: Yunsung Chung, Keum San Chun, Migyeong Gwak, Han Feng, Yingshuo Liu, Chanho Lim, Viswam Nathan, Nassir Marrouche, Sharanya Arcot Desai
Abstract summary: Training strategy learns a biomarker-specific decay of sample weight over the time gap between a segment and its ground truth label. On smartwatch PPG from 450 participants across 10 biomarkers, the approach improves over baselines.
Score: 4.73280675105624
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Advances in wearable computing and AI have increased interest in leveraging PPG for health monitoring over the past decade. One of the biggest challenges in developing health algorithms based on such biosignals is the sparsity of clinical labels, which makes biosignals temporally distant from lab draws less reliable for supervision. To address this problem, we introduce a simple training strategy that learns a biomarker-specific decay of sample weight over the time gap between a segment and its ground truth label and uses this weight in the loss with a regularizer to prevent trivial solutions. On smartwatch PPG from 450 participants across 10 biomarkers, the approach improves over baselines. In the subject-wise setting, the proposed approach averages 0.715 AUPRC, compared to 0.674 for a fine-tuned self-supervised baseline and 0.626 for a feature-based Random Forest. A comparison of four decay families shows that a simple linear decay function is most robust on average. Beyond accuracy, the learned decay rates summarize how quickly each biomarker's PPG evidence becomes stale, providing an interpretable view of temporal sensitivity.
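The weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the listing does not give the loss formula, so the linear form `max(0, 1 - decay_rate * gap)`, the regularizer (a penalty on the mean of `1 - weight`, which discourages the trivial solution of driving all weights to zero), and every function and parameter name below are assumptions made for illustration. In actual training the decay rate would presumably be a learnable per-biomarker parameter; here it is a plain float.

```python
import math

def linear_decay_weight(gap_hours, decay_rate):
    # Assumed linear decay family: weight is 1 at zero gap and is clipped at 0.
    return max(0.0, 1.0 - decay_rate * gap_hours)

def binary_cross_entropy(prob, label, eps=1e-7):
    # Standard per-sample binary cross-entropy with probability clipping.
    p = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def weighted_decay_loss(probs, labels, gaps_hours, decay_rate, reg_strength=0.1):
    # Hypothetical loss: weighted mean BCE plus a penalty on discarded weight.
    weights = [linear_decay_weight(g, decay_rate) for g in gaps_hours]
    losses = [binary_cross_entropy(p, y) for p, y in zip(probs, labels)]
    n = len(losses)
    data_term = sum(w * l for w, l in zip(weights, losses)) / n
    # Without this term, a very large decay_rate would zero every weight
    # and make the data term vanish (the trivial solution).
    reg_term = reg_strength * sum(1.0 - w for w in weights) / n
    return data_term + reg_term

# Example: two PPG segments, one 1 hour and one 48 hours from the lab draw.
example_loss = weighted_decay_loss([0.9, 0.2], [1, 0], [1.0, 48.0], decay_rate=0.01)
```

With `decay_rate=0.0` the sketch reduces to ordinary mean cross-entropy, and with a very large `decay_rate` the loss equals `reg_strength`, which is what makes the zero-weight solution unattractive under these assumptions.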

Related papers

Aortic Valve Disease Detection from PPG via Physiology-Informed Self-Supervised Learning [14.821698474716504]
Photoplethysmography has emerged as a promising screening modality for aortic valve disease. The extreme scarcity of gold-standard labeled PPG data severely constrains the effectiveness of data-driven approaches. We propose and validate a new paradigm, Physiology-Guided Self-Supervised Learning (PG-SSL), aimed at unlocking the value of large-scale unlabeled PPG data.
arXiv Detail & Related papers (2026-02-04T06:56:50Z)
Deep Unsupervised Anomaly Detection in Brain Imaging: Large-Scale Benchmarking and Bias Analysis [42.60508892284938]
We present a large-scale, multi-center benchmark of deep unsupervised anomaly detection for brain imaging. We tested 2,221 T1w and 1,262 T2w scans spanning healthy datasets and diverse clinical cohorts. Our benchmark establishes a transparent foundation for future research and highlights priorities for clinical translation.
arXiv Detail & Related papers (2025-12-01T11:03:27Z)
Assessing the Feasibility of Early Cancer Detection Using Routine Laboratory Data: An Evaluation of Machine Learning Approaches on an Imbalanced Dataset [0.02030567625639093]
The development of accessible screening tools for early cancer detection in dogs represents a significant challenge in veterinary medicine. This study assesses the feasibility of cancer risk classification using the Golden Retriever Lifetime Study cohort under real-world constraints. It is concluded that while a statistically detectable cancer signal exists in routine lab data, it is too weak and confounded for clinically reliable discrimination from normal aging or other inflammatory conditions.
arXiv Detail & Related papers (2025-10-23T04:52:42Z)
RareGraph-Synth: Knowledge-Guided Diffusion Models for Generating Privacy-Preserving Synthetic Patient Trajectories in Ultra-Rare Diseases [0.0]
We propose a knowledge-guided, continuous-time diffusion framework that generates trajectories for ultra-rare diseases. RareGraph-Synth unifies five public resources into a heterogeneous knowledge graph comprising approximately 8 M typed edges. Timestamped sequences of lab-code, medication-code, and adverse-event-flag triples that contain no protected health information are produced.
arXiv Detail & Related papers (2025-10-06T03:59:09Z)
Wav2Arrest 2.0: Long-Horizon Cardiac Arrest Prediction with Time-to-Event Modeling, Identity-Invariance, and Pseudo-Lab Alignment [5.706374608871095]
High-frequency physiological waveform modality offers deep, real-time insights into patient status. Recently, physiological foundation models have been shown to predict critical events, including Cardiac Arrest. We offer three improvements to PPG-only CA systems using minimal auxiliary information.
arXiv Detail & Related papers (2025-09-25T23:46:39Z)
Temporal Vegetation Index-Based Unsupervised Crop Stress Detection via Eigenvector-Guided Contrastive Learning [0.0]
EigenCL is an unsupervised contrastive learning framework guided by temporal NDRE dynamics. It is suitable for real-world deployment in data-scarce agricultural environments.
arXiv Detail & Related papers (2025-06-03T21:06:26Z)
Finetuning and Quantization of EEG-Based Foundational BioSignal Models on ECG and PPG Data for Blood Pressure Estimation [46.36100528165335]
Photoplethysmography and electrocardiography can potentially enable continuous blood pressure (BP) monitoring. Yet building accurate and robust machine learning (ML) models remains challenging due to variability in data quality and patient-specific factors. In this work, we investigate whether a model pre-trained on one modality can effectively be exploited to improve the accuracy of a different signal type. Our approach achieves near state-of-the-art accuracy for diastolic BP and surpasses by 1.5x the accuracy of prior works for systolic BP.
arXiv Detail & Related papers (2025-02-10T13:33:12Z)
Synthetic Time Series Data Generation for Healthcare Applications: A PCG Case Study [43.28613210217385]
We employ and compare three state-of-the-art generative models to generate PCG data. Our results demonstrate that the generated PCG data closely resembles the original datasets. In our future work, we plan to incorporate this method into a data augmentation pipeline to synthesize abnormal PCG signals with heart murmurs.
arXiv Detail & Related papers (2024-12-17T18:07:40Z)
Learning to diagnose cirrhosis from radiological and histological labels with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists, to predict the histological score available on a small annex dataset. We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of the cirrhosis. This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z)
Undersampling and Cumulative Class Re-decision Methods to Improve Detection of Agitation in People with Dementia [16.949993123698345]
Agitation is one of the most prevalent symptoms in people with dementia (PwD). In a previous study, we collected multimodal wearable sensor data from 17 participants for 600 days and developed machine learning models for detecting agitation in one-minute windows. In this paper, we first implemented different undersampling methods to eliminate the imbalance problem, and came to the conclusion that only 20% of normal behaviour data were adequate to train a competitive agitation detection model.
arXiv Detail & Related papers (2023-02-07T03:14:00Z)
Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network, for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E). We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z)
Deep Metric Learning with Locality Sensitive Angular Loss for Self-Correcting Source Separation of Neural Spiking Signals [77.34726150561087]
We propose a methodology based on deep metric learning to address the need for automated post-hoc cleaning and robust separation filters. We validate this method with an artificially corrupted label set based on source-separated high-density surface electromyography recordings. This approach enables a neural network to learn to accurately decode neurophysiological time series using any imperfect method of labelling the signal.
arXiv Detail & Related papers (2021-10-13T21:51:56Z)
Multilabel 12-Lead Electrocardiogram Classification Using Gradient Boosting Tree Ensemble [64.29529357862955]
We build an algorithm using gradient boosted tree ensembles fitted on morphology and signal processing features to classify ECG diagnosis. For each lead, we derive features from heart rate variability, PQRST template shape, and the full signal waveform. We join the features of all 12 leads to fit an ensemble of gradient boosting decision trees to predict probabilities of ECG instances belonging to each class.
arXiv Detail & Related papers (2020-10-21T18:11:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.