SS-DPPN: A self-supervised dual-path foundation model for the generalizable cardiac audio representation
- URL: http://arxiv.org/abs/2510.10719v1
- Date: Sun, 12 Oct 2025 17:43:57 GMT
- Title: SS-DPPN: A self-supervised dual-path foundation model for the generalizable cardiac audio representation
- Authors: Ummy Maria Muna, Md Mehedi Hasan Shawon, Md Jobayer, Sumaiya Akter, Md Rakibul Hasan, Md. Golam Rabiul Alam
- Abstract summary: Self-Supervised Dual-Path Prototypical Network (SS-DPPN) is a foundation model for cardiac audio representation and classification from unlabeled data. SS-DPPN achieves state-of-the-art performance on four cardiac audio benchmarks.
- Score: 2.013977297550879
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The automated analysis of phonocardiograms is vital for the early diagnosis of cardiovascular disease, yet supervised deep learning is often constrained by the scarcity of expert-annotated data. In this paper, we propose the Self-Supervised Dual-Path Prototypical Network (SS-DPPN), a foundation model for cardiac audio representation and classification from unlabeled data. The framework introduces a dual-path, contrastive-learning-based architecture that simultaneously processes 1D waveforms and 2D spectrograms using a novel hybrid loss. For the downstream task, a metric-learning approach based on a Prototypical Network is used, which enhances sensitivity and produces well-calibrated and trustworthy predictions. SS-DPPN achieves state-of-the-art performance on four cardiac audio benchmarks. The framework demonstrates exceptional data efficiency relative to a fully supervised model, with a three-fold reduction in labeled data. Finally, the learned representations generalize successfully across lung sound classification and heart rate estimation. Our experiments and findings validate SS-DPPN as a robust, reliable, and scalable foundation model for physiological signals.
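The abstract names a Prototypical Network as the downstream classifier but gives no implementation details. As a rough illustration of the general technique only (not the authors' code), the sketch below averages the support embeddings of each class into a prototype and scores a query with a softmax over negative squared Euclidean distances. The toy 2-D embeddings, the class names `normal` and `abnormal`, and all function names are assumptions made for the example.

```python
import math
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Average the support embeddings of each class into one prototype."""
    grouped = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_proba(query, prototypes):
    """Softmax over negative squared distances to each class prototype."""
    logits = {
        label: -squared_distance(query, proto)
        for label, proto in prototypes.items()
    }
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


# Hypothetical 2-D embeddings standing in for encoder outputs.
support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
labels = ["normal", "normal", "abnormal", "abnormal"]

prototypes = class_prototypes(support, labels)
probs = predict_proba([0.9, 0.9], prototypes)
prediction = max(probs, key=probs.get)
print(prediction, probs)
```

In a real pipeline the embeddings would come from the pretrained encoder rather than hand-written lists; the distance metric and any temperature scaling are design choices the abstract does not specify.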
Related papers
- BEAT-Net: Injecting Biomimetic Spatio-Temporal Priors for Interpretable ECG Classification [1.3909285316906435]
BEAT-Net is a Biomimetic ECG Analysis with Tokenization framework. It decomposes cardiac physiology through specialized encoders that extract local beat morphology. It exhibits exceptional data efficiency, recovering fully supervised performance using only 30 to 35 percent of annotated data.
arXiv Detail & Related papers (2026-01-12T08:37:47Z) - FDP: A Frequency-Decomposition Preprocessing Pipeline for Unsupervised Anomaly Detection in Brain MRI [44.4791295950757]
We develop an unsupervised anomaly detection (UAD) approach for brain MRI. We conduct the first systematic frequency-domain analysis of pathological signatures. We show that the Frequency-Decomposition Preprocessing (FDP) framework can leverage frequency-domain reconstruction for simultaneous pathology suppression and anatomical preservation.
arXiv Detail & Related papers (2025-11-17T02:40:14Z) - Enhancing ECG Classification Robustness with Lightweight Unsupervised Anomaly Detection Filters [39.9470953186283]
Continuous electrocardiogram (ECG) monitoring via wearables offers significant potential for early cardiovascular disease (CVD) detection. Deploying deep learning models for automated analysis in resource-constrained environments faces reliability challenges due to Out-of-Distribution data. This paper explores Unsupervised Anomaly Detection (UAD) as an independent, upstream filtering mechanism to improve robustness.
arXiv Detail & Related papers (2025-10-30T13:54:37Z) - WaveNet's Precision in EEG Classification [1.0885910878567457]
This study introduces a WaveNet-based deep learning model designed to automate the classification of EEG signals into physiological, pathological, artifact, and noise categories. The model was trained, validated, and tested on 209,232 samples with a 70/20/10 percent split. WaveNet's architecture, originally developed for raw audio synthesis, is well suited for EEG data due to its use of dilated causal convolutions and residual connections.
arXiv Detail & Related papers (2025-10-10T09:21:21Z) - Synthetic Time Series Data Generation for Healthcare Applications: A PCG Case Study [43.28613210217385]
We employ and compare three state-of-the-art generative models to generate PCG data. Our results demonstrate that the generated PCG data closely resembles the original datasets. In our future work, we plan to incorporate this method into a data augmentation pipeline to synthesize abnormal PCG signals with heart murmurs.
arXiv Detail & Related papers (2024-12-17T18:07:40Z) - HeartBERT: A Self-Supervised ECG Embedding Model for Efficient and Effective Medical Signal Analysis [0.0]
HeartBERT is inspired by Bidirectional Representations from Transformers (BERT) in natural language processing and enhanced with a self-supervised learning approach. To demonstrate the versatility, generalizability, and efficiency of the proposed model, two key downstream tasks have been selected: sleep stage detection and heartbeat classification. A series of practical experiments have been conducted to demonstrate the superiority and advancements of HeartBERT.
arXiv Detail & Related papers (2024-11-08T14:25:00Z) - CTPD: Cross-Modal Temporal Pattern Discovery for Enhanced Multimodal Electronic Health Records Analysis [46.56667527672019]
We introduce a Cross-Modal Temporal Pattern Discovery (CTPD) framework, designed to efficiently extract meaningful cross-modal temporal patterns from multimodal EHR data. Our approach introduces shared initial temporal pattern representations which are refined using slot attention to generate temporal semantic embeddings.
arXiv Detail & Related papers (2024-11-01T15:54:07Z) - Improving Diffusion Models for ECG Imputation with an Augmented Template Prior [43.6099225257178]
Noisy and poor-quality recordings are a major issue for signals collected using mobile health systems. Recent studies have explored the imputation of missing values in ECG with probabilistic time-series models. We present a template-guided denoising diffusion probabilistic model (DDPM), PulseDiff, which is conditioned on an informative prior for a range of health conditions.
arXiv Detail & Related papers (2023-10-24T11:34:15Z) - Extraction of volumetric indices from echocardiography: which deep learning solution for clinical use? [6.144041824426555]
We show that the proposed 3D nnU-Net outperforms alternative 2D and recurrent segmentation methods. Overall, the experimental results suggest that with sufficient training data, 3D nnU-Net could become the first automated tool to meet the standards of an everyday clinical device.
arXiv Detail & Related papers (2023-05-03T09:38:52Z) - Generalizing electrocardiogram delineation: training convolutional neural networks with synthetic data augmentation [63.51064808536065]
Existing databases for ECG delineation are small, being insufficient in size and in the array of pathological conditions they represent. This article has two main contributions. First, a pseudo-synthetic data generation algorithm was developed, based on probabilistically composing ECG traces given "pools" of fundamental segments, as cropped from the original databases, and a set of rules for their arrangement into coherent synthetic traces. Second, two novel segmentation-based loss functions have been developed, which attempt to enforce the prediction of an exact number of independent structures and to produce closer segmentation boundaries by focusing on a reduced number of samples.
arXiv Detail & Related papers (2021-11-25T10:11:41Z) - Multi-Lead ECG Classification via an Information-Based Attention Convolutional Neural Network [1.1720399305661802]
One-dimensional convolutional neural networks (CNN) have proven to be effective in pervasive classification tasks. We implement the Residual connection and design a structure which can learn the weights from the information contained in different channels in the input feature map. An indicator named mean square deviation is introduced to monitor the performance of a particular model segment in the classification task.
arXiv Detail & Related papers (2020-03-25T02:28:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.