H-LDM: Hierarchical Latent Diffusion Models for Controllable and Interpretable PCG Synthesis from Clinical Metadata
- URL: http://arxiv.org/abs/2511.14312v1
- Date: Tue, 18 Nov 2025 10:16:22 GMT
- Title: H-LDM: Hierarchical Latent Diffusion Models for Controllable and Interpretable PCG Synthesis from Clinical Metadata
- Authors: Chenyang Xu, Siming Li, Hao Wang,
- Abstract summary: H-LDM is a Hierarchical Latent Diffusion Model for generating clinically accurate and controllable PCG signals from structured metadata.<n>Our approach features a multi-scale VAE that learns a physiologically-disentangled latent space, separating rhythm, heart sounds, and murmurs.<n> Experiments on the PhysioNet CirCor dataset demonstrate state-the-art performance, achieving a Fréchet Audio Distance of 9.7, a 92% attribute disentanglement score, and 87.1% clinical validity confirmed by cardiologists.
- Score: 7.158541967057649
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Phonocardiogram (PCG) analysis is vital for cardiovascular disease diagnosis, yet the scarcity of labeled pathological data hinders the capability of AI systems. To bridge this, we introduce H-LDM, a Hierarchical Latent Diffusion Model for generating clinically accurate and controllable PCG signals from structured metadata. Our approach features: (1) a multi-scale VAE that learns a physiologically-disentangled latent space, separating rhythm, heart sounds, and murmurs; (2) a hierarchical text-to-biosignal pipeline that leverages rich clinical metadata for fine-grained control over 17 distinct conditions; and (3) an interpretable diffusion process guided by a novel Medical Attention module. Experiments on the PhysioNet CirCor dataset demonstrate state-of-the-art performance, achieving a Fréchet Audio Distance of 9.7, a 92% attribute disentanglement score, and 87.1% clinical validity confirmed by cardiologists. Augmenting diagnostic models with our synthetic data improves the accuracy of rare disease classification by 11.3\%. H-LDM establishes a new direction for data augmentation in cardiac diagnostics, bridging data scarcity with interpretable clinical insights.
Related papers
- A WDLoRA-Based Multimodal Generative Framework for Clinically Guided Corneal Confocal Microscopy Image Synthesis in Diabetic Neuropathy [8.701084151107652]
Corneal Confocal Microscopy is a sensitive tool for assessing small-fiber damage in Diabetic Peripheral Neuropathy (DPN)<n>Development of robust, automated deep learning-based diagnostic models is limited by scarce labelled data and fine-grained variability in corneal nerve morphology.<n>We propose a Weight-Decomposed Low-Rank Adaptation (WDLoRA)-based multimodal generative framework for clinically guided CCM image synthesis.
arXiv Detail & Related papers (2026-02-14T09:32:44Z) - Lightweight Classifier for Detecting Intracranial Hemorrhage in Ultrasound Data [0.5461938536945722]
Intracranial hemorrhage (ICH) secondary to Traumatic Brain Injury (TBI) represents a critical diagnostic challenge.<n>Current diagnostic modalities including Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) have significant limitations.<n>This study investigates machine learning approaches for automated ICH detection using Ultrasound Tissue Pulsatility Imaging (TPI)
arXiv Detail & Related papers (2025-10-22T09:04:42Z) - Automated Labeling of Intracranial Arteries with Uncertainty Quantification Using Deep Learning [2.6279333406008476]
We present a deep learning-based framework for automated artery labeling from 3D Time-of-Flight Magnetic Resonance Angiography (3D ToF-MRA)<n>Our framework offers a scalable, accurate, and uncertainty-aware solution for automated cerebrovascular labeling, supporting downstream hemodynamic analysis and facilitating clinical integration.
arXiv Detail & Related papers (2025-09-22T12:57:21Z) - LGE-Guided Cross-Modality Contrastive Learning for Gadolinium-Free Cardiomyopathy Screening in Cine CMR [51.11296719862485]
We propose a Contrastive Learning and Cross-Modal alignment framework for gadolinium-free cardiomyopathy screening using cine CMR sequences.<n>By aligning the latent spaces of cine CMR and Late Gadolinium Enhancement (LGE) sequences, our model encodes fibrosis-specific pathology into cine CMR embeddings.
arXiv Detail & Related papers (2025-08-23T07:21:23Z) - Organ-Agents: Virtual Human Physiology Simulator via LLMs [66.40796430669158]
Organ-Agents is a multi-agent framework that simulates human physiology via LLM-driven agents.<n>We curated data from 7,134 sepsis patients and 7,895 controls, generating high-resolution trajectories across 9 systems and 125 variables.<n>Organ-Agents achieved high simulation accuracy on 4,509 held-out patients, with per-system MSEs 0.16 and robustness across SOFA-based severity strata.
arXiv Detail & Related papers (2025-08-20T01:58:45Z) - Acoustic Index: A Novel AI-Driven Parameter for Cardiac Disease Risk Stratification Using Echocardiography [0.0]
We introduce the Acoustic Index, a novel AI-derived echocardiographic parameter designed to quantify cardiac dysfunction from standard ultrasound views.<n>The model combines Extended Dynamic Mode Decomposition (EDMD) based on Koopman operator theory with a hybrid neural network that incorporates clinical metadata.<n>In a prospective cohort of 736 patients, encompassing various cardiac pathologies and normal controls, the Acoustic Index achieved an area under the curve (AUC) of 0.89 in an independent test set.<n>Cross-validation across five folds confirmed the robustness of the model, showing that both sensitivity and specificity exceeded 0.8 when evaluated on independent data.
arXiv Detail & Related papers (2025-07-17T21:27:28Z) - Explainable Parallel CNN-LSTM Model for Differentiating Ventricular Tachycardia from Supraventricular Tachycardia with Aberrancy in 12-Lead ECGs [4.263117296632119]
We propose a computationally efficient deep learning solution to improve diagnostic accuracy and provide model interpretability for clinical deployment.<n>A novel lightweight parallel deep architecture is introduced. Each pipeline processes individual ECG leads using two 1D-CNN blocks to extract local features.<n>The model achieved $95.63%$ accuracy ($95%$ CI: $93.07-98.19%$), with sensitivity=$95.10%$, specificity=$96.06%$, and F1-score=$95.12%$.
arXiv Detail & Related papers (2025-07-14T12:12:34Z) - Finetuning and Quantization of EEG-Based Foundational BioSignal Models on ECG and PPG Data for Blood Pressure Estimation [46.36100528165335]
Photoplethysmography and electrocardiography can potentially enable continuous blood pressure (BP) monitoring.<n>Yet accurate and robust machine learning (ML) models remains challenging due to variability in data quality and patient-specific factors.<n>In this work, we investigate whether a model pre-trained on one modality can effectively be exploited to improve the accuracy of a different signal type.<n>Our approach achieves near state-of-the-art accuracy for diastolic BP and surpasses by 1.5x the accuracy of prior works for systolic BP.
arXiv Detail & Related papers (2025-02-10T13:33:12Z) - AXIAL: Attention-based eXplainability for Interpretable Alzheimer's Localized Diagnosis using 2D CNNs on 3D MRI brain scans [43.06293430764841]
This study presents an innovative method for Alzheimer's disease diagnosis using 3D MRI designed to enhance the explainability of model decisions.
Our approach adopts a soft attention mechanism, enabling 2D CNNs to extract volumetric representations.
With voxel-level precision, our method identified which specific areas are being paid attention to, identifying these predominant brain regions.
arXiv Detail & Related papers (2024-07-02T16:44:00Z) - Respiratory Disease Classification and Biometric Analysis Using Biosignals from Digital Stethoscopes [3.2458203725405976]
This work presents a novel approach leveraging digital stethoscope technology for automatic respiratory disease classification and biometric analysis.
By leveraging one of the largest publicly available medical database of respiratory sounds, we train machine learning models to classify various respiratory health conditions.
Our approach achieves high accuracy in both binary classification (89% balanced accuracy for healthy vs. diseased) and multi-class classification (72% balanced accuracy for specific diseases like pneumonia and COPD)
arXiv Detail & Related papers (2023-09-12T23:54:00Z) - Learning to diagnose cirrhosis from radiological and histological labels
with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists, to predict the histological score available on a small annex dataset.
We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of the cirrhosis.
This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z) - G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for
Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z) - Identification of Ischemic Heart Disease by using machine learning
technique based on parameters measuring Heart Rate Variability [50.591267188664666]
In this study, 18 non-invasive features (age, gender, left ventricular ejection fraction and 15 obtained from HRV) of 243 subjects were used to train and validate a series of several ANN.
The best result was obtained using 7 input parameters and 7 hidden nodes with an accuracy of 98.9% and 82% for the training and validation dataset.
arXiv Detail & Related papers (2020-10-29T19:14:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.