Exploring Finetuned Audio-LLM on Heart Murmur Features
- URL: http://arxiv.org/abs/2501.13884v1
- Date: Thu, 23 Jan 2025 17:57:18 GMT
- Title: Exploring Finetuned Audio-LLM on Heart Murmur Features
- Authors: Adrian Florea, Xilin Jiang, Nima Mesgarani, Xiaofan Jiang
- Abstract summary: Large language models (LLMs) for audio have excelled in recognizing and analyzing human speech, music, and environmental sounds.
In this study, we focus on diagnosing cardiovascular diseases using phonocardiograms, i.e., heart sounds.
- Score: 13.529024158003233
- Abstract: Large language models (LLMs) for audio have excelled in recognizing and analyzing human speech, music, and environmental sounds. However, their potential for understanding other types of sounds, particularly biomedical sounds, remains largely underexplored despite significant scientific interest. In this study, we focus on diagnosing cardiovascular diseases using phonocardiograms, i.e., heart sounds. Most existing deep neural network (DNN) paradigms are restricted to heart murmur classification (healthy vs. unhealthy) and do not predict other acoustic features of the murmur such as timing, grading, harshness, pitch, and quality, which are important in helping physicians diagnose the underlying heart conditions. We propose to finetune an audio LLM, Qwen2-Audio, on the PhysioNet CirCor DigiScope phonocardiogram (PCG) dataset and evaluate its performance in classifying 11 expert-labeled murmur features. Additionally, we aim to achieve a more noise-robust and generalizable system by exploring a preprocessing segmentation algorithm using an audio representation model, SSAMBA. Our results indicate that the LLM-based model outperforms state-of-the-art methods in 8 of the 11 features and performs comparably in the remaining 3. Moreover, the LLM successfully classifies long-tail murmur features with limited training data, a task at which all previous methods have failed. These findings underscore the potential of audio LLMs as assistants to human cardiologists in enhancing heart disease diagnosis.
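To make the described setup more concrete, the sketch below shows how a phonocardiogram clip could be passed to Qwen2-Audio through its Hugging Face transformers interface and prompted for a single murmur feature. The checkpoint name, file path, prompt wording, and answer options are illustrative assumptions; this is not the paper's finetuning or evaluation code.

```python
# Minimal sketch (not the authors' code): prompt Qwen2-Audio with a PCG clip and
# ask for one murmur feature. Finetuning on CirCor DigiScope labels would sit on
# top of this interface with the usual supervised recipe.
import librosa
from transformers import AutoProcessor, Qwen2AudioForConditionalGeneration

MODEL_ID = "Qwen/Qwen2-Audio-7B-Instruct"   # public checkpoint; placeholder for the finetuned model
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = Qwen2AudioForConditionalGeneration.from_pretrained(MODEL_ID, device_map="auto")

# Load a heart-sound recording at the sampling rate the feature extractor expects.
PCG_PATH = "pcg_example.wav"                 # placeholder path to a PCG recording
audio, _ = librosa.load(PCG_PATH, sr=processor.feature_extractor.sampling_rate)

# Chat-style prompt asking about one of the 11 murmur features (here: grading).
conversation = [
    {"role": "user", "content": [
        {"type": "audio", "audio_url": PCG_PATH},
        {"type": "text", "text": "Listen to this phonocardiogram. "
                                 "What is the murmur grading: I/VI, II/VI, or III/VI?"},
    ]},
]
text = processor.apply_chat_template(conversation, add_generation_prompt=True, tokenize=False)
inputs = processor(text=text, audios=[audio], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=32)
answer = processor.batch_decode(output_ids[:, inputs.input_ids.shape[1]:],
                                skip_special_tokens=True)[0]
print(answer)
```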
Related papers
- Model-driven Heart Rate Estimation and Heart Murmur Detection based on Phonocardiogram [4.5546756241897235]
This study utilizes a publicly available phonocardiogram (PCG) dataset to estimate heart rate.
We extend the best-performing model to a multi-task learning framework for simultaneous heart rate estimation and murmur detection.
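For orientation, here is a minimal PyTorch sketch of such a multi-task setup: one shared encoder over PCG features with a regression head for heart rate and a classification head for murmur detection. The backbone, feature shapes, and loss weighting are assumptions for illustration, not the cited paper's implementation.

```python
# Generic multi-task sketch (assumed structure, not the cited paper's code):
# shared encoder, heart-rate regression head, murmur classification head.
import torch
import torch.nn as nn

class MultiTaskPCG(nn.Module):
    def __init__(self, n_feats=64, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(n_feats, hidden, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.hr_head = nn.Linear(hidden, 1)        # heart rate (beats per minute)
        self.murmur_head = nn.Linear(hidden, 2)    # murmur present / absent

    def forward(self, x):                          # x: (batch, n_feats, time)
        z = self.encoder(x)
        return self.hr_head(z).squeeze(-1), self.murmur_head(z)

model = MultiTaskPCG()
hr_pred, murmur_logits = model(torch.randn(8, 64, 500))
loss = nn.functional.mse_loss(hr_pred, torch.full((8,), 70.0)) \
     + nn.functional.cross_entropy(murmur_logits, torch.randint(0, 2, (8,)))
loss.backward()
```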
arXiv Detail & Related papers (2024-07-25T22:56:21Z)
- Heart Sound Segmentation Using Deep Learning Techniques [0.0]
This paper presents a novel approach for heart sound segmentation and classification into S1 (LUB) and S2 (DUB) sounds.
We employ FFT-based filtering, dynamic programming for event detection, and a Siamese network for robust classification.
Our method demonstrates superior performance on the PASCAL heart sound dataset compared to existing approaches.
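As context for the FFT-based filtering step mentioned above, the sketch below band-limits a heart-sound signal in the frequency domain with NumPy; the cut-off frequencies are assumptions, and the follow-on dynamic-programming event detection and Siamese classification stages are not reproduced here.

```python
# Sketch of FFT-based band-pass filtering for a heart-sound signal
# (assumed cut-offs; not the cited paper's exact preprocessing).
import numpy as np

def fft_bandpass(signal, sr, low_hz=20.0, high_hz=200.0):
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    mask = (freqs >= low_hz) & (freqs <= high_hz)
    return np.fft.irfft(spectrum * mask, n=len(signal))

sr = 2000                                     # typical PCG sampling rate
t = np.arange(0, 3.0, 1.0 / sr)
noisy = np.sin(2 * np.pi * 50 * t) + 0.5 * np.random.randn(len(t))
clean = fft_bandpass(noisy, sr)               # event detection / Siamese matching would follow
```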
arXiv Detail & Related papers (2024-06-09T05:30:05Z)
- Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding [53.629132242389716]
Vision-Language Models (VLMs) can support clinicians by analyzing medical images and engaging in natural language interactions.
VLMs often exhibit "hallucinogenic" behavior, generating textual outputs not grounded in contextual multimodal information.
We propose a new alignment algorithm that uses symbolic representations of clinical reasoning to ground VLMs in medical knowledge.
arXiv Detail & Related papers (2024-05-29T23:19:28Z)
- A Generalizable Deep Learning System for Cardiac MRI [29.429744474335347]
We describe a foundational vision system for cardiac MRI, capable of representing the breadth of human cardiovascular disease and health.
Our deep learning model is trained via self-supervised contrastive learning, by which visual concepts in cine-sequence cardiac MRI scans are learned from the raw text of the accompanying radiology reports.
We show that our deep learning system is capable not only of understanding the staggering complexity of human cardiovascular disease, but also of being directed towards clinical problems of interest, yielding impressive, clinical-grade diagnostic accuracy with a fraction of the training data typically required for such tasks.
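The report-supervised contrastive training mentioned here is reminiscent of CLIP-style image-text alignment; the snippet below sketches a symmetric contrastive loss over paired embeddings. It illustrates the general technique only and is not the cited system.

```python
# Sketch of a CLIP-style symmetric contrastive loss over paired image/text
# embeddings (general technique, not the cited system's training code).
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature          # (batch, batch) similarity matrix
    targets = torch.arange(img.size(0), device=img.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = contrastive_loss(torch.randn(16, 512), torch.randn(16, 512))
```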
arXiv Detail & Related papers (2023-12-01T05:27:29Z)
- Show from Tell: Audio-Visual Modelling in Clinical Settings [58.88175583465277]
We consider audio-visual modelling in a clinical setting, providing a solution to learn medical representations without human expert annotation.
A simple yet effective multi-modal self-supervised learning framework is proposed for this purpose.
The proposed approach is able to localise anatomical regions of interest during ultrasound imaging, with only speech audio as a reference.
arXiv Detail & Related papers (2023-10-25T08:55:48Z)
- Deep Feature Learning for Medical Acoustics [78.56998585396421]
The purpose of this paper is to compare different learnables in medical acoustics tasks.
A framework has been implemented to classify human respiratory sounds and heartbeats into two categories, i.e., healthy or affected by pathologies.
arXiv Detail & Related papers (2022-08-05T10:39:37Z)
- MyoPS: A Benchmark of Myocardial Pathology Segmentation Combining Three-Sequence Cardiac Magnetic Resonance Images [84.02849948202116]
This work defines a new task of medical image analysis, i.e., to perform myocardial pathology segmentation (MyoPS).
MyoPS combines three-sequence cardiac magnetic resonance (CMR) images and was first proposed in the MyoPS challenge, held in conjunction with MICCAI 2020.
The challenge provided 45 paired and pre-aligned CMR images, allowing algorithms to combine the complementary information from the three CMR sequences for pathology segmentation.
arXiv Detail & Related papers (2022-01-10T06:37:23Z)
- Segmentation-free Heart Pathology Detection Using Deep Learning [12.065014651638943]
We propose a novel segmentation-free heart sound classification method.
Specifically, we apply discrete wavelet transform to denoise the signal, followed by feature extraction and feature reduction.
Support Vector Machines and Deep Neural Networks are utilised for classification.
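To make that pipeline concrete, the sketch below denoises a signal with a discrete wavelet transform (PyWavelets) and feeds simple statistical features to an SVM. The wavelet family, thresholding rule, and feature set are assumptions for illustration, not the cited paper's exact choices.

```python
# Sketch: DWT denoising followed by an SVM classifier (assumed wavelet,
# threshold rule, and features; not the cited paper's exact pipeline).
import numpy as np
import pywt
from sklearn.svm import SVC

def dwt_denoise(signal, wavelet="db4", level=4):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745            # noise estimate from finest scale
    thr = sigma * np.sqrt(2 * np.log(len(signal)))
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(signal)]

def simple_features(signal):
    return [np.mean(signal), np.std(signal), np.max(np.abs(signal))]

X = np.array([simple_features(dwt_denoise(np.random.randn(4000))) for _ in range(20)])
y = np.random.randint(0, 2, size=20)                          # healthy vs. pathological labels
clf = SVC(kernel="rbf").fit(X, y)
```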
arXiv Detail & Related papers (2021-08-09T16:09:30Z)
- A Visual Domain Transfer Learning Approach for Heartbeat Sound Classification [0.0]
Heart disease is the most common cause of human mortality, accounting for almost one-third of deaths worldwide.
Detecting the disease early increases the patient's chances of survival, and there are several ways in which signs of heart disease can be detected early.
This research proposes converting cleansed and normalized heart sounds into visual mel-scale spectrograms and then using visual domain transfer learning approaches to automatically extract features and categorize heart sounds.
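As a concrete picture of the spectrogram-plus-transfer-learning recipe, the sketch below turns a heart-sound signal into a log-mel spectrogram with librosa and feeds it to an ImageNet-pretrained ResNet with a replaced classification head. All parameter values and the choice of backbone are illustrative, not the cited paper's settings.

```python
# Sketch: heart sound -> log-mel spectrogram -> ImageNet-pretrained CNN with a
# new head (illustrative parameters, not the cited paper's configuration).
import librosa
import numpy as np
import torch
import torch.nn as nn
from torchvision.models import resnet18

sr = 2000
audio = np.random.randn(5 * sr).astype(np.float32)     # stand-in for a cleaned heart-sound clip
mel = librosa.feature.melspectrogram(y=audio, sr=sr, n_mels=64)
logmel = librosa.power_to_db(mel)                       # shape: (64 mel bands, time frames)

# Replicate the single-channel spectrogram across 3 channels to match RGB-pretrained weights.
x = torch.tensor(logmel, dtype=torch.float32)[None, None].repeat(1, 3, 1, 1)

model = resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 2)           # e.g. normal vs. murmur
logits = model(x)                                       # finetune with cross-entropy in practice
```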
arXiv Detail & Related papers (2021-07-28T09:41:38Z)
- Noise-Resilient Automatic Interpretation of Holter ECG Recordings [67.59562181136491]
We present a three-stage process for analysing Holter recordings with robustness to noisy signals.
First stage is a segmentation neural network (NN) with an encoder-decoder architecture which detects the positions of heartbeats.
Second stage is a classification NN which classifies heartbeats as wide or narrow.
Third stage is a gradient boosting decision trees (GBDT) model on top of the NN features that incorporates patient-wise features.
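The third-stage idea of boosting over network features combined with patient-level features can be sketched as below; the feature dimensions and the scikit-learn GBDT are stand-ins for whatever the cited pipeline actually uses.

```python
# Sketch: gradient-boosted trees stacked on neural-network embeddings plus
# patient-wise features (stand-in dimensions and model, not the cited pipeline).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

n_beats = 200
nn_embeddings = np.random.randn(n_beats, 32)            # per-beat features from the classification NN
patient_feats = np.random.randn(n_beats, 4)             # e.g. age, mean RR interval, noise level
X = np.hstack([nn_embeddings, patient_feats])
y = np.random.randint(0, 2, size=n_beats)               # wide vs. narrow beat labels

gbdt = GradientBoostingClassifier(n_estimators=200, max_depth=3).fit(X, y)
```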
arXiv Detail & Related papers (2020-11-17T16:15:49Z)
- A Global Benchmark of Algorithms for Segmenting Late Gadolinium-Enhanced Cardiac Magnetic Resonance Imaging [90.29017019187282]
" 2018 Left Atrium Challenge" using 154 3D LGE-MRIs, currently the world's largest cardiac LGE-MRI dataset.
Analyse of the submitted algorithms using technical and biological metrics was performed.
Results show the top method achieved a dice score of 93.2% and a mean surface to a surface distance of 0.7 mm.
arXiv Detail & Related papers (2020-04-26T08:49:17Z)