Rene: A Pre-trained Multi-modal Architecture for Auscultation of Respiratory Diseases
- URL: http://arxiv.org/abs/2405.07442v2
- Date: Fri, 7 Jun 2024 00:01:23 GMT
- Title: Rene: A Pre-trained Multi-modal Architecture for Auscultation of Respiratory Diseases
- Authors: Pengfei Zhang, Zhihang Zheng, Shichen Zhang, Minghao Yang, Shaojun Tang,
- Abstract summary: We introduce Rene, a pioneering large-scale model tailored for respiratory sound recognition.
Our innovative approach applies a pre-trained speech recognition model to process respiratory sounds.
We have developed a real-time respiratory sound discrimination system utilizing the Rene architecture.
- Score: 5.810320353233697
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compared with invasive examinations that require tissue sampling, respiratory sound testing is a non-invasive examination method that is safer and easier for patients to accept. In this study, we introduce Rene, a pioneering large-scale model tailored for respiratory sound recognition. Rene has been rigorously fine-tuned with an extensive dataset featuring a broad array of respiratory audio samples, targeting disease detection, sound pattern classification, and event identification. Our innovative approach applies a pre-trained speech recognition model to process respiratory sounds, augmented with patient medical records. The resulting multi-modal deep-learning framework addresses interpretability and real-time diagnostic challenges that have hindered previous respiratory-focused models. Benchmark comparisons reveal that Rene significantly outperforms existing models, achieving improvements of 10.27%, 16.15%, 15.29%, and 18.90% in respiratory event detection and audio classification on the SPRSound database. Disease prediction accuracy on the ICBHI database improved by 23% over the baseline in both mean average and harmonic scores. Moreover, we have developed a real-time respiratory sound discrimination system utilizing the Rene architecture. Employing state-of-the-art Edge AI technology, this system enables rapid and accurate responses for respiratory sound auscultation (https://github.com/zpforlove/Rene).
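The abstract reports gains in "mean average and harmonic scores" without defining them. As a rough illustration only, the sketch below assumes these are the arithmetic and harmonic means of sensitivity and specificity, a convention commonly used with the ICBHI and SPRSound benchmarks. The function names, the `"normal"` label, and the toy labels are hypothetical and are not taken from the Rene repository.

```python
# Illustrative sketch, not the authors' evaluation code.
# Assumption: sensitivity = fraction of abnormal samples given the exact
# correct label; specificity = fraction of normal samples labeled normal.


def sensitivity_specificity(y_true, y_pred, normal_label="normal"):
    """Return (sensitivity, specificity) for paired label sequences."""
    normal_total = normal_hit = 0
    abnormal_total = abnormal_hit = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == normal_label:
            normal_total += 1
            normal_hit += int(pred == truth)
        else:
            abnormal_total += 1
            abnormal_hit += int(pred == truth)
    # Guard against empty groups so the sketch never divides by zero.
    sensitivity = abnormal_hit / abnormal_total if abnormal_total else 0.0
    specificity = normal_hit / normal_total if normal_total else 0.0
    return sensitivity, specificity


def average_score(sensitivity, specificity):
    """Arithmetic mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0


def harmonic_score(sensitivity, specificity):
    """Harmonic mean of sensitivity and specificity (0.0 if both are 0)."""
    total = sensitivity + specificity
    if total == 0:
        return 0.0
    return 2.0 * sensitivity * specificity / total


if __name__ == "__main__":
    # Toy labels, invented purely for demonstration.
    y_true = ["normal", "normal", "crackle", "wheeze", "crackle"]
    y_pred = ["normal", "crackle", "crackle", "wheeze", "normal"]
    se, sp = sensitivity_specificity(y_true, y_pred)
    print(f"sensitivity={se:.3f} specificity={sp:.3f}")
    print(f"average={average_score(se, sp):.3f}")
    print(f"harmonic={harmonic_score(se, sp):.3f}")
```

The harmonic mean penalizes a large gap between sensitivity and specificity more strongly than the arithmetic mean, which is presumably why both are reported together.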
Related papers
- Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking [27.708473070563013]
Respiratory audio has predictive power for a wide range of healthcare applications, yet is currently under-explored.
We introduce OPERA, an OPEn Respiratory Acoustic foundation model pretraining and benchmarking system.
arXiv Detail & Related papers (2024-06-23T16:04:26Z)
- RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification [2.812716452984433]
This paper explores the efficacy of pretrained speech models for respiratory sound classification.
We find that there is a characterization gap between speech and lung sound samples, and to bridge this gap, data augmentation is essential.
We propose RepAugment, an input-agnostic representation-level augmentation technique that outperforms SpecAugment.
arXiv Detail & Related papers (2024-05-05T16:45:46Z)
- Adversarial Fine-tuning using Generated Respiratory Sound to Address Class Imbalance [1.3686993145787067]
We propose a straightforward approach to augment imbalanced respiratory sound data using an audio diffusion model as a conditional neural vocoder.
We also demonstrate a simple yet effective adversarial fine-tuning method to align features between the synthetic and real respiratory sound samples to improve respiratory sound classification performance.
arXiv Detail & Related papers (2023-11-11T05:02:54Z)
- SMRD: SURE-based Robust MRI Reconstruction with Diffusion Models [76.43625653814911]
Diffusion models have gained popularity for accelerated MRI reconstruction due to their high sample quality.
They can effectively serve as rich data priors while incorporating the forward model flexibly at inference time.
We introduce SURE-based MRI Reconstruction with Diffusion models (SMRD) to enhance robustness during testing.
arXiv Detail & Related papers (2023-10-03T05:05:35Z)
- Using BOLD-fMRI to Compute the Respiration Volume per Time (RTV) and Respiration Variation (RV) with Convolutional Neural Networks (CNN) in the Human Connectome Development Cohort [55.41644538483948]
This study proposes a one-dimensional CNN model for reconstruction of two respiratory measures, RV and RVT.
Results show that a CNN can capture informative features from resting BOLD signals and reconstruct realistic RV and RVT timeseries.
arXiv Detail & Related papers (2023-07-03T18:06:36Z)
- Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification [19.180927437627282]
We introduce a novel and effective Patch-Mix Contrastive Learning to distinguish the mixed representations in the latent space.
Our method achieves state-of-the-art performance on the ICBHI dataset, outperforming the prior leading score by an improvement of 4.08%.
arXiv Detail & Related papers (2023-05-23T13:04:07Z)
- Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
- Investigation of Data Augmentation Techniques for Disordered Speech Recognition [69.50670302435174]
This paper investigates a set of data augmentation techniques for disordered speech recognition.
Both normal and disordered speech were exploited in the augmentation process.
The final speaker adapted system constructed using the UASpeech corpus and the best augmentation approach based on speed perturbation produced up to 2.92% absolute word error rate (WER)
arXiv Detail & Related papers (2022-01-14T17:09:22Z)
- Detecting COVID-19 from Breathing and Coughing Sounds using Deep Neural Networks [68.8204255655161]
We adapt an ensemble of Convolutional Neural Networks to classify if a speaker is infected with COVID-19 or not.
Ultimately, it achieves an Unweighted Average Recall (UAR) of 74.9%, or an Area Under ROC Curve (AUC) of 80.7% by ensembling neural networks.
arXiv Detail & Related papers (2020-12-29T01:14:17Z)
- Deep Learning for Automatic Pneumonia Detection [72.55423549641714]
Pneumonia is the leading cause of death among young children and one of the top mortality causes worldwide.
Computer-aided diagnosis systems showed the potential for improving diagnostic accuracy.
We develop the computational approach for pneumonia regions detection based on single-shot detectors, squeeze-and-excitation deep convolution neural networks, augmentations and multi-task learning.
arXiv Detail & Related papers (2020-05-28T10:54:34Z)
- Robust Deep Learning Framework For Predicting Respiratory Anomalies and Diseases [26.786743524562322]
This paper presents a robust deep learning framework developed to detect respiratory diseases from recordings of respiratory sounds.
A back-end deep learning model classifies the features into classes of respiratory disease or anomaly.
Experiments, conducted over the ICBHI benchmark dataset of respiratory sounds, evaluate the ability of the framework to classify sounds.
arXiv Detail & Related papers (2020-01-21T15:26:52Z)