Adversarial Fine-tuning using Generated Respiratory Sound to Address
Class Imbalance
- URL: http://arxiv.org/abs/2311.06480v1
- Date: Sat, 11 Nov 2023 05:02:54 GMT
- Title: Adversarial Fine-tuning using Generated Respiratory Sound to Address
Class Imbalance
- Authors: June-Woo Kim, Chihyeon Yoon, Miika Toikkanen, Sangmin Bae, Ho-Young
Jung
- Abstract summary: We propose a straightforward approach to augment imbalanced respiratory sound data using an audio diffusion model as a conditional neural vocoder.
We also demonstrate a simple yet effective adversarial fine-tuning method to align features between the synthetic and real respiratory sound samples to improve respiratory sound classification performance.
- Score: 1.3686993145787067
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep generative models have emerged as a promising approach in the medical
image domain to address data scarcity. However, their use for sequential data
like respiratory sounds is less explored. In this work, we propose a
straightforward approach to augment imbalanced respiratory sound data using an
audio diffusion model as a conditional neural vocoder. We also demonstrate a
simple yet effective adversarial fine-tuning method to align features between
the synthetic and real respiratory sound samples to improve respiratory sound
classification performance. Our experimental results on the ICBHI dataset
demonstrate that the proposed adversarial fine-tuning is effective, whereas using
only the conventional augmentation method leads to performance degradation.
Moreover, our method outperforms the baseline by 2.24% on the ICBHI Score and
improves the accuracy of the minority classes by up to 26.58%. For the
supplementary material, we provide the code at
https://github.com/kaen2891/adversarial_fine-tuning_using_generated_respiratory_sound.
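To make the adversarial fine-tuning idea concrete, below is a minimal PyTorch sketch of one plausible reading of the abstract: a shared encoder is fine-tuned on batches that mix real and diffusion-generated samples, a small discriminator tries to tell the two apart from the encoder's features, and a gradient-reversal layer drives the encoder to make them indistinguishable while the classification loss is optimized. All names, shapes, and the loss weighting are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: the encoder, classifier, discriminator, and
# optimizer are assumed to be user-defined PyTorch objects, and gradient
# reversal is one common way to realize adversarial feature alignment.
import torch
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; reverses and scales gradients in backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


def adversarial_finetune_step(encoder, classifier, discriminator, optimizer,
                              wav, labels, is_synthetic, lambd=1.0):
    """One fine-tuning step on a batch mixing real and generated respiratory sounds.

    wav:          (batch, num_samples) audio, or spectrograms if the encoder expects them
    labels:       (batch,) respiratory-sound class labels
    is_synthetic: (batch,) float tensor, 1.0 for diffusion-generated samples
    """
    feats = encoder(wav)                                   # (batch, feat_dim)
    cls_loss = F.cross_entropy(classifier(feats), labels)  # main task loss

    # The discriminator sees gradient-reversed features, so minimizing its
    # real-vs-synthetic loss pushes the encoder toward aligned features.
    domain_logits = discriminator(GradReverse.apply(feats, lambd)).squeeze(-1)
    adv_loss = F.binary_cross_entropy_with_logits(domain_logits, is_synthetic)

    loss = cls_loss + adv_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return cls_loss.item(), adv_loss.item()
```

Here a single optimizer updates all three modules jointly, and lambd weighs how strongly real and synthetic features are pulled together; whether the authors use gradient reversal, a separate discriminator optimizer, or a different loss weighting is not specified in the abstract, so consult the linked repository for the actual implementation.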
Related papers
- Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information [75.36597470578724]
Adversarial purification is one of the promising approaches to defend neural networks against adversarial attacks.
We propose the gUided Purification (COUP) algorithm, which purifies adversarial examples while keeping them away from the classifier's decision boundary.
Experimental results show that COUP can achieve better adversarial robustness under strong attack methods.
arXiv Detail & Related papers (2024-08-12T02:48:00Z)
- BTS: Bridging Text and Sound Modalities for Metadata-Aided Respiratory Sound Classification [0.0]
We fine-tune a pretrained text-audio multimodal model using free-text descriptions derived from the sound samples' metadata.
Our method achieves state-of-the-art performance on the ICBHI dataset, surpassing the previous best result by a notable margin of 1.17%.
arXiv Detail & Related papers (2024-06-10T20:49:54Z)
- Rene: A Pre-trained Multi-modal Architecture for Auscultation of Respiratory Diseases [5.810320353233697]
We introduce Rene, a pioneering large-scale model tailored for respiratory sound recognition.
Our innovative approach applies a pre-trained speech recognition model to process respiratory sounds.
We have developed a real-time respiratory sound discrimination system utilizing the Rene architecture.
arXiv Detail & Related papers (2024-05-13T03:00:28Z)
- RepAugment: Input-Agnostic Representation-Level Augmentation for Respiratory Sound Classification [2.812716452984433]
This paper explores the efficacy of pretrained speech models for respiratory sound classification.
We find that there is a characterization gap between speech and lung sound samples, and to bridge this gap, data augmentation is essential.
We propose RepAugment, an input-agnostic representation-level augmentation technique that outperforms SpecAugment.
arXiv Detail & Related papers (2024-05-05T16:45:46Z)
- Stethoscope-guided Supervised Contrastive Learning for Cross-domain
Adaptation on Respiratory Sound Classification [1.690115983364313]
We introduce cross-domain adaptation techniques, which transfer the knowledge from a source domain to a distinct target domain.
In particular, by considering different stethoscope types as individual domains, we propose a novel stethoscope-guided supervised contrastive learning approach.
The experimental results on the ICBHI dataset demonstrate that the proposed methods are effective in reducing the domain dependency and achieving the ICBHI Score of 61.71%, which is a significant improvement of 2.16% over the baseline.
arXiv Detail & Related papers (2023-12-15T08:34:31Z)
- Boosting Fast and High-Quality Speech Synthesis with Linear Diffusion [85.54515118077825]
This paper proposes a linear diffusion model (LinDiff) based on an ordinary differential equation to simultaneously reach fast inference and high sample quality.
To reduce computational complexity, LinDiff employs a patch-based processing approach that partitions the input signal into small patches.
Our model can synthesize speech of a quality comparable to that of autoregressive models with faster synthesis speed.
arXiv Detail & Related papers (2023-06-09T07:02:43Z)
- Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on
Respiratory Sound Classification [19.180927437627282]
We introduce a novel and effective Patch-Mix Contrastive Learning method to distinguish the mixed representations in the latent space.
Our method achieves state-of-the-art performance on the ICBHI dataset, outperforming the prior leading score by an improvement of 4.08%.
arXiv Detail & Related papers (2023-05-23T13:04:07Z)
- The role of noise in denoising models for anomaly detection in medical
images [62.0532151156057]
Pathological brain lesions exhibit diverse appearance in brain images.
Unsupervised anomaly detection approaches have been proposed using only normal data for training.
We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes.
arXiv Detail & Related papers (2023-01-19T21:39:38Z)
- Deep Feature Learning for Medical Acoustics [78.56998585396421]
The purpose of this paper is to compare different learnables in medical acoustics tasks.
A framework has been implemented to classify human respiratory sounds and heartbeats into two categories, i.e., healthy or affected by pathologies.
arXiv Detail & Related papers (2022-08-05T10:39:37Z)
- Detecting COVID-19 from Breathing and Coughing Sounds using Deep Neural
Networks [68.8204255655161]
We adapt an ensemble of Convolutional Neural Networks to classify whether a speaker is infected with COVID-19.
Ultimately, it achieves an Unweighted Average Recall (UAR) of 74.9%, or an Area Under the ROC Curve (AUC) of 80.7%, by ensembling neural networks.
arXiv Detail & Related papers (2020-12-29T01:14:17Z)
- Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant
Disease Diagnosis [64.82680813427054]
Plant diseases are one of the main threats to food security and crop production.
One popular approach is to cast this problem as a leaf image classification task, which can be addressed by powerful convolutional neural networks (CNNs).
We propose a novel framework that incorporates a rectified meta-learning module into the common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)