Abnormal Respiratory Sound Identification Using Audio-Spectrogram Vision Transformer
- URL: http://arxiv.org/abs/2405.08342v1
- Date: Tue, 14 May 2024 06:31:38 GMT
- Title: Abnormal Respiratory Sound Identification Using Audio-Spectrogram Vision Transformer
- Authors: Whenty Ariyanti, Kai-Chun Liu, Kuan-Yu Chen, Yu Tsao
- Abstract summary: The AS-ViT method was evaluated using three metrics. It achieved an unweighted average recall of 79.1% and an overall score of 59.8% for a 60:40 split ratio, and 86.4% and 69.3% for an 80:20 split ratio.
- Score: 19.993594487490682
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Respiratory disease, the third leading cause of death globally, is considered a high-priority ailment requiring significant research on identification and treatment. Stethoscope-recorded lung sounds and artificial intelligence-powered devices have been used to identify lung disorders and aid specialists in making accurate diagnoses. In this study, audio-spectrogram vision transformer (AS-ViT), a new approach for identifying abnormal respiration sounds, was developed. Lung sounds are converted into visual representations called spectrograms using the short-time Fourier transform (STFT). These images are then analyzed using a vision transformer model to identify different types of respiratory sounds. The classification was carried out using the ICBHI 2017 database, which includes various types of lung sounds with different frequencies, noise levels, and backgrounds. The proposed AS-ViT method was evaluated using three metrics and achieved 79.1% and 59.8% for a 60:40 split ratio and 86.4% and 69.3% for an 80:20 split ratio in terms of unweighted average recall and overall score, respectively, for respiratory sound detection, surpassing previous state-of-the-art results.
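The abstract describes two steps that are easy to illustrate: turning a lung-sound recording into an STFT spectrogram that a vision transformer can consume as image patches, and scoring predictions with unweighted average recall (UAR). The following is a minimal NumPy sketch of both, not the authors' implementation. The function names, the FFT size (256), hop length (128), patch size (16), the 4 kHz sampling rate, and the synthetic test tone are all assumptions chosen for illustration; the abstract does not state the paper's actual settings, and the transformer itself is not shown.

```python
import numpy as np


def stft_spectrogram(signal, n_fft=256, hop=128):
    """Log-magnitude STFT spectrogram with shape (n_fft // 2 + 1, n_frames).

    n_fft and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    spectrum = np.fft.rfft(frames, axis=1)  # (n_frames, n_fft // 2 + 1)
    return np.log1p(np.abs(spectrum)).T     # frequency on axis 0, time on axis 1


def to_patches(spec, patch=16):
    """Split a spectrogram into flattened, non-overlapping patch x patch tiles.

    Rows and columns that do not fill a whole patch are cropped away.
    """
    h = (spec.shape[0] // patch) * patch
    w = (spec.shape[1] // patch) * patch
    tiles = spec[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return tiles.transpose(0, 2, 1, 3).reshape(-1, patch * patch)


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))


# Hypothetical input: one second of a 500 Hz tone sampled at 4 kHz.
sample_rate = 4000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 500 * t)

spec = stft_spectrogram(tone)   # shape (129, 30)
patches = to_patches(spec)      # shape (8, 256)

# Toy labels: class 0 recall is 3/4, class 1 recall is 1/2, so UAR is 0.625.
uar = unweighted_average_recall([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 0])
```

In the toy labels plain accuracy is 4/6, while UAR is 0.625 because the smaller class is weighted equally. That property is the usual reason UAR is preferred when classes are imbalanced.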
Related papers
- Machine Learning-based Estimation of Respiratory Fluctuations in a Healthy Adult Population using BOLD fMRI and Head Motion Parameters [39.96015789655091]
In many fMRI studies, respiratory signals are often missing or of poor quality.
It could be highly beneficial to have a tool to extract respiratory variation (RV) waveforms directly from fMRI data without the need for peripheral recording devices.
This study proposes a CNN model for reconstruction of RV waveforms using head motion parameters and BOLD signals.
arXiv Detail & Related papers (2024-04-30T21:53:11Z)
- Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification [19.180927437627282]
We introduce a novel and effective Patch-Mix Contrastive Learning to distinguish the mixed representations in the latent space.
Our method achieves state-of-the-art performance on the ICBHI dataset, outperforming the prior leading score by an improvement of 4.08%.
arXiv Detail & Related papers (2023-05-23T13:04:07Z)
- Transfer Learning Based Diagnosis and Analysis of Lung Sound Aberrations [0.35232085374661276]
This work attempts to develop a non-invasive technique for identifying respiratory sounds acquired by a stethoscope and voice recording software.
A visual representation of each audio sample is constructed, allowing resource identification for classification using methods like those used to effectively describe visuals.
On the Respiratory Sound Database, the approach obtained state-of-the-art results, including accuracy of 95%, precision of 88%, recall of 86%, and F1 score of 81%.
arXiv Detail & Related papers (2023-03-15T04:46:57Z)
- Validated respiratory drug deposition predictions from 2D and 3D medical images with statistical shape models and convolutional neural networks [47.187609203210705]
We aim to develop and validate an automated computational framework for patient-specific deposition modelling.
An image processing approach is proposed that could produce 3D patient respiratory geometries from 2D chest X-rays and 3D CT images.
arXiv Detail & Related papers (2023-03-02T07:47:07Z)
- The role of noise in denoising models for anomaly detection in medical images [62.0532151156057]
Pathological brain lesions exhibit diverse appearance in brain images.
Unsupervised anomaly detection approaches have been proposed using only normal data for training.
We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes.
arXiv Detail & Related papers (2023-01-19T21:39:38Z)
- Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
- Fused Audio Instance and Representation for Respiratory Disease Detection [0.6827423171182154]
We propose Fused Audio Instance and Representation (FAIR) as a method for respiratory disease detection.
We conducted experiments on the use case of COVID-19 detection by combining waveform and spectrogram representation of body sounds.
arXiv Detail & Related papers (2022-04-22T09:01:29Z)
- The Diagnosis of Asthma using Hilbert-Huang Transform and Deep Learning on Lung Sounds [2.294014185517203]
The statistical features are calculated from intrinsic mode functions that are extracted by applying the Hilbert-Huang Transform to the lung sounds.
The classification of the lung sounds from asthma and healthy subjects is performed using Deep Belief Networks (DBN).
arXiv Detail & Related papers (2021-01-20T19:04:33Z)
- Inception-Based Network and Multi-Spectrogram Ensemble Applied For Predicting Respiratory Anomalies and Lung Diseases [16.318395700171624]
This paper presents an inception-based deep neural network for detecting lung diseases using respiratory sound input.
Recordings of respiratory sound collected from patients are transformed into spectrograms where both spectral and temporal information are well presented.
These spectrograms are fed into the proposed network, referred to as back-end classification, for detecting whether patients suffer from lung-relevant diseases.
arXiv Detail & Related papers (2020-12-26T08:25:02Z)
- Identification of deep breath while moving forward based on multiple body regions and graph signal analysis [45.62293065676075]
This paper presents an unobtrusive solution that can automatically identify deep breath when a person is walking past the global depth camera.
In validation experiments, the proposed approach outperforms the comparative methods with the accuracy, precision, recall and F1 of 75.5%, 76.2%, 75.0% and 75.2%, respectively.
arXiv Detail & Related papers (2020-10-20T08:26:50Z)
- Machine-Learning-Based Multiple Abnormality Prediction with Large-Scale Chest Computed Tomography Volumes [64.21642241351857]
We curated and analyzed a chest computed tomography (CT) data set of 36,316 volumes from 19,993 unique patients.
We developed a rule-based method for automatically extracting abnormality labels from free-text radiology reports.
We also developed a model for multi-organ, multi-disease classification of chest CT volumes.
arXiv Detail & Related papers (2020-02-12T00:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.