Fused Audio Instance and Representation for Respiratory Disease
Detection
- URL: http://arxiv.org/abs/2204.10581v4
- Date: Thu, 23 Nov 2023 09:15:48 GMT
- Title: Fused Audio Instance and Representation for Respiratory Disease
Detection
- Authors: Tuan Truong, Matthias Lenga, Antoine Serrurier, Sadegh Mohammadi
- Abstract summary: We propose Fused Audio Instance and Representation (FAIR) as a method for respiratory disease detection.
We conducted experiments on the use case of COVID-19 detection by combining waveform and spectrogram representation of body sounds.
- Score: 0.6827423171182154
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Audio-based classification techniques on body sounds have long been studied
to aid in the diagnosis of respiratory diseases. While most research is
centered on the use of cough as the main biomarker, other body sounds also have
the potential to detect respiratory diseases. Recent studies on COVID-19 have
shown that breath and speech sounds, in addition to cough, correlate with the
disease. Our study proposes Fused Audio Instance and Representation (FAIR) as a
method for respiratory disease detection. FAIR relies on constructing a joint
feature vector from various body sounds represented in waveform and spectrogram
form. We conducted experiments on the use case of COVID-19 detection by
combining waveform and spectrogram representation of body sounds. Our findings
show that the use of self-attention to combine extracted features from cough,
breath, and speech sounds leads to the best performance with an Area Under the
Receiver Operating Characteristic Curve (AUC) score of 0.8658, a sensitivity of
0.8057, and a specificity of 0.7958. Compared to models trained solely on
spectrograms or waveforms, the use of both representations results in an
improved AUC score, demonstrating that combining spectrogram and waveform
representation helps to enrich the extracted features and outperforms the
models that use only one representation.
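As a rough illustration of the fusion idea described in the abstract, the PyTorch sketch below combines waveform and spectrogram features from three body sounds with a self-attention layer. The encoders, dimensions, and pooling choice are placeholder assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class FAIRSketch(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        # Placeholder per-representation encoders; the paper's backbones differ.
        self.wave_enc = nn.Sequential(nn.Linear(16000, feat_dim), nn.ReLU())
        self.spec_enc = nn.Sequential(nn.Linear(128 * 64, feat_dim), nn.ReLU())
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(feat_dim, 1)  # binary COVID-19 logit

    def forward(self, waves, specs):
        # waves: (B, 3, 16000) and specs: (B, 3, 128, 64),
        # one instance each for cough, breath, and speech.
        w = self.wave_enc(waves)               # (B, 3, D) waveform features
        s = self.spec_enc(specs.flatten(2))    # (B, 3, D) spectrogram features
        tokens = torch.cat([w, s], dim=1)      # (B, 6, D) joint feature set
        fused, _ = self.attn(tokens, tokens, tokens)  # self-attention fusion
        return self.head(fused.mean(dim=1))    # pooled classification logit

model = FAIRSketch()
logit = model(torch.randn(2, 3, 16000), torch.randn(2, 3, 128, 64))
```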
Related papers
- Abnormal Respiratory Sound Identification Using Audio-Spectrogram Vision Transformer [19.993594487490682]
The proposed AS-ViT method was evaluated using three metrics, achieving 79.1% and 59.8% for a 60:40 split ratio and 86.4% and 69.3% for an 80:20 split ratio.
arXiv Detail & Related papers (2024-05-14T06:31:38Z)
- A Federated Learning Framework for Stenosis Detection [70.27581181445329]
This study explores the use of Federated Learning (FL) for stenosis detection in coronary angiography (CA) images.
Two heterogeneous datasets from two institutions were considered: dataset 1 includes 1219 images from 200 patients, acquired at the Ospedale Riuniti of Ancona (Italy);
dataset 2 includes 7492 sequential images from 90 patients from a previous study available in the literature.
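For readers unfamiliar with FL, a minimal FedAvg-style aggregation sketch follows. It is illustrative only (this summary does not specify the paper's aggregation scheme); only the size weighting is taken from the two dataset sizes above.

```python
import torch
import torch.nn as nn

def fed_avg(state_dicts, sizes):
    """Average client weights, weighted by local dataset size (FedAvg)."""
    total = float(sum(sizes))
    return {key: sum(sd[key].float() * (n / total)
                     for sd, n in zip(state_dicts, sizes))
            for key in state_dicts[0]}

# Two hypothetical institution models sharing one architecture.
client_a, client_b = nn.Linear(4, 2), nn.Linear(4, 2)
global_weights = fed_avg([client_a.state_dict(), client_b.state_dict()],
                         sizes=[1219, 7492])  # dataset sizes from the study
client_a.load_state_dict(global_weights)      # broadcast aggregated weights
```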
arXiv Detail & Related papers (2023-10-30T11:13:40Z)
- Show from Tell: Audio-Visual Modelling in Clinical Settings [58.88175583465277]
We consider audio-visual modelling in a clinical setting, providing a solution to learn medical representations without human expert annotation.
A simple yet effective multi-modal self-supervised learning framework is proposed for this purpose.
The proposed approach is able to localise anatomical regions of interest during ultrasound imaging, with only speech audio as a reference.
arXiv Detail & Related papers (2023-10-25T08:55:48Z)
- COVID-19 Detection System: A Comparative Analysis of System Performance Based on Acoustic Features of Cough Audio Signals [0.6963971634605796]
This research aims to explore various acoustic features that enhance the performance of machine learning (ML) models in detecting COVID-19 from cough signals.
It investigates the efficacy of three feature extraction techniques, including Mel Frequency Cepstral Coefficients (MFCC), Chroma, and Spectral Contrast features, when applied to two machine learning algorithms, Support Vector Machine (SVM) and Multilayer Perceptron (MLP).
The proposed system provides a practical solution and demonstrates state-of-the-art classification performance, with an AUC of 0.843 on the COUGHVID dataset and 0.953 on the Virufy dataset.
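A hedged sketch of such a feature pipeline, assuming librosa and scikit-learn with illustrative hyperparameters (the file path, sample rate, and SVM settings are not taken from the paper):

```python
import librosa
import numpy as np
from sklearn.svm import SVC

def extract_features(path):
    """Concatenate time-averaged MFCC, Chroma, and Spectral Contrast features."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr).mean(axis=1)
    contrast = librosa.feature.spectral_contrast(y=y, sr=sr).mean(axis=1)
    return np.concatenate([mfcc, chroma, contrast])  # fixed-length vector

# X: one feature row per cough recording; y: 1 = COVID-19 positive, 0 = negative
# clf = SVC(kernel="rbf", probability=True).fit(X, y)
```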
arXiv Detail & Related papers (2023-09-08T08:33:24Z)
- Attention-based Saliency Maps Improve Interpretability of Pneumothorax Classification [52.77024349608834]
The study investigates chest radiograph (CXR) classification performance of vision transformers (ViTs) and the interpretability of attention-based saliency maps.
ViTs were fine-tuned for lung disease classification using four public data sets: CheXpert, Chest X-Ray 14, MIMIC CXR, and VinBigData.
ViTs achieved CXR classification AUCs comparable to state-of-the-art CNNs.
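One common way to turn ViT attention into saliency maps is attention rollout; the sketch below is a generic illustration and not necessarily the paper's exact formulation. The attention maps here are random stand-ins for a trained ViT's per-layer, head-averaged maps.

```python
import torch

def attention_rollout(attn_per_layer):
    """Propagate head-averaged attention through layers to get patch saliency."""
    n = attn_per_layer[0].shape[0]
    rollout = torch.eye(n)
    for attn in attn_per_layer:
        attn = 0.5 * attn + 0.5 * torch.eye(n)   # account for residual paths
        attn = attn / attn.sum(dim=-1, keepdim=True)
        rollout = attn @ rollout                 # accumulate across layers
    return rollout[0, 1:]  # attention of the CLS token onto image patches

# 12 random stand-ins for a ViT's per-layer attention maps (197 tokens).
layers = [torch.rand(197, 197).softmax(dim=-1) for _ in range(12)]
saliency = attention_rollout(layers)  # reshape to 14x14 to overlay on the CXR
```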
arXiv Detail & Related papers (2023-03-03T12:05:41Z)
- Audio Deepfake Detection Based on a Combination of F0 Information and Real Plus Imaginary Spectrogram Features [51.924340387119415]
Experimental results on the ASVspoof 2019 LA dataset show that our proposed system is very effective for the audio deepfake detection task, achieving an equal error rate (EER) of 0.43%, which surpasses almost all systems.
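A minimal sketch of a real-plus-imaginary spectrogram feature, assuming PyTorch's STFT with illustrative parameters:

```python
import torch

wave = torch.randn(1, 16000)  # one second of illustrative audio at 16 kHz
stft = torch.stft(wave, n_fft=512, hop_length=160,
                  window=torch.hann_window(512), return_complex=True)
# Stack real and imaginary parts as two channels instead of taking the
# magnitude, so phase information is preserved for the classifier.
real_imag = torch.stack([stft.real, stft.imag], dim=1)  # (B, 2, F, T)
```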
arXiv Detail & Related papers (2022-08-02T02:46:16Z)
- COVID-19 Detection from Respiratory Sounds with Hierarchical Spectrogram Transformers [1.4091863292043447]
We introduce a novel deep learning approach to distinguish patients with COVID-19 from healthy controls given audio recordings of cough or breathing sounds.
The proposed approach leverages a novel hierarchical spectrogram transformer (HST) on spectrogram representations of respiratory sounds.
HST embodies self-attention mechanisms over local windows in spectrograms, and window size is progressively grown over model stages to capture local to global context.
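A toy sketch of windowed self-attention with a window size grown across stages, in the spirit of (but not identical to) the described HST; dimensions and head counts are illustrative.

```python
import torch
import torch.nn as nn

class WindowedAttentionStage(nn.Module):
    """Self-attention restricted to fixed-size local windows of tokens."""
    def __init__(self, dim, window):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, x):                        # x: (B, T, D) patch tokens
        B, T, D = x.shape
        w = self.window
        x = x.reshape(B * (T // w), w, D)        # split sequence into windows
        x, _ = self.attn(x, x, x)                # attend within each window
        return x.reshape(B, T, D)

# Window size grows across stages (8 -> 16 -> 32 tokens), so later stages
# capture progressively more global spectrogram context.
stages = nn.Sequential(*[WindowedAttentionStage(128, w) for w in (8, 16, 32)])
out = stages(torch.randn(2, 64, 128))            # 64 tokens per recording
```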
arXiv Detail & Related papers (2022-07-19T19:55:16Z)
- Cough Detection Using Selected Informative Features from Audio Signals [24.829135966052142]
The models are trained on a dataset that combines the ESC-50 dataset with self-recorded cough recordings.
The best cough detection model achieves an accuracy of 94.9%, recall of 97.1%, precision of 93.1%, and F1-score of 0.95.
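This summary does not state how the informative features are selected; a generic scikit-learn sketch of such a selection step might look like the following, with synthetic data standing in for the audio feature matrix.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 40))      # 40 candidate audio features per clip
y = rng.integers(0, 2, size=100)    # 1 = cough, 0 = other ESC-50 sound
selector = SelectKBest(mutual_info_classif, k=10).fit(X, y)
X_selected = selector.transform(X)  # keep the 10 most informative features
```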
arXiv Detail & Related papers (2021-08-07T23:05:18Z)
- Quantification of pulmonary involvement in COVID-19 pneumonia by means of a cascade of two U-nets: training and assessment on multiple datasets using different annotation criteria [83.83783947027392]
This study aims at exploiting Artificial Intelligence (AI) for the identification, segmentation, and quantification of COVID-19 pulmonary lesions.
We developed an automated analysis pipeline, the LungQuant system, based on a cascade of two U-nets.
The accuracy of the LungQuant system in predicting the CT Severity Score (CT-SS) was also evaluated.
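A schematic of the two-U-net cascade idea, with random stand-ins for the trained networks; this is an illustration of the described pipeline, not the LungQuant code.

```python
import torch

def lungquant_style(ct, unet_lung, unet_lesion, thr=0.5):
    """Cascade: lung mask first, then lesions inside the lung, then severity."""
    lung = unet_lung(ct) > thr                 # stage 1: lung segmentation
    lesion = (unet_lesion(ct) > thr) & lung    # stage 2: lesions within lung
    # Fraction of lung volume affected, the basis of CT severity scoring
    return lesion.sum().item() / max(lung.sum().item(), 1)

# Random stand-ins for the two trained U-nets, for illustration only.
fake_unet = lambda ct: torch.rand_like(ct)
severity = lungquant_style(torch.rand(1, 1, 64, 128, 128), fake_unet, fake_unet)
```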
arXiv Detail & Related papers (2021-05-06T10:21:28Z)
- Detecting COVID-19 from Breathing and Coughing Sounds using Deep Neural Networks [68.8204255655161]
We adapt an ensemble of Convolutional Neural Networks to classify whether a speaker is infected with COVID-19.
Ultimately, it achieves an Unweighted Average Recall (UAR) of 74.9% and an Area Under the ROC Curve (AUC) of 80.7% by ensembling neural networks.
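A minimal sketch of probability-averaging ensembling and the UAR metric, which scikit-learn exposes as balanced_accuracy_score; the probabilities and labels below are illustrative numbers, not the paper's data.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score

# Illustrative per-model class probabilities for four recordings, two classes.
model_probs = [np.array([[0.7, 0.3], [0.4, 0.6], [0.2, 0.8], [0.9, 0.1]]),
               np.array([[0.6, 0.4], [0.3, 0.7], [0.4, 0.6], [0.8, 0.2]])]
probs = np.mean(model_probs, axis=0)   # ensemble by averaging probabilities
preds = probs.argmax(axis=1)
y_true = [0, 1, 1, 0]
print(balanced_accuracy_score(y_true, preds))  # UAR = mean per-class recall
```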
arXiv Detail & Related papers (2020-12-29T01:14:17Z)
- CNN-MoE based framework for classification of respiratory anomalies and lung disease detection [33.45087488971683]
This paper presents and explores a robust deep learning framework for auscultation analysis.
It aims to classify anomalies in respiratory cycles and detect diseases from respiratory sound recordings.
arXiv Detail & Related papers (2020-04-04T21:45:06Z)