Interpreting glottal flow dynamics for detecting COVID-19 from voice
- URL: http://arxiv.org/abs/2010.16318v1
- Date: Thu, 29 Oct 2020 13:16:57 GMT
- Title: Interpreting glottal flow dynamics for detecting COVID-19 from voice
- Authors: Soham Deshmukh, Mahmoud Al Ismail, Rita Singh
- Abstract summary: This paper proposes a method that analyzes the differential dynamics of the glottal flow waveform (GFW) during voice production.
We infer it from recorded speech signals and compare it to the GFW computed from a physical model of phonation.
Our proposed method uses a CNN-based 2-step attention model that locates anomalies in time-feature space in the difference of the two GFWs.
- Score: 18.387162887917164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the pathogenesis of COVID-19, impairment of respiratory functions is often
one of the key symptoms. Studies show that in these cases, voice production is
also adversely affected -- vocal fold oscillations are asynchronous,
asymmetrical and more restricted during phonation. This paper proposes a method
that analyzes the differential dynamics of the glottal flow waveform (GFW)
during voice production to identify features in them that are most significant
for the detection of COVID-19 from voice. Since it is hard to measure this
directly in COVID-19 patients, we infer it from recorded speech signals and
compare it to the GFW computed from a physical model of phonation. For normal
voices, the difference between the two should be minimal, since physical models
are constructed to explain phonation under assumptions of normalcy. Greater
differences implicate anomalies in the bio-physical factors that contribute to
the correctness of the physical model, revealing their significance indirectly.
Our proposed method uses a CNN-based 2-step attention model that locates
anomalies in time-feature space in the difference of the two GFWs, allowing us
to infer their potential as discriminative features for classification. The
viability of this method is demonstrated using a clinically curated dataset of
COVID-19 positive and negative subjects.
Related papers
- Show from Tell: Audio-Visual Modelling in Clinical Settings [58.88175583465277]
We consider audio-visual modelling in a clinical setting, providing a solution to learn medical representations without human expert annotation.
A simple yet effective multi-modal self-supervised learning framework is proposed for this purpose.
The proposed approach is able to localise anatomical regions of interest during ultrasound imaging, with only speech audio as a reference.
arXiv Detail & Related papers (2023-10-25T08:55:48Z)
- Instrumental Variable Learning for Chest X-ray Classification [52.68170685918908]
We propose an interpretable instrumental variable (IV) learning framework to eliminate the spurious association and obtain accurate causal representation.
Our approach's performance is demonstrated using the MIMIC-CXR, NIH ChestX-ray 14, and CheXpert datasets.
arXiv Detail & Related papers (2023-05-20T03:12:23Z)
- The role of noise in denoising models for anomaly detection in medical images [62.0532151156057]
Pathological brain lesions exhibit diverse appearance in brain images.
Unsupervised anomaly detection approaches have been proposed using only normal data for training.
We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes.
arXiv Detail & Related papers (2023-01-19T21:39:38Z)
- Analyzing the impact of SARS-CoV-2 variants on respiratory sound signals [23.789227109218118]
We explore whether acoustic signals, collected from COVID-19 subjects, show computationally distinguishable acoustic patterns.
Our findings suggest that multiple sound categories, such as cough, breathing, and speech, indicate significant acoustic feature differences when comparing COVID-19 subjects with omicron and delta variants.
arXiv Detail & Related papers (2022-06-24T14:10:31Z)
- COVYT: Introducing the Coronavirus YouTube and TikTok speech dataset featuring the same speakers with and without infection [4.894353840908006]
We introduce the COVYT dataset -- a novel COVID-19 dataset collected from public sources containing more than 8 hours of speech from 65 speakers.
As compared to other existing COVID-19 sound datasets, the unique feature of the COVYT dataset is that it comprises both COVID-19 positive and negative samples from all 65 speakers.
arXiv Detail & Related papers (2022-06-20T16:26:51Z)
- Exploiting Cross Domain Acoustic-to-articulatory Inverted Features For Disordered Speech Recognition [57.15942628305797]
Articulatory features are invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition systems for normal speech.
This paper presents a cross-domain acoustic-to-articulatory (A2A) inversion approach that utilizes the parallel acoustic-articulatory data of the 15-hour TORGO corpus in model training.
The model is then cross-domain adapted to the 102.7-hour UASpeech corpus to produce articulatory features.
arXiv Detail & Related papers (2022-03-19T08:47:18Z)
- Continuous Speech for Improved Learning Pathological Voice Disorders [12.867900671251395]
This study proposes a novel approach, using continuous Mandarin speech instead of a single vowel, to classify four common voice disorders.
In the proposed framework, acoustic signals are transformed into mel-frequency cepstral coefficients, and a bi-directional long-short term memory network (BiLSTM) is adopted to model the sequential features.
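The MFCC front end of the framework just described can be sketched with plain numpy: frame the signal, take the power spectrum, apply a triangular mel filterbank, and decorrelate the log energies with a DCT. This is a minimal illustrative sketch, not that paper's implementation; the sample rate, FFT size, and coefficient counts are assumed values, and the downstream BiLSTM is omitted (the returned frame sequence is what such a network would consume).

```python
import numpy as np

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_ceps=13):
    """Minimal MFCC: frames -> power spectrum -> mel filterbank -> log -> DCT."""
    # Frame the signal with a Hann window
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft

    # Triangular mel filterbank spanning 0 .. sr/2
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = mel_to_hz(np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_mel = np.log(power @ fbank.T + 1e-10)

    # DCT-II to decorrelate; keep the first n_ceps coefficients
    k = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * k + 1) / (2 * n_mels))
    return log_mel @ dct.T   # (n_frames, n_ceps): the sequence a BiLSTM would model

sig = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)  # 1 s synthetic tone
feats = mfcc(sig)
print(feats.shape)
```

In practice one would use a library extractor (e.g. librosa or torchaudio) rather than hand-rolling the filterbank; the sketch only makes explicit the feature sequence that sequential models like the BiLSTM operate on.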
arXiv Detail & Related papers (2022-02-22T09:58:31Z)
- A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion [50.040466658605524]
We propose a new paradigm for maintaining speaker identity in dysarthric voice conversion (DVC).
The poor quality of dysarthric speech can be greatly improved by statistical VC.
But as the normal speech utterances of a dysarthria patient are nearly impossible to collect, previous work failed to recover the individuality of the patient.
arXiv Detail & Related papers (2021-06-02T18:41:03Z)
- The voice of COVID-19: Acoustic correlates of infection [9.7390888107204]
COVID-19 is a global health crisis that has been affecting many aspects of our daily lives throughout the past year.
We compare acoustic features extracted from recordings of the vowels /i:/, /e:/, /o:/, /u:/, and /a:/ produced by 11 symptomatic COVID-19 positive and 11 COVID-19 negative German-speaking participants.
arXiv Detail & Related papers (2020-12-17T10:12:41Z)
- Detection of COVID-19 through the analysis of vocal fold oscillations [18.387162887917164]
Phonation, or the vibration of the vocal folds, is the primary source of vocalization in the production of voiced sounds by humans.
Since most symptomatic cases of COVID-19 present with moderate to severe impairment of respiratory functions, we hypothesize that signatures of COVID-19 may be observable by examining the vibrations of the vocal folds.
Our goal is to validate this hypothesis, and to quantitatively characterize the changes observed to enable the detection of COVID-19 from voice.
arXiv Detail & Related papers (2020-10-21T01:44:42Z)
- COVID-DA: Deep Domain Adaptation from Typical Pneumonia to COVID-19 [92.4955073477381]
The outbreak of novel coronavirus disease 2019 (COVID-19) has already infected millions of people and is still rapidly spreading all over the globe.
Deep learning has been used recently as effective computer-aided means to improve diagnostic efficiency.
We propose a new deep domain adaptation method for COVID-19 diagnosis, namely COVID-DA.
arXiv Detail & Related papers (2020-04-30T03:13:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.