Machine Learning based COVID-19 Detection from Smartphone Recordings: Cough, Breath and Speech
- URL: http://arxiv.org/abs/2104.02477v1
- Date: Fri, 2 Apr 2021 23:21:24 GMT
- Title: Machine Learning based COVID-19 Detection from Smartphone Recordings: Cough, Breath and Speech
- Authors: Madhurananda Pahar, Thomas Niesler
- Abstract summary: We present an experimental investigation into the automatic detection of COVID-19 from smartphone recordings of coughs, breaths and speech.
We base our experiments on two datasets, Coswara and ComParE, containing recordings of coughing, breathing and speech.
We conclude that among all vocal audio, coughs carry the strongest COVID-19 signature followed by breath and speech.
- Score: 7.908757488948712
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present an experimental investigation into the automatic detection of COVID-19 from smartphone recordings of coughs, breaths and speech. This type of screening is attractive because it is non-contact, requires no specialist medical expertise or laboratory facilities, and can easily be deployed on inexpensive consumer hardware. We base our experiments on two datasets, Coswara and ComParE, which contain recordings of coughing, breathing and speech from subjects around the globe. We considered seven machine learning classifiers, all of which were trained and evaluated using leave-p-out cross-validation.
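The training code itself is not part of this listing, but the leave-p-out protocol maps directly onto scikit-learn's LeavePOut splitter. The sketch below is illustrative only: the synthetic feature matrix, the balanced labels and the logistic-regression stand-in classifier are assumptions, not the paper's actual setup.

```python
# Minimal sketch of leave-p-out evaluation with a pooled AUC.
# All data and the classifier choice are placeholders, not the paper's.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeavePOut

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))   # 20 recordings x 8 acoustic features (synthetic)
y = np.array([0, 1] * 10)      # hypothetical COVID-19 labels, balanced

lpo = LeavePOut(p=2)           # every size-2 subset is held out exactly once
scores, labels = [], []
for train_idx, test_idx in lpo.split(X):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores.extend(clf.predict_proba(X[test_idx])[:, 1])
    labels.extend(y[test_idx])

# Pool the held-out scores from all folds and compute a single AUC.
print("pooled AUC:", roc_auc_score(labels, scores))
```

Note that the number of leave-p-out folds grows combinatorially with the dataset size, which is why this toy example uses only 20 samples.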
For the Coswara data, the highest AUC of 0.92 was achieved using a ResNet50 architecture on breaths.
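The abstract does not spell out the ResNet50 input pipeline, but a common pattern is to feed log-mel spectrograms of the audio into an ImageNet-style backbone. The sketch below adapts torchvision's ResNet50 to single-channel spectrogram input with a two-class head; the spectrogram dimensions and the absence of pretraining are assumptions for illustration.

```python
# Hedged sketch: ResNet50 adapted for single-channel spectrogram input.
# Input shape and weight initialisation are assumptions, not the paper's setup.
import torch
import torch.nn as nn
from torchvision.models import resnet50

model = resnet50(weights=None)
# Replace the 3-channel RGB stem with a 1-channel one for spectrograms.
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
# Two output logits: COVID-19 positive vs. negative.
model.fc = nn.Linear(model.fc.in_features, 2)

# A batch of 4 hypothetical log-mel spectrograms (1 channel, 128 mel bins, 256 frames).
spectrograms = torch.randn(4, 1, 128, 256)
logits = model(spectrograms)
print(logits.shape)  # torch.Size([4, 2])
```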
For the ComParE data, the highest AUC of 0.93 was achieved by a k-nearest neighbours (KNN) classifier on cough recordings after selecting the best 12 features using sequential forward selection (SFS), and the highest AUC on speech, 0.91, was achieved by a multilayer perceptron (MLP) when SFS was used to select the best 23 features.
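SFS wrapped around a KNN classifier is available off the shelf in scikit-learn, so the feature-selection step can be sketched as follows. The 40-dimensional synthetic feature matrix, the labels and the choice of k are placeholders; only the 12-feature target and the AUC scoring come from the abstract.

```python
# Illustrative sketch: sequential forward selection (SFS) of 12 features
# for a KNN classifier, scored by cross-validated AUC. Data is synthetic.
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 40))  # 100 cough recordings x 40 candidate features (synthetic)
y = np.array([0, 1] * 50)       # hypothetical COVID-19 labels

knn = KNeighborsClassifier(n_neighbors=5)  # k is an assumption, not the paper's value
sfs = SequentialFeatureSelector(
    knn,
    n_features_to_select=12,    # the abstract reports the best 12 features
    direction="forward",
    scoring="roc_auc",
    cv=5,
)
X_selected = sfs.fit_transform(X, y)

auc = cross_val_score(knn, X_selected, y, scoring="roc_auc", cv=5).mean()
print("selected feature indices:", np.flatnonzero(sfs.get_support()))
print("cross-validated AUC:", auc)
```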
We conclude that among all vocal audio, coughs carry the strongest COVID-19 signature, followed by breath and speech. Although these signatures are not perceptible to the human ear, machine-learning-based COVID-19 detection is possible from vocal audio recorded via smartphone.
Related papers
- EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation [83.29199726650899]
The EARS dataset comprises 107 speakers from diverse backgrounds, totaling 100 hours of clean, anechoic speech data.
The dataset covers a large range of different speaking styles, including emotional speech, different reading styles, non-verbal sounds, and conversational freeform speech.
We benchmark various methods for speech enhancement and dereverberation on the dataset and evaluate their performance through a set of instrumental metrics.
arXiv Detail & Related papers (2024-06-10T11:28:29Z)
- Fully Automated End-to-End Fake Audio Detection [57.78459588263812]
This paper proposes a fully automated end-to-end fake audio detection method.
We first use a pre-trained wav2vec model to obtain a high-level representation of the speech.
For the network structure, we use a modified version of the differentiable architecture search (DARTS) named light-DARTS.
arXiv Detail & Related papers (2022-08-20T06:46:55Z)
- Deep Feature Learning for Medical Acoustics [78.56998585396421]
The purpose of this paper is to compare different learnables in medical acoustics tasks.
A framework has been implemented to classify human respiratory sounds and heartbeats into two categories: healthy or affected by pathologies.
arXiv Detail & Related papers (2022-08-05T10:39:37Z)
- On the pragmatism of using binary classifiers over data intensive neural network classifiers for detection of COVID-19 from voice [34.553128768223615]
We show that detecting COVID-19 from voice does not require custom-made non-standard features or complicated neural network classifiers.
We demonstrate this on a human-curated dataset collected and calibrated in clinical settings.
arXiv Detail & Related papers (2022-04-11T00:19:14Z)
- CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition [91.33781557979819]
We introduce a new dataset, Cantonese In-car Audio-Visual Speech Recognition (CI-AVSR).
It consists of 4,984 samples (8.3 hours) of 200 in-car commands recorded by 30 native Cantonese speakers.
We provide detailed statistics of both the clean and the augmented versions of our dataset.
arXiv Detail & Related papers (2022-01-11T06:32:12Z)
- Project Achoo: A Practical Model and Application for COVID-19 Detection from Recordings of Breath, Voice, and Cough [55.45063681652457]
We propose a machine learning method to quickly triage COVID-19 using recordings made on consumer devices.
The approach combines signal processing methods with fine-tuned deep learning networks and provides methods for signal denoising, cough detection and classification.
We have also developed and deployed a mobile application that uses a symptom checker together with voice, breath and cough signals to detect COVID-19 infection.
arXiv Detail & Related papers (2021-07-12T08:07:56Z)
- Detecting COVID-19 from Breathing and Coughing Sounds using Deep Neural Networks [68.8204255655161]
We adapt an ensemble of Convolutional Neural Networks to classify whether a speaker is infected with COVID-19.
Ultimately, it achieves an Unweighted Average Recall (UAR) of 74.9%, or an Area Under the ROC Curve (AUC) of 80.7%, by ensembling neural networks; a minimal sketch of how these two metrics are computed follows this list.
arXiv Detail & Related papers (2020-12-29T01:14:17Z)
- COVID-19 Cough Classification using Machine Learning and Global Smartphone Recordings [6.441511459132334]
We present a machine-learning-based COVID-19 cough classifier able to discriminate COVID-19-positive coughs from both COVID-19-negative and healthy coughs recorded on a smartphone.
This type of screening is non-contact and easily applied, and could help reduce workload in testing centers as well as limit transmission.
arXiv Detail & Related papers (2020-12-02T13:35:42Z)
- COVID-19 Patient Detection from Telephone Quality Speech Data [4.726777092009554]
We investigate the presence of cues about COVID-19 in speech data.
An SVM classifier on this dataset is able to achieve an accuracy of 88.6% and an F1-Score of 92.7%.
Some phone classes, such as nasals, stops, and mid vowels can distinguish the two classes better than the others.
arXiv Detail & Related papers (2020-11-09T10:16:08Z)
- Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data [20.318414518283618]
We describe our data analysis over a large-scale crowdsourced dataset of respiratory sounds collected to aid diagnosis of COVID-19.
Our results show that even a simple binary machine learning classifier can correctly distinguish healthy and COVID-19 sounds.
This work opens the door to further investigation of how automatically analysed respiratory patterns could be used as pre-screening signals.
arXiv Detail & Related papers (2020-06-10T16:13:06Z)
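The CNN-ensemble entry above quotes both UAR and AUC, two metrics that are easy to conflate. UAR is the unweighted (macro) average of per-class recalls and needs hard predictions, while AUC is computed from continuous scores. The labels and scores below are invented purely to demonstrate the computation.

```python
# Minimal sketch of UAR (macro-averaged recall) and AUC on made-up predictions.
import numpy as np
from sklearn.metrics import recall_score, roc_auc_score

y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])                    # hypothetical ground truth
y_score = np.array([0.2, 0.4, 0.7, 0.8, 0.6, 0.9, 0.3, 0.1])   # hypothetical model scores
y_pred = (y_score >= 0.5).astype(int)                          # threshold for hard labels

uar = recall_score(y_true, y_pred, average="macro")  # mean of the two per-class recalls
auc = roc_auc_score(y_true, y_score)                 # threshold-free ranking metric
print(f"UAR = {uar:.3f}, AUC = {auc:.3f}")
```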