The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19
Cough, COVID-19 Speech, Escalation & Primates
- URL: http://arxiv.org/abs/2102.13468v1
- Date: Wed, 24 Feb 2021 21:39:59 GMT
- Title: The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19
Cough, COVID-19 Speech, Escalation & Primates
- Authors: Björn W. Schuller, Anton Batliner, Christian Bergler, Cecilia
Mascolo, Jing Han, Iulia Lefter, Heysem Kaya, Shahin Amiriparian, Alice
Baird, Lukas Stappen, Sandra Ottl, Maurice Gerczuk, Panagiotis Tzirakis,
Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat,
Dimitris Spathis, Tong Xia, Pietro Cicuta, Leon J. M. Rothkrantz, Joeri
Zwerts, Jelle Treep, Casper Kaandorp
- Abstract summary: The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four different problems for the first time.
In the COVID-19 Cough and COVID-19 Speech Sub-Challenges, a binary classification on COVID-19 infection has to be made based on coughing sounds and speech.
In the Escalation Sub-Challenge, a three-way assessment of the level of escalation in a dialogue is featured.
- Score: 34.39118619224786
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four
different problems for the first time in a research competition under
well-defined conditions: In the COVID-19 Cough and COVID-19 Speech
Sub-Challenges, a binary classification on COVID-19 infection has to be made
based on coughing sounds and speech; in the Escalation Sub-Challenge, a
three-way assessment of the level of escalation in a dialogue is featured; and
in the Primates Sub-Challenge, four species vs background need to be
classified. We describe the Sub-Challenges, baseline feature extraction, and
classifiers based on the 'usual' ComParE and BoAW features, as well as deep
unsupervised representation learning using the auDeep toolkit, and deep feature
extraction from pre-trained CNNs using the DeepSpectrum toolkit; in addition,
we add deep end-to-end sequential modelling and, in part, linguistic analysis.
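The BoAW (bag-of-audio-words) baseline quantises frame-level acoustic descriptors against a learned codebook and represents each recording as a fixed-length histogram of codeword counts. A minimal sketch of that idea in plain NumPy, with synthetic stand-in features and a pre-learned random codebook in place of the challenge's actual toolkit; the feature dimension and codebook size here are illustrative, not the challenge settings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for frame-level low-level descriptors (LLDs),
# e.g. MFCCs extracted per frame; shapes are illustrative only.
codebook = rng.normal(size=(16, 13))      # 16 codewords, assumed learned beforehand (e.g. via k-means)
clip_frames = rng.normal(size=(120, 13))  # LLD frames of one recording

# Assign each frame to its nearest codeword (Euclidean distance).
dists = np.linalg.norm(clip_frames[:, None, :] - codebook[None, :, :], axis=-1)
assignments = dists.argmin(axis=1)

# Represent the clip as a normalised histogram of codeword counts;
# this fixed-length vector then feeds a standard classifier.
boaw = np.bincount(assignments, minlength=16).astype(float)
boaw /= boaw.sum()

print(boaw.shape)  # (16,)
```

Recordings of any duration map to vectors of the same length, which is what makes the representation usable with conventional classifiers such as linear SVMs.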
Related papers
- System Description for the Displace Speaker Diarization Challenge 2023 [0.0]
This paper describes our solution for the Diarization of Speaker and Language in Conversational Environments Challenge (DISPLACE 2023).
We used a combination of VAD to find segments containing speech, a ResNet-based CNN for feature extraction from these segments, and spectral clustering to cluster the features.
arXiv Detail & Related papers (2024-06-20T21:40:02Z)
- AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension [95.8442896569132]
We introduce AIR-Bench, the first benchmark to evaluate the ability of Large Audio-Language Models (LALMs) to understand various types of audio signals and interact with humans in textual form.
Results demonstrate a high level of consistency between GPT-4-based evaluation and human evaluation.
arXiv Detail & Related papers (2024-02-12T15:41:22Z)
- The ACM Multimedia 2023 Computational Paralinguistics Challenge: Emotion Share & Requests [66.24715220997547]
The ACM Multimedia 2023 Paralinguistics Challenge addresses two different problems for the first time under well-defined conditions.
In the Emotion Share Sub-Challenge, a regression on speech has to be made; and in the Requests Sub-Challenge, requests and complaints need to be detected.
We describe the Sub-Challenges, baseline feature extraction, and classifiers based on the usual ComParE features, the auDeep toolkit, and deep feature extraction from pre-trained CNNs using the DeepSpectrum toolkit.
arXiv Detail & Related papers (2023-04-28T14:42:55Z)
- Low-complexity deep learning frameworks for acoustic scene classification [64.22762153453175]
We present low-complexity deep learning frameworks for acoustic scene classification (ASC)
The proposed frameworks can be separated into four main steps: Front-end spectrogram extraction, online data augmentation, back-end classification, and late fusion of predicted probabilities.
Our experiments, conducted on the DCASE 2022 Task 1 Development dataset, fulfilled the low-complexity requirement and achieved a best classification accuracy of 60.1%.
arXiv Detail & Related papers (2022-06-13T11:41:39Z)
- The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes [9.09787422797708]
The ACM Multimedia 2022 Computational Paralinguistics Challenge addresses four different problems.
In the Vocalisations and Stuttering Sub-Challenges, a classification on human non-verbal vocalisations and speech has to be made.
The Activity Sub-Challenge aims at beyond-audio human activity recognition from smartwatch sensor data.
In the Mosquitoes Sub-Challenge, mosquitoes need to be detected.
arXiv Detail & Related papers (2022-05-13T17:51:45Z)
- Evaluating the COVID-19 Identification ResNet (CIdeR) on the INTERSPEECH COVID-19 from Audio Challenges [59.78485839636553]
CIdeR is an end-to-end deep learning neural network originally designed to classify whether an individual is COVID-positive or COVID-negative.
We demonstrate the potential of CIdeR for binary COVID-19 diagnosis on both INTERSPEECH 2021 COVID-19 cough and speech challenges, ComParE and DiCOVA.
arXiv Detail & Related papers (2021-07-30T10:59:08Z)
- End-2-End COVID-19 Detection from Breath & Cough Audio [68.41471917650571]
We demonstrate the first attempt to diagnose COVID-19 using end-to-end deep learning from a crowd-sourced dataset of audio samples.
We introduce a novel modelling strategy using a custom deep neural network to diagnose COVID-19 from a joint breath and cough representation.
arXiv Detail & Related papers (2021-01-07T01:13:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.