Description on IEEE ICME 2024 Grand Challenge: Semi-supervised Acoustic
Scene Classification under Domain Shift
- URL: http://arxiv.org/abs/2402.02694v2
- Date: Thu, 29 Feb 2024 02:35:54 GMT
- Title: Description on IEEE ICME 2024 Grand Challenge: Semi-supervised Acoustic
Scene Classification under Domain Shift
- Authors: Jisheng Bai, Mou Wang, Haohe Liu, Han Yin, Yafei Jia, Siwei Huang,
Yutong Du, Dongzhe Zhang, Dongyuan Shi, Woon-Seng Gan, Mark D. Plumbley,
Susanto Rahardja, Bin Xiang, Jianfeng Chen
- Abstract summary: Acoustic scene classification (ASC) is a crucial research problem in computational auditory scene analysis.
One of the challenges of the ASC task is the domain shift between training and testing data.
We introduce the task Semi-supervised Acoustic Scene Classification under Domain Shift in the ICME 2024 Grand Challenge.
- Score: 28.483681147793302
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Acoustic scene classification (ASC) is a crucial research problem in
computational auditory scene analysis, and it aims to recognize the unique
acoustic characteristics of an environment. One of the challenges of the ASC
task is the domain shift between training and testing data. Since 2018, ASC
challenges have focused on the generalization of ASC models across different
recording devices. Although substantial progress has been made on device
generalization in recent years, domain shift between different geographical
regions, involving discrepancies in time, space, culture, and language, remains
insufficiently explored. In addition, given the abundance of unlabeled acoustic
scene data in the real world, it is important to study ways to utilize this
unlabeled data. Therefore, we introduce the task Semi-supervised Acoustic
Scene Classification under Domain Shift in the ICME 2024 Grand Challenge. We
encourage participants to innovate with semi-supervised learning techniques,
aiming to develop more robust ASC models under domain shift.
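  For illustration only, the sketch below shows one common semi-supervised strategy, confidence-thresholded pseudo-labeling (in the spirit of FixMatch), applied to an acoustic scene classifier. It is not the official challenge baseline, and all names (the model, loaders, and augmentation views) are hypothetical placeholders.

```python
# Minimal pseudo-labeling training step for semi-supervised ASC (illustrative
# sketch only; not the ICME 2024 challenge baseline). Assumes PyTorch and a
# hypothetical scene classifier mapping log-mel spectrograms to class logits.
import torch
import torch.nn.functional as F

def semi_supervised_step(model, optimizer, labeled_batch, unlabeled_batch,
                         conf_threshold=0.95, unlabeled_weight=1.0):
    model.train()
    x_l, y_l = labeled_batch                 # labeled clips and scene labels
    x_u_weak, x_u_strong = unlabeled_batch   # two augmented views of unlabeled clips

    # Supervised cross-entropy on the labeled (source-domain) data.
    loss_sup = F.cross_entropy(model(x_l), y_l)

    # Pseudo-labels from the weakly augmented view; no gradients here.
    with torch.no_grad():
        probs = torch.softmax(model(x_u_weak), dim=-1)
        conf, pseudo_y = probs.max(dim=-1)
        mask = (conf >= conf_threshold).float()  # keep only confident predictions

    # Consistency loss: the strongly augmented view should agree with the pseudo-label.
    per_clip = F.cross_entropy(model(x_u_strong), pseudo_y, reduction="none")
    loss_unsup = (per_clip * mask).mean()

    loss = loss_sup + unlabeled_weight * loss_unsup
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

  The confidence threshold is what makes such a scheme plausible under domain shift: pseudo-labels for clips recorded in unseen regions are used only when the model is already confident, which limits the propagation of noisy labels from the shifted domain.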
Related papers
- Frequency-based Matcher for Long-tailed Semantic Segmentation [22.199174076366003]
We focus on a relatively under-explored task setting, long-tailed semantic segmentation (LTSS).
We propose a dual-metric evaluation system and construct the LTSS benchmark to demonstrate the performance of semantic segmentation methods and long-tailed solutions.
We also propose a transformer-based algorithm to improve LTSS, frequency-based matcher, which solves the oversuppression problem by one-to-many matching.
arXiv Detail & Related papers (2024-06-06T09:57:56Z)
- Multi-Label Continual Learning for the Medical Domain: A Novel Benchmark [47.52603262576663]
We propose a novel benchmark combining the challenges of new class arrivals and domain shifts in a single framework.
This benchmark aims to model a realistic CL setting for the multi-label classification problem in medical imaging.
arXiv Detail & Related papers (2024-04-10T09:35:36Z)
- FRCSyn Challenge at WACV 2024: Face Recognition Challenge in the Era of Synthetic Data [82.5767720132393]
This paper offers an overview of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn) organized at WACV 2024.
This is the first international challenge aiming to explore the use of synthetic data in face recognition to address existing limitations in the technology.
arXiv Detail & Related papers (2023-11-17T12:15:40Z)
- The Robust Semantic Segmentation UNCV2023 Challenge Results [99.97867942388486]
This paper outlines the winning solutions employed in addressing the MUAD uncertainty quantification challenge held at ICCV 2023.
The challenge was centered around semantic segmentation in urban environments, with a particular focus on natural adversarial scenarios.
The report presents the results of 19 submitted entries, with numerous techniques drawing inspiration from cutting-edge uncertainty quantification methodologies.
arXiv Detail & Related papers (2023-09-27T08:20:03Z)
- Robust, General, and Low Complexity Acoustic Scene Classification Systems and An Effective Visualization for Presenting a Sound Scene Context [53.80051967863102]
We present a comprehensive analysis of Acoustic Scene Classification (ASC).
We propose an inception-based and low-footprint ASC model, referred to as the ASC baseline.
Next, we improve the ASC baseline by proposing a novel deep neural network architecture.
arXiv Detail & Related papers (2022-10-16T19:07:21Z)
- Few-shot bioacoustic event detection at the DCASE 2022 challenge [0.0]
Few-shot sound event detection is the task of detecting sound events despite having only a few labelled examples.
This paper presents an overview of the second edition of the few-shot bioacoustic sound event detection task included in the DCASE 2022 challenge.
The highest F-score was 60% on the evaluation set, a large improvement over last year's edition.
arXiv Detail & Related papers (2022-07-14T09:33:47Z)
- Wider or Deeper Neural Network Architecture for Acoustic Scene Classification with Mismatched Recording Devices [59.86658316440461]
We present a robust and low-complexity system for Acoustic Scene Classification (ASC).
We first construct an ASC baseline system in which a novel inception-residual-based network architecture is proposed to deal with the mismatched recording device issue.
To further improve the performance but still satisfy the low complexity model, we apply two techniques: ensemble of multiple spectrograms and channel reduction.
arXiv Detail & Related papers (2022-03-23T10:27:41Z)
- Description and Discussion on DCASE 2021 Challenge Task 2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring under Domain Shifted Conditions [37.68195595947483]
This task focuses on domain shift, an inevitable problem in the practical use of anomalous sound detection (ASD) systems.
The main challenge of this task is to detect unknown anomalous sounds where the acoustic characteristics of the training and testing samples are different.
arXiv Detail & Related papers (2021-06-08T16:26:10Z)
- Cross-domain Adaptation with Discrepancy Minimization for Text-independent Forensic Speaker Verification [61.54074498090374]
This study introduces a CRSS-Forensics audio dataset collected in multiple acoustic environments.
We pre-train a CNN-based network using the VoxCeleb data, then fine-tune part of the high-level network layers with clean speech from CRSS-Forensics.
arXiv Detail & Related papers (2020-09-05T02:54:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.