Robust, General, and Low Complexity Acoustic Scene Classification
Systems and An Effective Visualization for Presenting a Sound Scene Context
- URL: http://arxiv.org/abs/2210.08610v1
- Date: Sun, 16 Oct 2022 19:07:21 GMT
- Title: Robust, General, and Low Complexity Acoustic Scene Classification
Systems and An Effective Visualization for Presenting a Sound Scene Context
- Authors: Lam Pham, Dusan Salovic, Anahid Jalali, Alexander Schindler, Khoa
Tran, Canh Vu, Phu X. Nguyen
- Abstract summary: We present a comprehensive analysis of Acoustic Scene Classification (ASC).
We propose an inception-based, low-footprint ASC model, referred to as the ASC baseline.
Next, we improve the ASC baseline by proposing a novel deep neural network architecture.
- Score: 53.80051967863102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a comprehensive analysis of Acoustic Scene
Classification (ASC), the task of identifying the scene of an audio recording
from its acoustic signature. In particular, we first propose an
inception-based, low-footprint ASC model, referred to as the ASC baseline.
The proposed ASC baseline is then compared with benchmark and high-complexity
network architectures of MobileNetV1, MobileNetV2, VGG16, VGG19, ResNet50V2,
ResNet152V2, DenseNet121, DenseNet201, and Xception. Next, we improve the ASC
baseline by proposing a novel deep neural network architecture which leverages
residual-inception architectures and multiple kernels. Given the novel
residual-inception (NRI) model, we further evaluate the trade-off between
model complexity and model accuracy. Finally, we evaluate
whether sound events occurring in a sound scene recording can help to improve
ASC accuracy, and then show how a sound scene context can be presented well by
combining both sound scene and sound event information. We conduct extensive
experiments on various ASC datasets, including Crowded Scenes, IEEE AASP
Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE)
2018 Task 1A and 1B, 2019 Task 1A and 1B, 2020 Task 1A, 2021 Task 1A, 2022 Task
1. The experimental results on several different ASC challenges highlight two
main achievements: first, robust, general, and low-complexity ASC systems
that are suitable for real-life applications on a wide range of edge and
mobile devices; second, an effective visualization method for
comprehensively presenting a sound scene context.
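The abstract's trade-off between model complexity and accuracy depends partly on how many parameters a multi-kernel, residual-inception style block carries. The following is a minimal sketch of that arithmetic only. The branch layout, kernel shapes, channel widths, and function names are illustrative assumptions and are not taken from the paper. It uses the standard convolution parameter count (kernel height x kernel width x input channels x output channels, plus one bias per output channel).

```python
# Hypothetical sketch: parameter count and weight-storage footprint of a
# multi-kernel (inception-style) block with a residual shortcut.
# All shapes and names here are illustrative assumptions, not the paper's design.


def conv2d_params(c_in, c_out, kernel, bias=True):
    """Number of trainable parameters in one 2-D convolution layer."""
    k_h, k_w = kernel
    return k_h * k_w * c_in * c_out + (c_out if bias else 0)


def multi_kernel_block_params(c_in, c_branch, kernels, residual=True):
    """Parameters of parallel conv branches whose outputs are concatenated.

    If residual is True and the concatenated width differs from c_in,
    a 1x1 projection convolution on the shortcut path is counted too.
    """
    total = sum(conv2d_params(c_in, c_branch, k) for k in kernels)
    c_out = c_branch * len(kernels)
    if residual and c_out != c_in:
        total += conv2d_params(c_in, c_out, (1, 1))
    return total


def footprint_kb(n_params, bits_per_param):
    """Storage for the weights alone, in kilobytes (1 KB = 1024 bytes)."""
    return n_params * bits_per_param / 8 / 1024


if __name__ == "__main__":
    # Assumed branch kernels: one pointwise, one square, two elongated.
    kernels = [(1, 1), (3, 3), (1, 4), (4, 1)]
    n = multi_kernel_block_params(c_in=32, c_branch=16, kernels=kernels)
    print("parameters:", n)
    print("float32 footprint (KB):", footprint_kb(n, 32))
    print("int8 footprint (KB):", footprint_kb(n, 8))
```

Under these assumptions, widening the branches or adding kernels raises the count roughly in proportion to the kernel area, while quantizing from 32-bit to 8-bit weights cuts the stored size by a factor of four without changing the parameter count.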
Related papers
- Description on IEEE ICME 2024 Grand Challenge: Semi-supervised Acoustic
Scene Classification under Domain Shift [28.483681147793302]
Acoustic scene classification (ASC) is a crucial research problem in computational auditory scene analysis.
One of the challenges of the ASC task is the domain shift between training and testing data.
We introduce the task Semi-supervised Acoustic Scene Classification under Domain Shift in the ICME 2024 Grand Challenge.
arXiv Detail & Related papers (2024-02-05T03:12:51Z)
- Wider or Deeper Neural Network Architecture for Acoustic Scene
Classification with Mismatched Recording Devices [59.86658316440461]
We present a robust and low-complexity system for Acoustic Scene Classification (ASC).
We first construct an ASC baseline system in which a novel inception-residual-based network architecture is proposed to deal with the mismatched recording device issue.
To further improve performance while keeping the model low-complexity, we apply two techniques: an ensemble of multiple spectrograms and channel reduction.
arXiv Detail & Related papers (2022-03-23T10:27:41Z)
- A study on joint modeling and data augmentation of multi-modalities for
audio-visual scene classification [64.59834310846516]
We propose two techniques, namely joint modeling and data augmentation, to improve system performance for audio-visual scene classification (AVSC).
Our final system can achieve the best accuracy of 94.2% among all single AVSC systems submitted to DCASE 2021 Task 1b.
arXiv Detail & Related papers (2022-03-07T07:29:55Z)
- A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust
Neural Acoustic Scene Classification [78.04177357888284]
We propose a novel neural model compression strategy combining data augmentation, knowledge transfer, pruning, and quantization for device-robust acoustic scene classification (ASC).
We report an efficient joint framework for low-complexity multi-device ASC, called Acoustic Lottery.
arXiv Detail & Related papers (2021-07-03T16:25:24Z)
- A Two-Stage Approach to Device-Robust Acoustic Scene Classification [63.98724740606457]
A two-stage system based on fully convolutional neural networks (CNNs) is proposed to improve device robustness.
Our results show that the proposed ASC system attains a state-of-the-art accuracy on the development set.
Neural saliency analysis with class activation mapping gives new insights on the patterns learnt by our models.
arXiv Detail & Related papers (2020-11-03T03:27:18Z)
- Device-Robust Acoustic Scene Classification Based on Two-Stage
Categorization and Data Augmentation [63.98724740606457]
We present a joint effort of four groups, namely GT, USTC, Tencent, and UKE, to tackle Task 1 - Acoustic Scene Classification (ASC) in the DCASE 2020 Challenge.
Task 1a focuses on ASC of audio signals recorded with multiple (real and simulated) devices into ten different fine-grained classes.
Task 1b concerns the classification of data into three higher-level classes using low-complexity solutions.
arXiv Detail & Related papers (2020-07-16T15:07:14Z)
- Multi-Task Network for Noise-Robust Keyword Spotting and Speaker
Verification using CTC-based Soft VAD and Global Query Attention [13.883985850789443]
Keyword spotting (KWS) and speaker verification (SV) have been studied independently, but the acoustic and speaker domains are complementary.
We propose a multi-task network that performs KWS and SV simultaneously to fully utilize the interrelated domain information.
arXiv Detail & Related papers (2020-05-08T05:58:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.