DD-CNN: Depthwise Disout Convolutional Neural Network for Low-complexity
Acoustic Scene Classification
- URL: http://arxiv.org/abs/2007.12864v1
- Date: Sat, 25 Jul 2020 06:02:20 GMT
- Title: DD-CNN: Depthwise Disout Convolutional Neural Network for Low-complexity
Acoustic Scene Classification
- Authors: Jingqiao Zhao, Zhen-Hua Feng, Qiuqiang Kong, Xiaoning Song, Xiao-Jun
Wu
- Abstract summary: This paper presents a Depthwise Disout Convolutional Neural Network (DD-CNN) for the detection and classification of urban acoustic scenes.
We use log-mel as feature representations of acoustic signals for the inputs of our network.
In the proposed DD-CNN, depthwise separable convolution is used to reduce the network complexity.
- Score: 29.343805468175965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a Depthwise Disout Convolutional Neural Network (DD-CNN)
for the detection and classification of urban acoustic scenes. Specifically, we
use log-mel as feature representations of acoustic signals for the inputs of
our network. In the proposed DD-CNN, depthwise separable convolution is used to
reduce the network complexity. In addition, SpecAugment and Disout are used to
further boost performance. Experimental results demonstrate that our DD-CNN
can learn discriminative acoustic characteristics from audio fragments and
effectively reduce the network complexity. Our DD-CNN was used for the
low-complexity acoustic scene classification task of the DCASE2020 Challenge,
where it achieved 92.04% accuracy on the validation set.
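The complexity reduction attributed to depthwise separable convolution can be illustrated by counting weights. The sketch below is a minimal illustration, not the authors' implementation: the function names and the layer sizes (64 input channels, 128 output channels, 3x3 kernel) are assumptions chosen for the example, and bias terms are omitted.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv (bias omitted)."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes chosen for illustration; not taken from the paper.
c_in, c_out, k = 64, 128, 3
standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
print(f"standard: {standard}, separable: {separable}, "
      f"ratio: {standard / separable:.2f}")
```

For a fixed kernel size, the saving grows with the number of output channels, since the ratio of the two counts simplifies to c_out * k^2 / (c_out + k^2).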
Related papers
- Noise Adaptor: Enhancing Low-Latency Spiking Neural Networks through Noise-Injected Low-Bit ANN Conversion [3.8674054882510065]
Noise Adaptor is a novel method for constructing competitive low-latency spiking neural networks (SNNs).
By injecting noise during quantized ANN training, Noise Adaptor better accounts for the dynamic differences between ANNs and SNNs.
Unlike previous methods, Noise Adaptor does not require the application of run-time noise correction techniques in SNNs.
arXiv Detail & Related papers (2024-11-26T13:39:52Z)
- TBSN: Transformer-Based Blind-Spot Network for Self-Supervised Image Denoising [94.09442506816724]
Blind-spot networks (BSN) have been prevalent network architectures in self-supervised image denoising (SSID).
We present a transformer-based blind-spot network (TBSN) by analyzing and redesigning the transformer operators that meet the blind-spot requirement.
For spatial self-attention, an elaborate mask is applied to the attention matrix to restrict its receptive field, thus mimicking the dilated convolution.
For channel self-attention, we observe that it may leak the blind-spot information when the channel number is greater than spatial size in the deep layers of multi-scale architectures.
arXiv Detail & Related papers (2024-04-11T15:39:10Z)
- sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection
with Spiking Neural Networks [51.516451451719654]
Spiking Neural Networks (SNNs) are known to be biologically plausible and power-efficient.
This paper introduces a novel SNN-based Voice Activity Detection model, referred to as sVAD.
It provides effective auditory feature representation through SincNet and 1D convolution, and improves noise robustness with attention mechanisms.
arXiv Detail & Related papers (2024-03-09T02:55:44Z)
- Hopfield-Enhanced Deep Neural Networks for Artifact-Resilient Brain
State Decoding [0.0]
We propose a two-stage computational framework combining Hopfield Networks for artifact data preprocessing with Convolutional Neural Networks (CNNs) for classification of brain states in rat neural recordings under different levels of anesthesia.
Performance across various levels of data compression and noise intensities showed that our framework can effectively mitigate artifacts, allowing the model to reach parity with the clean-data CNN at lower noise levels.
arXiv Detail & Related papers (2023-11-06T15:08:13Z)
- Spiking Neural Network Decision Feedback Equalization [70.3497683558609]
We propose an SNN-based equalizer with a feedback structure akin to the decision feedback equalizer (DFE).
We show that our approach clearly outperforms conventional linear equalizers for three different exemplary channels.
The proposed SNN with a decision feedback structure enables the path to competitive energy-efficient transceivers.
arXiv Detail & Related papers (2022-11-09T09:19:15Z)
- Noise Injection as a Probe of Deep Learning Dynamics [0.0]
We propose a new method to probe the learning mechanism of Deep Neural Networks (DNN) by perturbing the system using Noise Injection Nodes (NINs).
We find that the system displays distinct phases during training, dictated by the scale of injected noise.
In some cases, the evolution of the noise nodes is similar to that of the unperturbed loss, thus indicating the possibility of using NINs to learn more about the full system in the future.
arXiv Detail & Related papers (2022-10-24T20:51:59Z)
- Training High-Performance Low-Latency Spiking Neural Networks by
Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
It is a challenge to efficiently train SNNs due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which could achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
- Noise Sensitivity-Based Energy Efficient and Robust Adversary Detection
in Neural Networks [3.125321230840342]
Adversarial examples are inputs that have been carefully perturbed to fool classifier networks, while appearing unchanged to humans.
We propose a structured methodology of augmenting a deep neural network (DNN) with a detector subnetwork.
We show that our method improves state-of-the-art detector robustness against adversarial examples.
arXiv Detail & Related papers (2021-01-05T14:31:53Z)
- Deep Networks for Direction-of-Arrival Estimation in Low SNR [89.45026632977456]
We introduce a Convolutional Neural Network (CNN) that is trained from multi-channel data of the true array manifold matrix.
We train a CNN in the low-SNR regime to predict DoAs across all SNRs.
Our robust solution can be applied in several fields, ranging from wireless array sensors to acoustic microphones or sonars.
arXiv Detail & Related papers (2020-11-17T12:52:18Z)
- Attention Driven Fusion for Multi-Modal Emotion Recognition [39.295892047505816]
We present a deep learning-based approach to exploit and fuse text and acoustic data for emotion classification.
We use a SincNet layer, based on parameterized sinc functions with band-pass filters, to extract acoustic features from raw audio followed by a DCNN.
For text processing, we use two branches (a DCNN and a Bi-direction RNN followed by a DCNN) in parallel where cross attention is introduced to infer the N-gram level correlations.
arXiv Detail & Related papers (2020-09-23T08:07:58Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.