Low-Complexity Models for Acoustic Scene Classification Based on
Receptive Field Regularization and Frequency Damping
- URL: http://arxiv.org/abs/2011.02955v1
- Date: Thu, 5 Nov 2020 16:34:11 GMT
- Title: Low-Complexity Models for Acoustic Scene Classification Based on
Receptive Field Regularization and Frequency Damping
- Authors: Khaled Koutini, Florian Henkel, Hamid Eghbal-zadeh, Gerhard Widmer
- Abstract summary: We investigate and compare several well-known methods to reduce the number of parameters in neural networks.
We show that we can achieve high-performing low-complexity models by applying specific restrictions on the Receptive Field.
We propose a filter-damping technique for regularizing the RF of models, without altering their architecture.
- Score: 7.0349768355860895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Neural Networks are known to be very demanding in terms of computing and
memory requirements. Due to the ever-increasing use of embedded systems and
mobile devices with a limited resource budget, designing low-complexity models
without sacrificing too much of their predictive performance has gained great
importance. In this work, we investigate and compare several well-known methods
to reduce the number of parameters in neural networks. We further put these
into the context of a recent study on the effect of the Receptive Field (RF) on
a model's performance, and empirically show that we can achieve high-performing
low-complexity models by applying specific restrictions on the RFs, in
combination with parameter reduction methods. Additionally, we propose a
filter-damping technique for regularizing the RF of models, without altering
their architecture or changing their parameter counts. We show that
incorporating this technique improves the performance in various low-complexity
settings such as pruning and decomposed convolution. Using our proposed filter
damping, we achieved the 1st rank at the DCASE-2020 Challenge in the task of
Low-Complexity Acoustic Scene Classification.
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
- Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling [2.91204440475204]
Diffusion Probabilistic Models (DPMs) have emerged as a powerful class of deep generative models.
They rely on sequential denoising steps during sample generation.
We propose a novel method that integrates denoising phases directly into the model's architecture.
arXiv Detail & Related papers (2024-05-31T08:19:44Z)
- Edge-Efficient Deep Learning Models for Automatic Modulation Classification: A Performance Analysis [0.7428236410246183]
We investigate optimized convolutional neural networks (CNNs) developed for automatic modulation classification (AMC) of wireless signals.
We propose optimized models with the combinations of these techniques to fuse the complementary optimization benefits.
The experimental results show that the proposed individual and combined optimization techniques are highly effective for developing models with significantly less complexity.
arXiv Detail & Related papers (2024-04-11T06:08:23Z)
- DDMI: Domain-Agnostic Latent Diffusion Models for Synthesizing High-Quality Implicit Neural Representations [13.357094648241839]
Domain-agnostic Latent Diffusion Model for INRs generates adaptive positional embeddings instead of neural networks' weights.
We develop a decomposed-to-continuous space Variational AutoEncoder (D2C-VAE), which seamlessly connects discrete data and the continuous signal functions.
Experiments across four modalities, e.g., 2D images, 3D shapes, Neural Radiance Fields, and videos, with seven benchmark datasets, demonstrate the versatility of DDMI.
arXiv Detail & Related papers (2024-01-23T06:21:34Z)
- Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters [69.24377241408851]
Overfitting to the source domain is a common issue in gradient-based training of deep neural networks.
We propose to base the selection on gradient-signal-to-noise ratio (GSNR) of network's parameters.
arXiv Detail & Related papers (2023-10-11T10:21:34Z)
- Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs), each have limitations: GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations.
We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
arXiv Detail & Related papers (2023-04-22T15:32:59Z)
- Phantom Embeddings: Using Embedding Space for Model Regularization in Deep Neural Networks [12.293294756969477]
The strength of machine learning models stems from their ability to learn complex function approximations from data.
The complex models tend to memorize the training data, which results in poor regularization performance on test data.
We present a novel approach to regularize the models by leveraging the information-rich latent embeddings and their high intra-class correlation.
arXiv Detail & Related papers (2023-04-14T17:15:54Z)
- Revisit Geophysical Imaging in A New View of Physics-informed Generative Adversarial Learning [2.12121796606941]
Full waveform inversion produces high-resolution subsurface models.
FWI with least-squares function suffers from many drawbacks such as the local-minima problem.
Recent works relying on partial differential equations and neural networks show promising performance for two-dimensional FWI.
We propose an unsupervised learning paradigm that integrates wave equation with a discriminate network to accurately estimate the physically consistent models.
arXiv Detail & Related papers (2021-09-23T15:54:40Z)
- Deep Variational Models for Collaborative Filtering-based Recommender Systems [63.995130144110156]
Deep learning provides accurate collaborative filtering models to improve recommender system results.
Our proposed models apply the variational concept to inject stochasticity in the latent space of the deep architecture.
Results show the superiority of the proposed approach in scenarios where the variational enrichment exceeds the injected noise effect.
arXiv Detail & Related papers (2021-07-27T08:59:39Z)
- Rate Distortion Characteristic Modeling for Neural Image Compression [59.25700168404325]
End-to-end optimization capability offers neural image compression (NIC) superior lossy compression performance.
Distinct models must be trained to reach different points in the rate-distortion (R-D) space.
We make efforts to formulate the essential mathematical functions to describe the R-D behavior of NIC using deep network and statistical modeling.
arXiv Detail & Related papers (2021-06-24T12:23:05Z)