Diet deep generative audio models with structured lottery
- URL: http://arxiv.org/abs/2007.16170v1
- Date: Fri, 31 Jul 2020 16:43:10 GMT
- Title: Diet deep generative audio models with structured lottery
- Authors: Philippe Esling, Ninon Devis, Adrien Bitton, Antoine Caillon, Axel
Chemla--Romeu-Santos, Constance Douwes
- Abstract summary: We study the lottery ticket hypothesis on deep generative audio models.
We show that we can remove up to 95% of the model weights without significant degradation in accuracy.
We discuss the possibility of implementing deep generative audio models on embedded platforms.
- Score: 2.348805691644086
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep learning models have provided extremely successful solutions in most
audio application fields. However, the high accuracy of these models comes at
the expense of a tremendous computation cost. This cost is almost always
overlooked in evaluating the quality of proposed models, yet models should not
be evaluated without taking their complexity into account. This aspect is
especially critical in audio applications, which heavily rely on specialized
embedded hardware with real-time constraints. In this paper, we build on recent
observations that deep models are highly overparameterized, by studying the
lottery ticket hypothesis on deep generative audio models. This hypothesis
states that extremely efficient small sub-networks exist in deep models and
would provide higher accuracy than larger models if trained in isolation.
However, lottery tickets are found by relying on unstructured masking, which
means that resulting models do not provide any gain in either disk size or
inference time. Instead, we develop here a method aimed at performing
structured trimming. We show that this requires relying on global selection,
and we introduce a specific criterion based on mutual information. First, we
confirm
the surprising result that smaller models provide higher accuracy than their
large counterparts. We further show that we can remove up to 95% of the model
weights without significant degradation in accuracy. Hence, we can obtain very
light models for generative audio across popular methods such as WaveNet, SING,
or DDSP, which are up to 100 times smaller with commensurate accuracy. We study
the theoretical bounds for embedding these models on Raspberry Pi and Arduino,
and show that we can obtain generative models on CPU with quality equivalent to
that of large GPU models. Finally, we discuss the possibility of implementing
deep
generative audio models on embedded platforms.
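The key distinction in the abstract (unstructured masks zero individual weights but keep each matrix's shape, so only structured removal of whole units shrinks disk size and inference time) can be illustrated with a short sketch. The following is a minimal NumPy illustration, not the paper's implementation: it scores every unit of a toy dense network with a stand-in saliency (activation variance, used here in place of the paper's mutual-information criterion), ranks all units globally across layers, and trims the lowest-scoring ones as whole rows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "network": dense weight matrices; each row is one output unit.
layers = [rng.normal(size=(64, 32)),
          rng.normal(size=(48, 64)),
          rng.normal(size=(16, 48))]

def unit_saliency(weights, activations):
    """Stand-in criterion: score each unit by the variance of its activation
    over a probe batch (the paper uses a mutual-information criterion)."""
    return np.tanh(activations @ weights.T).var(axis=0)

# Score every unit in every layer against one probe batch.
x = rng.normal(size=(256, 32))
scores = []
for w in layers:
    scores.append(unit_saliency(w, x))
    x = np.tanh(x @ w.T)

# Global selection: rank all units on one common scale, then trim the
# bottom 95%. An unstructured lottery-ticket mask would instead zero
# individual weights in place, leaving matrix shapes (and disk size) intact.
threshold = np.quantile(np.concatenate(scores), 0.95)
keep = [s >= threshold for s in scores]
keep[-1][:] = True                      # never trim the output layer
for s, k in zip(scores, keep):
    k[np.argmax(s)] = True              # never empty a layer entirely

# Structured trimming: drop rows of layer i and the matching columns of
# layer i + 1, so every remaining matrix actually shrinks.
trimmed, in_keep = [], np.ones(layers[0].shape[1], dtype=bool)
for w, k in zip(layers, keep):
    trimmed.append(w[k][:, in_keep])
    in_keep = k

for i, (w, t) in enumerate(zip(layers, trimmed)):
    print(f"layer {i}: {w.shape} -> {t.shape}")
```

Retraining between trimming rounds, as in the usual lottery-ticket procedure, is omitted here for brevity.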
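The feasibility argument for embedded targets reduces to comparing a trimmed model's weight footprint against a device's memory budget. Below is a back-of-the-envelope sketch of that bound; the device capacities and parameter counts are illustrative assumptions, not figures from the paper.

```python
BYTES_PER_PARAM = {"float32": 4, "int8": 1}

# Assumed memory budgets in KB (typical published specs, not paper figures).
DEVICES_KB = {
    "Arduino Uno (SRAM)": 2,
    "Arduino Due (SRAM)": 96,
    "Raspberry Pi 3 (RAM)": 1024 * 1024,
}

def footprint_kb(n_params: int, dtype: str = "float32") -> float:
    """Raw weight storage only; activations and code size are ignored."""
    return n_params * BYTES_PER_PARAM[dtype] / 1024

# A hypothetical 1M-parameter model, before and after a 95% trim.
for n_params in (1_000_000, 50_000):
    for dtype in BYTES_PER_PARAM:
        kb = footprint_kb(n_params, dtype)
        fits = [d for d, cap in DEVICES_KB.items() if kb <= cap]
        print(f"{n_params:>9,} params @ {dtype}: {kb:>8,.1f} KB "
              f"-> fits: {', '.join(fits) or 'none'}")
```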
Related papers
- Large Language Model Pruning [0.0]
We suggest a model pruning technique specifically focused on LLMs.
The proposed methodology emphasizes the explainability of deep learning models.
We also explore the difference between pruning on large-scale models vs. pruning on small-scale models.
arXiv Detail & Related papers (2024-05-24T18:22:15Z)
- SlimmeRF: Slimmable Radiance Fields [4.743863123290521]
We present SlimmeRF, a model that allows for instant test-time trade-offs between model size and accuracy through slimming.
We also observe that our model allows for more effective trade-offs in sparse-view scenarios, at times even achieving higher accuracy after being slimmed.
arXiv Detail & Related papers (2023-12-15T18:59:55Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Part-Based Models Improve Adversarial Robustness [57.699029966800644]
We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks.
Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to simultaneously segment objects into parts.
Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations.
arXiv Detail & Related papers (2022-09-15T15:41:47Z)
- Predicting on the Edge: Identifying Where a Larger Model Does Better [61.793778186198864]
We show that large models have the largest improvement on examples where the small model is most uncertain.
We show that a switcher model which defers examples to a larger model when a small model is uncertain can achieve striking improvements in performance and resource usage.
arXiv Detail & Related papers (2022-02-15T18:53:14Z)
- Load-balanced Gather-scatter Patterns for Sparse Deep Neural Networks [20.374784902476318]
Pruning, as a method to introduce zeros to model weights, has been shown to be an effective way to provide good trade-offs between model accuracy and computation efficiency.
Some modern processors are equipped with fast on-chip scratchpad memories and gather/scatter engines that perform indirect load and store operations on such memories.
In this work, we propose a set of novel sparse patterns, named gather-scatter (GS) patterns, to utilize the scratchpad memories and gather/scatter engines to speed up neural network inferences.
arXiv Detail & Related papers (2021-12-20T22:55:45Z)
- Exploring Sparse Expert Models and Beyond [51.90860155810848]
Mixture-of-Experts (MoE) models can achieve promising results with an outrageously large number of parameters but constant computation cost.
We propose a simple method called expert prototyping that splits experts into different prototypes and applies $k$ top-$1$ routing.
This strategy improves model quality while maintaining constant computational cost, and our further exploration of extremely large-scale models shows that it is more effective for training larger models.
arXiv Detail & Related papers (2021-05-31T16:12:44Z)
- ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
- Real-time Denoising and Dereverberation with Tiny Recurrent U-Net [12.533488149023025]
We propose Tiny Recurrent U-Net (TRU-Net), a lightweight online inference model that matches the performance of current state-of-the-art models.
The size of the quantized version of TRU-Net is 362 kilobytes, which is small enough to be deployed on edge devices.
Results of both objective and subjective evaluations have shown that our model can achieve competitive performance with the current state-of-the-art models.
arXiv Detail & Related papers (2021-02-05T14:46:41Z)
- Ultra-light deep MIR by trimming lottery tickets [1.2599533416395767]
We propose a model pruning method based on the lottery ticket hypothesis.
We show that our proposal can remove up to 90% of the model parameters without loss of accuracy.
We confirm the surprising result that, at smaller compression ratios, lighter models consistently outperform their heavier counterparts.
arXiv Detail & Related papers (2020-07-31T17:30:28Z)
- When Ensembling Smaller Models is More Efficient than Single Large Models [52.38997176317532]
We show that ensembles can outperform single models, achieving higher accuracy while requiring fewer total FLOPs to compute.
This presents an interesting observation that output diversity in ensembling can often be more efficient than training larger models.
arXiv Detail & Related papers (2020-05-01T18:56:18Z)