Iterative autoregression: a novel trick to improve your low-latency
speech enhancement model
- URL: http://arxiv.org/abs/2211.01751v4
- Date: Tue, 5 Dec 2023 11:36:32 GMT
- Title: Iterative autoregression: a novel trick to improve your low-latency
speech enhancement model
- Authors: Pavel Andreev, Nicholas Babaev, Azat Saginbaev, Ivan Shchekotov, Aibek
Alanov
- Abstract summary: Streaming models are an essential component of real-time speech enhancement tools.
We propose a straightforward yet effective alternative technique for training autoregressive low-latency speech enhancement models.
- Score: 2.2999148299770047
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Streaming models are an essential component of real-time speech enhancement
tools. The streaming regime constrains speech enhancement models to use only a
tiny context of future information. As a result, the low-latency streaming
setup is generally considered a challenging task and has a significant negative
impact on the model's quality. However, the sequential nature of streaming
generation offers a natural possibility for autoregression, that is, utilizing
previous predictions while making current ones. The conventional method for
training autoregressive models is teacher forcing, but its primary drawback
lies in the training-inference mismatch that can lead to a substantial
degradation in quality. In this study, we propose a straightforward yet
effective alternative technique for training autoregressive low-latency speech
enhancement models. We demonstrate that the proposed approach leads to stable
improvement across diverse architectures and training scenarios.
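The abstract describes the recipe only at a high level: instead of relying purely on teacher forcing, the autoregressive context fed to the model during training is itself produced by the model over a few refinement iterations, so that training inputs resemble what the model will see when streaming. Below is a minimal, hedged PyTorch-style sketch of that general idea; the model interface, the number of iterations, the L1 loss, and the names (`training_step`, `ar_iterations`) are illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch: train an autoregressive streaming enhancer on context produced
# by its own (detached) predictions instead of only the ground-truth past.
# Everything here is illustrative, not the paper's reference code.
import torch
import torch.nn.functional as F


def training_step(model, noisy, clean, ar_iterations=3):
    """One loss computation with iteratively generated autoregressive context.

    model : maps (noisy signal, past enhanced signal) -> enhanced signal
    noisy : (batch, time) noisy input
    clean : (batch, time) clean target
    """
    # Iteration 0 is plain teacher forcing: the autoregressive input is the
    # one-step-shifted clean signal (pretend past predictions were perfect).
    ar_context = torch.roll(clean, shifts=1, dims=-1)
    ar_context[..., 0] = 0.0

    # Each further iteration replaces that context with the model's own
    # detached predictions, moving the training inputs toward the
    # distribution the model actually sees at inference time.
    for _ in range(ar_iterations):
        with torch.no_grad():
            pred = model(noisy, ar_context)
        ar_context = torch.roll(pred, shifts=1, dims=-1)
        ar_context[..., 0] = 0.0

    # Final forward pass with gradients, trained against the clean target.
    enhanced = model(noisy, ar_context)
    return F.l1_loss(enhanced, clean)
```

At inference the same model runs causally, consuming its own past outputs frame by frame, so the context it was trained on under this scheme is close to the context it encounters in deployment, which is the mismatch the abstract aims to remove.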
Related papers
- SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation [52.6922833948127]
In this work, we investigate the importance of parameters in pre-trained diffusion models and find that a subset of them is ineffective.
We propose a novel fine-tuning method that makes full use of these ineffective parameters.
Our method enhances the generative capabilities of pre-trained models in downstream applications.
arXiv Detail & Related papers (2024-09-10T16:44:47Z) - Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has conventionally been believed to be a challenging property to encode into neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z) - Denoising Autoregressive Representation Learning [13.185567468951628]
Our method, DARL, employs a decoder-only Transformer to predict image patches autoregressively.
We show that the learned representation can be improved by using tailored noise schedules and longer training in larger models.
arXiv Detail & Related papers (2024-03-08T10:19:00Z) - Deep autoregressive density nets vs neural ensembles for model-based
offline reinforcement learning [2.9158689853305693]
We consider a model-based reinforcement learning algorithm that infers the system dynamics from the available data and performs policy optimization on imaginary model rollouts.
This approach is vulnerable to exploiting model errors which can lead to catastrophic failures on the real system.
We show that better performance can be obtained with a single well-calibrated autoregressive model on the D4RL benchmark.
arXiv Detail & Related papers (2024-02-05T10:18:15Z) - Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improve sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models for plan generation in the offline reinforcement learning setting, with a substantial speedup in computation compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z) - How to Fine-tune the Model: Unified Model Shift and Model Bias Policy
Optimization [13.440645736306267]
This paper develops an algorithm for model-based reinforcement learning.
It unifies model shift and model bias and then formulates a fine-tuning process.
It achieves state-of-the-art performance on several challenging benchmark tasks.
arXiv Detail & Related papers (2023-09-22T07:27:32Z) - Shattering the Agent-Environment Interface for Fine-Tuning Inclusive
Language Models [24.107358120517336]
In this work, we adopt a novel perspective wherein a pre-trained language model is itself simultaneously a policy, reward function, and transition function.
An immediate consequence of this is that reward learning and language model fine-tuning can be performed jointly and directly, without requiring any further downstream policy optimization.
arXiv Detail & Related papers (2023-05-19T06:21:15Z) - Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box adversarial attack that aims to craft adversarial perturbations on the surrogate model and then apply such perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on a widely used dataset demonstrate the effectiveness of our attack method, with a 12.85% higher transfer-attack success rate than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-10T07:24:21Z) - Learning Rich Nearest Neighbor Representations from Self-supervised
Ensembles [60.97922557957857]
We provide a framework to perform self-supervised model ensembling via a novel method of learning representations directly through gradient descent at inference time.
This technique improves representation quality, as measured by k-nearest neighbors, both on the in-domain dataset and in the transfer setting.
arXiv Detail & Related papers (2021-10-19T22:24:57Z) - Powerpropagation: A sparsity inducing weight reparameterisation [65.85142037667065]
We introduce Powerpropagation, a new weight-parameterisation for neural networks that leads to inherently sparse models.
Models trained in this manner exhibit similar performance, but have a distribution with markedly higher density at zero, allowing more parameters to be pruned safely.
Here, we combine Powerpropagation with a traditional weight-pruning technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing superior performance on the ImageNet benchmark.
arXiv Detail & Related papers (2021-10-01T10:03:57Z)
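For the Powerpropagation entry above, the following is a hedged sketch of what a sign-preserving power reparameterisation of the weights could look like. The layer name, the choice of `alpha`, and the initialisation are assumptions made for illustration, not the paper's reference implementation.

```python
# Hedged sketch of a Powerpropagation-style layer: the effective weight is the
# stored parameter raised to a power with its sign preserved, so parameters
# near zero receive small gradients and tend to stay near zero, consistent
# with the entry's claim of markedly higher density at zero and safer pruning.
# Names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PowerpropLinear(nn.Module):
    def __init__(self, in_features, out_features, alpha=2.0):
        super().__init__()
        self.alpha = alpha
        self.weight = nn.Parameter(0.02 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # Effective weight: sign-preserving power of the stored parameter.
        w = self.weight * self.weight.abs().pow(self.alpha - 1.0)
        return F.linear(x, w, self.bias)
```

In this sketched form the gradient with respect to the stored parameter scales with its magnitude, so small parameters are updated only weakly; pruning would then act on the effective weights, which keeps the approach compatible with the traditional weight-pruning techniques mentioned in the entry.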
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.