Iterative autoregression: a novel trick to improve your low-latency
speech enhancement model
- URL: http://arxiv.org/abs/2211.01751v4
- Date: Tue, 5 Dec 2023 11:36:32 GMT
- Title: Iterative autoregression: a novel trick to improve your low-latency
speech enhancement model
- Authors: Pavel Andreev, Nicholas Babaev, Azat Saginbaev, Ivan Shchekotov, Aibek
Alanov
- Abstract summary: Streaming models are an essential component of real-time speech enhancement tools.
We propose a straightforward yet effective alternative technique for training autoregressive low-latency speech enhancement models.
- Score: 2.2999148299770047
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Streaming models are an essential component of real-time speech enhancement
tools. The streaming regime constrains speech enhancement models to use only a
tiny context of future information. As a result, the low-latency streaming
setup is generally considered a challenging task and has a significant negative
impact on the model's quality. However, the sequential nature of streaming
generation offers a natural possibility for autoregression, that is, utilizing
previous predictions while making current ones. The conventional method for
training autoregressive models is teacher forcing, but its primary drawback
lies in the training-inference mismatch that can lead to a substantial
degradation in quality. In this study, we propose a straightforward yet
effective alternative technique for training autoregressive low-latency speech
enhancement models. We demonstrate that the proposed approach leads to stable
improvement across diverse architectures and training scenarios.
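The training-inference mismatch mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's proposed technique, which the abstract does not describe. It only contrasts the two conditioning regimes for a hypothetical one-tap autoregressive "enhancer": teacher forcing (conditioning on the previous clean sample) versus free-running inference (conditioning on the model's own previous output). The function names, the sine-wave signal, the noise level, and the hand-picked weights are all assumptions made for illustration.

```python
import math
import random


def enhance(noisy, weights, prev_source):
    """Run a toy one-tap autoregressive enhancer over a noisy signal.

    prev_source(t, outputs) returns the value used as the "previous output"
    conditioning at step t (only called for t > 0).
    """
    w_in, w_prev = weights
    outputs = []
    for t, x in enumerate(noisy):
        prev = prev_source(t, outputs) if t > 0 else 0.0
        outputs.append(w_in * x + w_prev * prev)
    return outputs


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


random.seed(0)
clean = [math.sin(0.1 * t) for t in range(200)]        # hypothetical clean signal
noisy = [c + random.gauss(0.0, 0.3) for c in clean]    # hypothetical noisy input

# Hand-picked weights (an assumption, not trained): they lean heavily on the
# previous sample, which works well only if that sample is actually clean.
weights = (0.2, 0.8)

# Teacher forcing: the conditioning is the ground-truth previous clean sample.
teacher_forced = enhance(noisy, weights, lambda t, out: clean[t - 1])

# Free-running inference: the conditioning is the model's own previous output.
free_running = enhance(noisy, weights, lambda t, out: out[t - 1])

print("teacher-forced MSE:", mse(teacher_forced, clean))
print("free-running MSE:  ", mse(free_running, clean))
```

With these settings the free-running error comes out larger than the teacher-forced error, because the same weights now receive their own imperfect outputs instead of clean samples. That gap is the kind of mismatch the abstract refers to.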
Related papers
- Autoregressive Video Generation without Vector Quantization [90.87907377618747]
We reformulate the video generation problem as a non-quantized autoregressive modeling of temporal frame-by-frame prediction.
With the proposed approach, we train a novel video autoregressive model without vector quantization, termed NOVA.
Our results demonstrate that NOVA surpasses prior autoregressive video models in data efficiency, inference speed, visual fidelity, and video fluency, even with a much smaller model capacity.
arXiv Detail & Related papers (2024-12-18T18:59:53Z)
- Boosting Alignment for Post-Unlearning Text-to-Image Generative Models [55.82190434534429]
Large-scale generative models have shown impressive image-generation capabilities, propelled by massive data.
This often inadvertently leads to the generation of harmful or inappropriate content and raises copyright concerns.
We propose a framework that seeks an optimal model update at each unlearning iteration, ensuring monotonic improvement on both objectives.
arXiv Detail & Related papers (2024-12-09T21:36:10Z)
- Self-Improvement in Language Models: The Sharpening Mechanism [70.9248553790022]
We offer a new perspective on the capabilities of self-improvement through a lens we refer to as sharpening.
Motivated by the observation that language models are often better at verifying response quality than they are at generating correct responses, we formalize self-improvement as using the model itself as a verifier during post-training.
We analyze two natural families of self-improvement algorithms based on SFT and RLHF.
arXiv Detail & Related papers (2024-12-02T20:24:17Z)
- Deep autoregressive density nets vs neural ensembles for model-based offline reinforcement learning [2.9158689853305693]
We consider a model-based reinforcement learning algorithm that infers the system dynamics from the available data and performs policy optimization on imaginary model rollouts.
This approach is vulnerable to exploiting model errors which can lead to catastrophic failures on the real system.
We show that better performance can be obtained with a single well-calibrated autoregressive model on the D4RL benchmark.
arXiv Detail & Related papers (2024-02-05T10:18:15Z)
- Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improves the sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models for plan generation in the offline reinforcement learning setting, reporting a speedup compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z)
- How to Fine-tune the Model: Unified Model Shift and Model Bias Policy Optimization [13.440645736306267]
This paper develops an algorithm for model-based reinforcement learning.
It unifies model shift and model bias and then formulates a fine-tuning process.
It achieves state-of-the-art performance on several challenging benchmark tasks.
arXiv Detail & Related papers (2023-09-22T07:27:32Z)
- Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models [24.107358120517336]
In this work, we adopt a novel perspective wherein a pre-trained language model is itself simultaneously a policy, reward function, and transition function.
An immediate consequence of this is that reward learning and language model fine-tuning can be performed jointly and directly, without requiring any further downstream policy optimization.
arXiv Detail & Related papers (2023-05-19T06:21:15Z)
- Learning Rich Nearest Neighbor Representations from Self-supervised Ensembles [60.97922557957857]
We provide a framework to perform self-supervised model ensembling via a novel method of learning representations directly through gradient descent at inference time.
This technique improves representation quality, as measured by k-nearest neighbors, both on the in-domain dataset and in the transfer setting.
arXiv Detail & Related papers (2021-10-19T22:24:57Z)
- Powerpropagation: A sparsity inducing weight reparameterisation [65.85142037667065]
We introduce Powerpropagation, a new weight-parameterisation for neural networks that leads to inherently sparse models.
Models trained in this manner exhibit similar performance, but have a distribution with markedly higher density at zero, allowing more parameters to be pruned safely.
Here, we combine Powerpropagation with a traditional weight-pruning technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing superior performance on the ImageNet benchmark.
arXiv Detail & Related papers (2021-10-01T10:03:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.