Related papers: Continuous Autoregressive Models with Noise Augmentation Avoid Error Accumulation

Continuous Autoregressive Models with Noise Augmentation Avoid Error Accumulation

URL: http://arxiv.org/abs/2411.18447v1
Date: Wed, 27 Nov 2024 15:38:20 GMT
Title: Continuous Autoregressive Models with Noise Augmentation Avoid Error Accumulation
Authors: Marco Pasini, Javier Nistal, Stefan Lattner, George Fazekas,
Abstract summary: Continuous Autoregressive Models can suffer from a decline in generation quality over extended sequences due to error accumulation during inference.<n>We introduce a novel method to address this issue by injecting random noise into the input embeddings during training.<n>This work paves the way for generating continuous embeddings in a purely autoregressive setting, opening new possibilities for real-time and interactive generative applications.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Autoregressive models are typically applied to sequences of discrete tokens, but recent research indicates that generating sequences of continuous embeddings in an autoregressive manner is also feasible. However, such Continuous Autoregressive Models (CAMs) can suffer from a decline in generation quality over extended sequences due to error accumulation during inference. We introduce a novel method to address this issue by injecting random noise into the input embeddings during training. This procedure makes the model robust against varying error levels at inference. We further reduce error accumulation through an inference procedure that introduces low-level noise. Experiments on musical audio generation show that CAM substantially outperforms existing autoregressive and non-autoregressive approaches while preserving audio quality over extended sequences. This work paves the way for generating continuous embeddings in a purely autoregressive setting, opening new possibilities for real-time and interactive generative applications.

Related papers

Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance [54.88271057438763]
Noise Awareness Guidance (NAG) is a correction method that explicitly steers sampling trajectories to remain consistent with the pre-defined noise schedule.<n>NAG consistently mitigates noise shift and substantially improves the generation quality of mainstream diffusion models.
arXiv Detail & Related papers (2025-10-14T13:31:34Z)
Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation [110.28291466364784]
Speculative Jacobi-Denoising Decoding (SJD2) is a framework that incorporates the denoising process into Jacobi to enable parallel token generation in autoregressive models.<n>Our method introduces a next-clean-token prediction paradigm that enables the pre-trained autoregressive models to accept noise-perturbed token embeddings.
arXiv Detail & Related papers (2025-10-10T04:30:45Z)
A self-supervised learning approach for denoising autoregressive models with additive noise: finite and infinite variance cases [0.9217021281095907]
In applications, autoregressive signals are often corrupted by additive noise.<n>In this paper, we propose a novel self-supervised learning method to denoise the additive noise-corrupted autoregressive model.
arXiv Detail & Related papers (2025-08-18T14:46:56Z)
Hybrid Autoregressive-Diffusion Model for Real-Time Streaming Sign Language Production [0.0]
We introduce a hybrid approach combining autoregressive and diffusion models to generate Sign Language Production (SLP) models.<n>To capture fine-grained body movements, we design a Multi-Scale Pose Representation module that separately extracts detailed features from distinct arttors.<n>We also introduce a Confidence-Aware Causal Attention mechanism that utilizes joint-level confidence scores to dynamically guide the pose generation process.
arXiv Detail & Related papers (2025-07-12T01:34:50Z)
Fast Autoregressive Models for Continuous Latent Generation [49.079819389916764]
Autoregressive models have demonstrated remarkable success in sequential data generation, particularly in NLP. Recent work, the masked autoregressive model (MAR) bypasses quantization by modeling per-token distributions in continuous spaces using a diffusion head. We propose Fast AutoRegressive model (FAR), a novel framework that replaces MAR's diffusion head with a lightweight shortcut head.
arXiv Detail & Related papers (2025-04-24T13:57:08Z)
Unifying Autoregressive and Diffusion-Based Sequence Generation [2.3923884480793673]
We present extensions to diffusion-based sequence generation models, blurring the line with autoregressive language models. We introduce hyperschedules, which assign distinct noise schedules to individual token positions. Second, we propose two hybrid token-wise noising processes that interpolate between absorbing and uniform processes, enabling the model to fix past mistakes.
arXiv Detail & Related papers (2025-04-08T20:32:10Z)
Constrained Discrete Diffusion [61.81569616239755]
This paper introduces Constrained Discrete Diffusion (CDD), a novel integration of differentiable constraint optimization within the diffusion process.<n>CDD directly imposes constraints into the discrete diffusion sampling process, resulting in a training-free and effective approach.
arXiv Detail & Related papers (2025-03-12T19:48:12Z)
Non-autoregressive Sequence-to-Sequence Vision-Language Models [63.77614880533488]
We propose a parallel decoding sequence-to-sequence vision-language model that marginalizes over multiple inference paths in the decoder. The model achieves performance on-par with its state-of-the-art autoregressive counterpart, but is faster at inference time.
arXiv Detail & Related papers (2024-03-04T17:34:59Z)
One More Step: A Versatile Plug-and-Play Module for Rectifying Diffusion Schedule Flaws and Enhancing Low-Frequency Controls [77.42510898755037]
One More Step (OMS) is a compact network that incorporates an additional simple yet effective step during inference. OMS elevates image fidelity and harmonizes the dichotomy between training and inference, while preserving original model parameters. Once trained, various pre-trained diffusion models with the same latent domain can share the same OMS module.
arXiv Detail & Related papers (2023-11-27T12:02:42Z)
SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking [60.109453252858806]
A maximum-likelihood (MLE) objective does not match a downstream use-case of autoregressively generating high-quality sequences. We formulate sequence generation as an imitation learning (IL) problem. This allows us to minimize a variety of divergences between the distribution of sequences generated by an autoregressive model and sequences from a dataset. Our resulting method, SequenceMatch, can be implemented without adversarial training or architectural changes.
arXiv Detail & Related papers (2023-06-08T17:59:58Z)
Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs) GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations. We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
arXiv Detail & Related papers (2023-04-22T15:32:59Z)
DINOISER: Diffused Conditional Sequence Learning by Manipulating Noises [38.72460741779243]
We introduce DINOISER to facilitate diffusion models for sequence generation by manipulating noises. Experiments show that DINOISER enables consistent improvement over the baselines of previous diffusion-based sequence generative models.
arXiv Detail & Related papers (2023-02-20T15:14:46Z)
Symbolic Music Generation with Diffusion Models [4.817429789586127]
We present a technique for training diffusion models on sequential data by parameterizing the discrete domain in the continuous latent space of a pre-trained variational autoencoder. We show strong unconditional generation and post-hoc conditional infilling results compared to autoregressive language models operating over the same continuous embeddings.
arXiv Detail & Related papers (2021-03-30T05:48:05Z)
Conditional Hybrid GAN for Sequence Generation [56.67961004064029]
We propose a novel conditional hybrid GAN (C-Hybrid-GAN) to solve this issue. We exploit the Gumbel-Softmax technique to approximate the distribution of discrete-valued sequences. We demonstrate that the proposed C-Hybrid-GAN outperforms the existing methods in context-conditioned discrete-valued sequence generation.
arXiv Detail & Related papers (2020-09-18T03:52:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.