FutureFill: Fast Generation from Convolutional Sequence Models
- URL: http://arxiv.org/abs/2410.03766v2
- Date: Fri, 25 Oct 2024 19:45:33 GMT
- Title: FutureFill: Fast Generation from Convolutional Sequence Models
- Authors: Naman Agarwal, Xinyi Chen, Evan Dogariu, Vlad Feinberg, Daniel Suo, Peter Bartlett, Elad Hazan
- Abstract summary: FutureFill is a method for fast generation that applies to any sequence prediction algorithm based on convolutional operators.
Our approach reduces the generation time requirement from quadratic to quasilinear relative to the context length.
We validate our theoretical findings with experimental evidence demonstrating correctness and efficiency gains in a synthetic generation task.
- Score: 22.410028211490424
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the challenge of efficient auto-regressive generation in sequence prediction models by introducing FutureFill - a method for fast generation that applies to any sequence prediction algorithm based on convolutional operators. Our approach reduces the generation time requirement from quadratic to quasilinear relative to the context length. Additionally, FutureFill requires a prefill cache sized only by the number of tokens generated, which is smaller than the cache requirements for standard convolutional and attention-based models. We validate our theoretical findings with experimental evidence demonstrating correctness and efficiency gains in a synthetic generation task.
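To make the mechanism concrete, below is a minimal sketch (not the paper's implementation) of chunked generation for a convolutional model with output y_t = sum_i k[i] * u[t-i]. One FFT convolution per chunk precomputes the contribution of all already-known tokens to the entire upcoming chunk (the "future fill"), so the per-step work inside a chunk only covers tokens generated within it. The `next_token` readout, chunk size, and filter are illustrative stand-ins.

```python
import numpy as np
from scipy.signal import fftconvolve

def next_token(y):
    # Hypothetical readout standing in for sampling from model logits.
    return np.tanh(y)

def generate_naive(k, prompt, num_new):
    """O(L^2) baseline: re-sum the entire history at every step."""
    u = list(prompt)
    for _ in range(num_new):
        t = len(u) - 1
        y = sum(k[i] * u[t - i] for i in range(t + 1))
        u.append(next_token(y))
    return u

def generate_futurefill(k, prompt, num_new, chunk=64):
    """Chunked FutureFill sketch; assumes len(k) covers the full context."""
    u = list(prompt)
    done = 0
    while done < num_new:
        c = min(chunk, num_new - done)
        base = len(u)
        # "Future fill": one FFT convolution gives the contribution of all
        # known tokens u[0:base] to the next c outputs at once.
        cache = fftconvolve(k, u)[base - 1 : base - 1 + c]
        for j in range(c):
            # Only tokens generated inside this chunk need an explicit sum.
            inside = sum(k[j - 1 - i] * u[base + i] for i in range(j))
            u.append(next_token(cache[j] + inside))
        done += c
    return u
```

With a chunk size on the order of the square root of the context length, this chunked variant already beats the naive quadratic loop; the paper's recursive FutureFill schedule is what yields the full quasilinear bound.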
Related papers
- Sundial: A Family of Highly Capable Time Series Foundation Models [64.6322079384575]
We introduce Sundial, a family of native, flexible, and scalable time series foundation models.
Our model is pre-trained without specifying any prior distribution and can generate multiple probable predictions.
By mitigating mode collapse through TimeFlow Loss, we pre-train a family of Sundial models on TimeBench, which exhibit unprecedented model capacity and generalization performance.
arXiv Detail & Related papers (2025-02-02T14:52:50Z)
- Efficient Generative Modeling with Residual Vector Quantization-Based Tokens [5.949779668853557]
ResGen is an efficient RVQ-based discrete diffusion model that generates high-fidelity samples without compromising sampling speed.
We validate the efficacy and generalizability of the proposed method on two challenging tasks: conditional image generation on ImageNet 256x256 and zero-shot text-to-speech synthesis.
As we scale the depth of RVQ, our generative models exhibit enhanced generation fidelity or faster sampling speeds compared to similarly sized baseline models.
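For readers unfamiliar with residual vector quantization, the sketch below shows plain RVQ encoding and decoding with random codebooks. It is purely illustrative (ResGen's codebooks are learned, and its diffusion model operates on these token stacks); each stage quantizes the residual left by the previous one, which is why deeper stacks improve fidelity.

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Quantize x with a stack of codebooks; each stage encodes the
    residual left over from the previous stage."""
    residual = x.copy()
    tokens = []
    for cb in codebooks:                          # cb: (codebook_size, dim)
        idx = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))
        tokens.append(idx)
        residual = residual - cb[idx]
    return tokens

def rvq_decode(tokens, codebooks):
    """Reconstruction is the sum of the selected codewords across stages."""
    return sum(cb[i] for cb, i in zip(codebooks, tokens))

rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(256, 8)) for _ in range(4)]   # depth-4 RVQ
x = rng.normal(size=8)
x_hat = rvq_decode(rvq_encode(x, codebooks), codebooks)
# Reconstruction error shrinks as RVQ depth grows.
```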
arXiv Detail & Related papers (2024-12-13T15:31:17Z)
- Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization [74.3339999119713]
We develop a wavelet-based tokenizer that allows models to learn complex representations directly in the space of time-localized frequencies.
Our method first scales and decomposes the input time series, then thresholds and quantizes the wavelet coefficients, and finally pre-trains an autoregressive model to forecast coefficients for the forecast horizon.
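A rough sketch of those four steps using PyWavelets appears below; the wavelet family, decomposition level, threshold, and bin count are illustrative guesses, not the paper's settings.

```python
import numpy as np
import pywt

def wavelet_tokenize(series, wavelet="db4", level=3, n_bins=256, thresh=0.01):
    # 1. Scale the series to unit peak amplitude.
    scale = max(np.max(np.abs(series)), 1e-12)
    x = series / scale
    # 2. Multi-level discrete wavelet decomposition (time-localized frequencies).
    coeffs = pywt.wavedec(x, wavelet, level=level)
    flat = np.concatenate(coeffs)
    # 3. Threshold small coefficients to zero, sparsifying the sequence.
    flat[np.abs(flat) < thresh] = 0.0
    # 4. Uniformly quantize into integer tokens for the autoregressive model.
    lo, hi = flat.min(), flat.max()
    tokens = np.round((flat - lo) / (hi - lo + 1e-12) * (n_bins - 1)).astype(int)
    return tokens, (scale, lo, hi)   # side info needed to invert the mapping
```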
arXiv Detail & Related papers (2024-12-06T18:22:59Z)
- Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a generative Transformer for unified time series forecasting.
Timer-XL achieves state-of-the-art performance across challenging forecasting benchmarks through a unified approach.
arXiv Detail & Related papers (2024-10-07T07:27:39Z)
- Loop Neural Networks for Parameter Sharing [1.1049608786515839]
We introduce a novel Loop Neural Network, which achieves better performance by using additional computation at inference without increasing the model size.
Our approach revisits the input multiple times, refining the prediction by iteratively looping over a subset of the model with residual connections.
We demonstrate the effectiveness of this method through experiments comparing versions of GPT-2 with our loop models, showing improved performance in language modeling tasks while maintaining similar parameter counts.
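A minimal PyTorch sketch of the weight-sharing loop idea follows; the block layout and loop count are assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class LoopBlock(nn.Module):
    """Apply the same residual sub-network n_loops times: extra
    computation at inference, no extra parameters."""
    def __init__(self, dim, n_loops=4):
        super().__init__()
        self.n_loops = n_loops
        self.mlp = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, x):
        h = x
        for _ in range(self.n_loops):
            h = h + self.mlp(h)   # residual update, weights shared per pass
        return h

block = LoopBlock(dim=64, n_loops=4)
y = block(torch.randn(2, 10, 64))   # same parameter count as a single pass
```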
arXiv Detail & Related papers (2024-09-21T17:07:42Z)
- TX-Gen: Multi-Objective Optimization for Sparse Counterfactual Explanations for Time-Series Classification [0.42105583610914427]
We introduce TX-Gen, a novel algorithm for generating counterfactual explanations based on the Non-dominated Sorting Genetic Algorithm II (NSGA-II).
By incorporating a flexible reference-guided mechanism, our method improves the plausibility and interpretability of the counterfactuals without relying on predefined assumptions.
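The selection step of NSGA-II rests on non-dominated sorting. The toy sketch below partitions candidate counterfactuals, scored on two hypothetical objectives (number of changed timesteps, distance to a reference series), into Pareto fronts; the full algorithm adds crowding distance, crossover, and mutation.

```python
def dominates(a, b):
    """a dominates b if it is no worse on every (minimized) objective
    and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_fronts(points):
    """Partition candidates into Pareto fronts; front 0 holds the
    best trade-offs between the objectives."""
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i]) for j in remaining)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

# (changed_timesteps, distance_to_reference) for four candidates:
print(non_dominated_fronts([(3, 0.9), (1, 1.4), (2, 1.1), (3, 1.5)]))
# -> [[0, 1, 2], [3]]: the last candidate is dominated, e.g. by (2, 1.1)
```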
arXiv Detail & Related papers (2024-09-14T15:13:28Z)
- Non-autoregressive Sequence-to-Sequence Vision-Language Models [63.77614880533488]
We propose a parallel decoding sequence-to-sequence vision-language model that marginalizes over multiple inference paths in the decoder.
The model achieves performance on-par with its state-of-the-art autoregressive counterpart, but is faster at inference time.
arXiv Detail & Related papers (2024-03-04T17:34:59Z)
- Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs), have been applied to sequential recommendation, but each has known failure modes: GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations.
We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
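As an illustration of the step-wise diffuser, here is a compact DDPM-style training objective conditioned on a sequence encoding. For brevity a concatenation MLP stands in for the paper's cross-attentive denoising decoder, and the shapes and noise schedule are assumptions.

```python
import torch
import torch.nn as nn

T, dim = 100, 64
betas = torch.linspace(1e-4, 0.02, T)           # assumed linear schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

# Stand-in for the cross-attentive denoising decoder.
denoiser = nn.Sequential(nn.Linear(2 * dim + 1, 256), nn.SiLU(),
                         nn.Linear(256, dim))

def diffusion_loss(target_emb, seq_encoding):
    """Noise the target item embedding to a random step t, then train the
    conditional denoiser to predict that noise (standard DDPM objective)."""
    t = torch.randint(0, T, (target_emb.size(0),))
    noise = torch.randn_like(target_emb)
    a = alpha_bar[t].unsqueeze(-1)
    x_t = a.sqrt() * target_emb + (1 - a).sqrt() * noise
    inp = torch.cat([x_t, seq_encoding, t.float().unsqueeze(-1) / T], dim=-1)
    return ((denoiser(inp) - noise) ** 2).mean()

loss = diffusion_loss(torch.randn(8, dim), torch.randn(8, dim))
```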
arXiv Detail & Related papers (2023-04-22T15:32:59Z)
- An EM Approach to Non-autoregressive Conditional Sequence Generation [49.11858479436565]
Autoregressive (AR) models have been the dominating approach to conditional sequence generation.
Non-autoregressive (NAR) models have been recently proposed to reduce the latency by generating all output tokens in parallel.
This paper proposes a new approach that jointly optimizes both AR and NAR models in a unified Expectation-Maximization (EM) framework.
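One way to picture the unified framework is the schematic loop below. It is an interpretation, not the paper's algorithm: the AR model plays the posterior over output sequences (E-step), and the NAR model is refit to AR-decoded targets (M-step). `ar_model` and `nar_model` are hypothetical objects exposing fit and decode methods.

```python
def em_train(ar_model, nar_model, corpus, rounds=3):
    """Schematic EM loop; `corpus` is a list of (source, target) pairs."""
    for _ in range(rounds):
        # E-step: update the AR model, which defines a posterior
        # over plausible output sequences for each source.
        ar_model.fit(corpus)
        # M-step: distill AR-decoded targets into the NAR model,
        # removing multi-modality that hurts parallel generation.
        distilled = [(src, ar_model.decode(src)) for src, _ in corpus]
        nar_model.fit(distilled)
    return nar_model
```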
arXiv Detail & Related papers (2020-06-29T20:58:57Z)