Self-conditioned Embedding Diffusion for Text Generation
- URL: http://arxiv.org/abs/2211.04236v1
- Date: Tue, 8 Nov 2022 13:30:27 GMT
- Title: Self-conditioned Embedding Diffusion for Text Generation
- Authors: Robin Strudel, Corentin Tallec, Florent Altch\'e, Yilun Du, Yaroslav
Ganin, Arthur Mensch, Will Grathwohl, Nikolay Savinov, Sander Dieleman,
Laurent Sifre, R\'emi Leblond
- Abstract summary: Self-conditioned Embedding Diffusion is a continuous diffusion mechanism that operates on token embeddings.
We show that our text diffusion models generate samples comparable with those produced by standard autoregressive language models.
- Score: 28.342735885752493
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Can continuous diffusion models bring the same performance breakthrough on
natural language they did for image generation? To circumvent the discrete
nature of text data, we can simply project tokens in a continuous space of
embeddings, as is standard in language modeling. We propose Self-conditioned
Embedding Diffusion, a continuous diffusion mechanism that operates on token
embeddings and allows to learn flexible and scalable diffusion models for both
conditional and unconditional text generation. Through qualitative and
quantitative evaluation, we show that our text diffusion models generate
samples comparable with those produced by standard autoregressive language
models - while being in theory more efficient on accelerator hardware at
inference time. Our work paves the way for scaling up diffusion models for
text, similarly to autoregressive models, and for improving performance with
recent refinements to continuous diffusion.
Related papers
- Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step.
Our framework offers a 1.3$times$ sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z) - LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation? [10.72249123249003]
We revisit diffusion models, highlighting their capacity for holistic context modeling and parallel decoding.
We introduce a novel architecture, LaDiC, which utilizes a split BERT to create a dedicated latent space for captions.
LaDiC achieves state-of-the-art performance for diffusion-based methods on the MS dataset with 38.2 BLEU@4 and 126.2 CIDEr.
arXiv Detail & Related papers (2024-04-16T17:47:16Z) - Text Diffusion with Reinforced Conditioning [92.17397504834825]
This paper thoroughly analyzes text diffusion models and uncovers two significant limitations: degradation of self-conditioning during training and misalignment between training and sampling.
Motivated by our findings, we propose a novel Text Diffusion model called TREC, which mitigates the degradation with Reinforced Conditioning and the misalignment by Time-Aware Variance Scaling.
arXiv Detail & Related papers (2024-02-19T09:24:02Z) - Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional
Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z) - PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model [37.2192243883707]
We propose PLANNER, a model that combines latent semantic diffusion with autoregressive generation to generate fluent text.
Results on semantic generation, text completion and summarization show its effectiveness in generating high-quality long-form text.
arXiv Detail & Related papers (2023-06-05T01:36:39Z) - TESS: Text-to-Text Self-Conditioned Simplex Diffusion [56.881170312435444]
Text-to-text Self-conditioned Simplex Diffusion employs a new form of self-conditioning, and applies the diffusion process on the logit simplex space rather than the learned embedding space.
We demonstrate that TESS outperforms state-of-the-art non-autoregressive models, requires fewer diffusion steps with minimal drop in performance, and is competitive with pretrained autoregressive sequence-to-sequence models.
arXiv Detail & Related papers (2023-05-15T06:33:45Z) - A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features in languages.
Specifically, we design a linguistic-informed forward process which adds corruptions to the text through strategically soft-masking to better noise the textual data.
We demonstrate that our Masked-Diffuse LM can achieve better generation quality than the state-of-the-art diffusion models with better efficiency.
arXiv Detail & Related papers (2023-04-10T17:58:42Z) - A Reparameterized Discrete Diffusion Model for Text Generation [39.0145272152805]
This work studies discrete diffusion probabilistic models with applications to natural language generation.
We derive an alternative yet equivalent formulation of the sampling from discrete diffusion processes.
We conduct extensive experiments to evaluate the text generation capability of our model, demonstrating significant improvements over existing diffusion models.
arXiv Detail & Related papers (2023-02-11T16:26:57Z) - Latent Diffusion for Language Generation [26.620353485679892]
Recent attempts to adapt diffusion to language have presented diffusion as an alternative to existing language models.
We demonstrate that encoder-decoder language models can be utilized to efficiently learn high-quality language autoencoders.
We validate the effectiveness of our approach for unconditional, class-conditional, and sequence-to-sequence language generation.
arXiv Detail & Related papers (2022-12-19T13:57:06Z) - DiffusionBERT: Improving Generative Masked Language Models with
Diffusion Models [81.84866217721361]
DiffusionBERT is a new generative masked language model based on discrete diffusion models.
We propose a new noise schedule for the forward diffusion process that controls the degree of noise added at each step.
Experiments on unconditional text generation demonstrate that DiffusionBERT achieves significant improvement over existing diffusion models for text.
arXiv Detail & Related papers (2022-11-28T03:25:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.