TESS: Text-to-Text Self-Conditioned Simplex Diffusion
- URL: http://arxiv.org/abs/2305.08379v2
- Date: Wed, 21 Feb 2024 00:06:20 GMT
- Title: TESS: Text-to-Text Self-Conditioned Simplex Diffusion
- Authors: Rabeeh Karimi Mahabadi, Hamish Ivison, Jaesung Tae, James Henderson,
Iz Beltagy, Matthew E. Peters, Arman Cohan
- Abstract summary: Text-to-text Self-conditioned Simplex Diffusion employs a new form of self-conditioning, and applies the diffusion process on the logit simplex space rather than the learned embedding space.
We demonstrate that TESS outperforms state-of-the-art non-autoregressive models, requires fewer diffusion steps with minimal drop in performance, and is competitive with pretrained autoregressive sequence-to-sequence models.
- Score: 56.881170312435444
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models have emerged as a powerful paradigm for generation,
obtaining strong performance in various continuous domains. However, applying
continuous diffusion models to natural language remains challenging due to its
discrete nature and the need for a large number of diffusion steps to generate
text, making diffusion-based generation expensive. In this work, we propose
Text-to-text Self-conditioned Simplex Diffusion (TESS), a text diffusion model
that is fully non-autoregressive, employs a new form of self-conditioning, and
applies the diffusion process on the logit simplex space rather than the
learned embedding space. Through extensive experiments on natural language
understanding and generation tasks including summarization, text
simplification, paraphrase generation, and question generation, we demonstrate
that TESS outperforms state-of-the-art non-autoregressive models, requires
fewer diffusion steps with minimal drop in performance, and is competitive with
pretrained autoregressive sequence-to-sequence models. We publicly release our
codebase at https://github.com/allenai/tess-diffusion.
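To make the core ideas concrete, the sketch below shows diffusion on a vocabulary-sized logit simplex with self-conditioning. The scale K, the vocabulary size, the helper names, and the three-argument model interface are illustrative assumptions, not the actual TESS implementation (see the released codebase for that).

```python
# Illustrative sketch only; constants, names, and the model interface are
# assumptions, not the actual TESS implementation.
import torch
import torch.nn.functional as F

K = 5.0             # assumed scale of the almost-one-hot logit representation
VOCAB_SIZE = 32000  # assumed vocabulary size

def tokens_to_simplex_logits(token_ids: torch.Tensor) -> torch.Tensor:
    """Map discrete tokens to +K/-K logit vectors, one common way to place
    text on a vocabulary-sized simplex before adding continuous noise."""
    one_hot = F.one_hot(token_ids, VOCAB_SIZE).float()
    return K * (2.0 * one_hot - 1.0)

def add_noise(logits_0: torch.Tensor, t: torch.Tensor,
              alphas_cumprod: torch.Tensor) -> torch.Tensor:
    """DDPM-style forward process applied directly to the logit vectors."""
    a_bar = alphas_cumprod[t].view(-1, 1, 1)
    noise = torch.randn_like(logits_0)
    return a_bar.sqrt() * logits_0 + (1.0 - a_bar).sqrt() * noise

def denoise_step(model, noisy_logits, t, prev_probs):
    """One reverse step with self-conditioning: the model also receives the
    softmax of its previous estimate (a zero tensor on the first step)."""
    pred_logits = model(noisy_logits, t, prev_probs)  # assumed interface
    return pred_logits, F.softmax(pred_logits, dim=-1)
```

Because every intermediate state in this sketch lives in vocabulary space, an approximate decoding is available at any step by taking an argmax over the predicted logits.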
Related papers
- Text Diffusion with Reinforced Conditioning [92.17397504834825]
This paper thoroughly analyzes text diffusion models and uncovers two significant limitations: degradation of self-conditioning during training and misalignment between training and sampling.
Motivated by our findings, we propose a novel Text Diffusion model called TREC, which mitigates the degradation with Reinforced Conditioning and the misalignment by Time-Aware Variance Scaling.
arXiv Detail & Related papers (2024-02-19T09:24:02Z)
- InfoDiffusion: Information Entropy Aware Diffusion Process for Non-Autoregressive Text Generation [33.52794666968048]
We propose InfoDiffusion, a non-autoregressive text diffusion model.
Our approach introduces a "keyinfo-first" generation strategy and incorporates a noise schedule based on the amount of text information.
Experimental results show that InfoDiffusion outperforms the baseline model in terms of generation quality and diversity.
arXiv Detail & Related papers (2023-10-18T14:01:39Z) - PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model [37.2192243883707]
We propose PLANNER, a model that combines latent semantic diffusion with autoregressive generation to generate fluent text.
Results on semantic generation, text completion and summarization show its effectiveness in generating high-quality long-form text.
arXiv Detail & Related papers (2023-06-05T01:36:39Z) - A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features of language.
Specifically, we design a linguistically informed forward process that corrupts the text through strategic soft-masking to noise the textual data more effectively.
We demonstrate that our Masked-Diffuse LM achieves better generation quality than state-of-the-art diffusion models with greater efficiency.
arXiv Detail & Related papers (2023-04-10T17:58:42Z) - Diffusion Models for Non-autoregressive Text Generation: A Survey [94.4634088113513]
Non-autoregressive (NAR) text generation has attracted much attention in the field of natural language processing.
Recently, diffusion models have been introduced into NAR text generation, showing an improved text generation quality.
arXiv Detail & Related papers (2023-03-12T05:11:09Z) - SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers [50.90457644954857]
In this work, we apply diffusion models to sequence-to-sequence text generation.
We propose SeqDiffuSeq, a text diffusion model for sequence-to-sequence generation.
Experimental results show strong performance on sequence-to-sequence generation in terms of text quality and inference time.
arXiv Detail & Related papers (2022-12-20T15:16:24Z) - Self-conditioned Embedding Diffusion for Text Generation [28.342735885752493]
Self-conditioned Embedding Diffusion is a continuous diffusion mechanism that operates on token embeddings.
We show that our text diffusion models generate samples comparable with those produced by standard autoregressive language models.
arXiv Detail & Related papers (2022-11-08T13:30:27Z) - eDiffi: Text-to-Image Diffusion Models with an Ensemble of Expert
Denoisers [87.52504764677226]
Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned high-resolution image synthesis.
We train an ensemble of text-to-image diffusion models specialized for different stages of synthesis.
Our ensemble of diffusion models, called eDiffi, results in improved text alignment while maintaining the same inference cost.
arXiv Detail & Related papers (2022-11-02T17:43:04Z)
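Several entries above (TREC, Self-conditioned Embedding Diffusion), as well as TESS itself, rely on self-conditioning during training. Below is a minimal sketch of the commonly used recipe, with an assumed model interface and noising function for illustration; it is not any specific paper's code.

```python
# Generic self-conditioning training step; the model interface and noise_fn
# are assumptions for illustration, not any specific paper's implementation.
import torch
import torch.nn.functional as F

def training_step(model, x0, t, noise_fn, p_self_cond: float = 0.5):
    """With probability p_self_cond, run the model once without gradients,
    keep its detached estimate of x0, and feed it back as an extra input;
    otherwise condition on zeros. Only the second pass is trained."""
    x_t = noise_fn(x0, t)                    # forward-noised representation
    prev_estimate = torch.zeros_like(x0)     # "no previous estimate" placeholder
    if torch.rand(()) < p_self_cond:
        with torch.no_grad():
            prev_estimate = model(x_t, t, prev_estimate)
    pred_x0 = model(x_t, t, prev_estimate)
    return F.mse_loss(pred_x0, x0)
```

The TREC entry above refers to degradation of exactly this feedback signal during training.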