TESS: Text-to-Text Self-Conditioned Simplex Diffusion
- URL: http://arxiv.org/abs/2305.08379v2
- Date: Wed, 21 Feb 2024 00:06:20 GMT
- Title: TESS: Text-to-Text Self-Conditioned Simplex Diffusion
- Authors: Rabeeh Karimi Mahabadi, Hamish Ivison, Jaesung Tae, James Henderson,
Iz Beltagy, Matthew E. Peters, Arman Cohan
- Abstract summary: Text-to-text Self-conditioned Simplex Diffusion employs a new form of self-conditioning, and applies the diffusion process on the logit simplex space rather than the learned embedding space.
We demonstrate that TESS outperforms state-of-the-art non-autoregressive models, requires fewer diffusion steps with minimal drop in performance, and is competitive with pretrained autoregressive sequence-to-sequence models.
- Score: 56.881170312435444
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models have emerged as a powerful paradigm for generation,
obtaining strong performance in various continuous domains. However, applying
continuous diffusion models to natural language remains challenging due to its
discrete nature and the need for a large number of diffusion steps to generate
text, making diffusion-based generation expensive. In this work, we propose
Text-to-text Self-conditioned Simplex Diffusion (TESS), a text diffusion model
that is fully non-autoregressive, employs a new form of self-conditioning, and
applies the diffusion process on the logit simplex space rather than the
learned embedding space. Through extensive experiments on natural language
understanding and generation tasks including summarization, text
simplification, paraphrase generation, and question generation, we demonstrate
that TESS outperforms state-of-the-art non-autoregressive models, requires
fewer diffusion steps with minimal drop in performance, and is competitive with
pretrained autoregressive sequence-to-sequence models. We publicly release our
codebase at https://github.com/allenai/tess-diffusion.
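To make the core ideas concrete, the sketch below shows diffusion on a vocabulary-sized logit simplex with self-conditioning. The scale K, the vocabulary size, the helper names, and the three-argument model interface are illustrative assumptions, not the actual TESS implementation (see the released codebase for that).

```python
# Illustrative sketch only; constants, names, and the model interface are
# assumptions, not the actual TESS implementation.
import torch
import torch.nn.functional as F

K = 5.0             # assumed scale of the almost-one-hot logit representation
VOCAB_SIZE = 32000  # assumed vocabulary size

def tokens_to_simplex_logits(token_ids: torch.Tensor) -> torch.Tensor:
    """Map discrete tokens to +K/-K logit vectors, one common way to place
    text on a vocabulary-sized simplex before adding continuous noise."""
    one_hot = F.one_hot(token_ids, VOCAB_SIZE).float()
    return K * (2.0 * one_hot - 1.0)

def add_noise(logits_0: torch.Tensor, t: torch.Tensor,
              alphas_cumprod: torch.Tensor) -> torch.Tensor:
    """DDPM-style forward process applied directly to the logit vectors."""
    a_bar = alphas_cumprod[t].view(-1, 1, 1)
    noise = torch.randn_like(logits_0)
    return a_bar.sqrt() * logits_0 + (1.0 - a_bar).sqrt() * noise

def denoise_step(model, noisy_logits, t, prev_probs):
    """One reverse step with self-conditioning: the model also receives the
    softmax of its previous estimate (a zero tensor on the first step)."""
    pred_logits = model(noisy_logits, t, prev_probs)  # assumed interface
    return pred_logits, F.softmax(pred_logits, dim=-1)
```

Because every intermediate state in this sketch lives in vocabulary space, an approximate decoding is available at any step by taking an argmax over the predicted logits.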
Related papers
- Text Diffusion with Reinforced Conditioning [92.17397504834825]
This paper thoroughly analyzes text diffusion models and uncovers two significant limitations: degradation of self-conditioning during training and misalignment between training and sampling.
Motivated by our findings, we propose a novel Text Diffusion model called TREC, which mitigates the degradation with Reinforced Conditioning and the misalignment by Time-Aware Variance Scaling.
arXiv Detail & Related papers (2024-02-19T09:24:02Z)
- InfoDiffusion: Information Entropy Aware Diffusion Process for Non-Autoregressive Text Generation [33.52794666968048]
We propose InfoDiffusion, a non-autoregressive text diffusion model.
Our approach introduces a "keyinfo-first" generation strategy and incorporates a noise schedule based on the amount of text information.
Experimental results show that InfoDiffusion outperforms the baseline model in terms of generation quality and diversity.
arXiv Detail & Related papers (2023-10-18T14:01:39Z) - PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model [37.2192243883707]
We propose PLANNER, a model that combines latent semantic diffusion with autoregressive generation to generate fluent text.
Results on semantic generation, text completion and summarization show its effectiveness in generating high-quality long-form text.
arXiv Detail & Related papers (2023-06-05T01:36:39Z) - A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features of language.
Specifically, we design a linguistically informed forward process that corrupts the text through strategic soft-masking to noise the textual data more effectively.
We demonstrate that our Masked-Diffuse LM achieves better generation quality than state-of-the-art diffusion models with greater efficiency.
arXiv Detail & Related papers (2023-04-10T17:58:42Z) - Diffusion Models for Non-autoregressive Text Generation: A Survey [94.4634088113513]
Non-autoregressive (NAR) text generation has attracted much attention in the field of natural language processing.
Recently, diffusion models have been introduced into NAR text generation, showing an improved text generation quality.
arXiv Detail & Related papers (2023-03-12T05:11:09Z) - SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers [50.90457644954857]
In this work, we apply diffusion models to sequence-to-sequence text generation.
We propose SeqDiffuSeq, a text diffusion model for sequence-to-sequence generation.
Experimental results show strong performance on sequence-to-sequence generation in terms of text quality and inference time.
arXiv Detail & Related papers (2022-12-20T15:16:24Z) - Self-conditioned Embedding Diffusion for Text Generation [28.342735885752493]
Self-conditioned Embedding Diffusion is a continuous diffusion mechanism that operates on token embeddings.
We show that our text diffusion models generate samples comparable with those produced by standard autoregressive language models.
arXiv Detail & Related papers (2022-11-08T13:30:27Z) - eDiffi: Text-to-Image Diffusion Models with an Ensemble of Expert
Denoisers [87.52504764677226]
Large-scale diffusion-based generative models have led to breakthroughs in text-conditioned high-resolution image synthesis.
We train an ensemble of text-to-image diffusion models specialized for different stages of synthesis.
Our ensemble of diffusion models, called eDiffi, results in improved text alignment while maintaining the same inference cost.
arXiv Detail & Related papers (2022-11-02T17:43:04Z)
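Several entries above (TREC, Self-conditioned Embedding Diffusion), as well as TESS itself, rely on self-conditioning during training. Below is a minimal sketch of the commonly used recipe, with an assumed model interface and noising function for illustration; it is not any specific paper's code.

```python
# Generic self-conditioning training step; the model interface and noise_fn
# are assumptions for illustration, not any specific paper's implementation.
import torch
import torch.nn.functional as F

def training_step(model, x0, t, noise_fn, p_self_cond: float = 0.5):
    """With probability p_self_cond, run the model once without gradients,
    keep its detached estimate of x0, and feed it back as an extra input;
    otherwise condition on zeros. Only the second pass is trained."""
    x_t = noise_fn(x0, t)                    # forward-noised representation
    prev_estimate = torch.zeros_like(x0)     # "no previous estimate" placeholder
    if torch.rand(()) < p_self_cond:
        with torch.no_grad():
            prev_estimate = model(x_t, t, prev_estimate)
    pred_x0 = model(x_t, t, prev_estimate)
    return F.mse_loss(pred_x0, x0)
```

The TREC entry above refers to degradation of exactly this feedback signal during training.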