DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models
- URL: http://arxiv.org/abs/2210.08933v1
- Date: Mon, 17 Oct 2022 10:49:08 GMT
- Title: DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models
- Authors: Shansan Gong and Mukai Li and Jiangtao Feng and Zhiyong Wu and Lingpeng Kong
- Abstract summary: DiffuSeq is a diffusion model designed for sequence-to-sequence (Seq2Seq) text generation tasks.
We show that DiffuSeq achieves comparable or even better performance than six established baselines.
A theoretical analysis reveals the connection between DiffuSeq and autoregressive/non-autoregressive models.
- Score: 15.913828295673705
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recently, diffusion models have emerged as a new paradigm for generative
models. Despite the success in domains using continuous signals such as vision
and audio, adapting diffusion models to natural language is difficult due to
the discrete nature of text. We tackle this challenge by proposing DiffuSeq: a
diffusion model designed for sequence-to-sequence (Seq2Seq) text generation
tasks. Upon extensive evaluation over a wide range of Seq2Seq tasks, we find
that DiffuSeq achieves comparable or even better performance than six
established baselines, including a state-of-the-art model based on pre-trained
language models. Apart from quality, an intriguing property of DiffuSeq is its
high diversity during generation, which is desired in many Seq2Seq tasks. We
further include a theoretical analysis revealing the connection between
DiffuSeq and autoregressive/non-autoregressive models. Bringing together
theoretical analysis and empirical evidence, we demonstrate the great potential
of diffusion models in complex conditional language generation tasks.
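As background for the mechanism the abstract summarizes, the sketch below illustrates embedding-space Gaussian diffusion with partial noising, the conditioning scheme DiffuSeq builds on: source and target tokens are embedded and concatenated, and forward noise is applied only to the target positions. This is a minimal illustration under assumed names and shapes, not the authors' implementation.

```python
# Minimal sketch of embedding-space Gaussian diffusion with partial noising:
# source and target tokens are embedded and concatenated, but forward noise
# is applied only to the target positions, so the source acts as a clean
# condition throughout. Names, shapes, and the schedule are illustrative.
import torch

def forward_noise(z0, t, alphas_cumprod, src_mask):
    """q(z_t | z_0) with partial noising.

    z0:       (batch, seq_len, dim) embeddings of [source; target]
    t:        (batch,) sampled diffusion timesteps
    src_mask: (batch, seq_len) 1 where the position belongs to the source
    """
    a_bar = alphas_cumprod[t].view(-1, 1, 1)             # \bar{alpha}_t
    noise = torch.randn_like(z0)
    zt = a_bar.sqrt() * z0 + (1.0 - a_bar).sqrt() * noise
    # Keep source positions clean at every step; noise only the target.
    return torch.where(src_mask.unsqueeze(-1).bool(), z0, zt)

# A denoiser f(z_t, t) is trained to recover z_0 at target positions; at
# inference the target half starts from pure noise and is denoised step by
# step while the source half stays fixed.
```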
Related papers
- Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step.
Our framework offers a 1.3× sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z)
- Meta-DiffuB: A Contextualized Sequence-to-Sequence Text Diffusion Model with Meta-Exploration [53.63593099509471]
We propose a scheduler-exploiter S2S-Diffusion paradigm designed to overcome the limitations of existing S2S-Diffusion models.
We employ Meta-Exploration to train an additional scheduler model dedicated to scheduling contextualized noise for each sentence.
Our exploiter model, an S2S-Diffusion model, leverages the noise scheduled by our scheduler model for updating and generation.
arXiv Detail & Related papers (2024-10-17T04:06:02Z)
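The scheduler-exploiter idea above is stated only at a high level; the following is a purely speculative sketch of what scheduling contextualized noise per sentence could look like, with all parameterization choices assumed rather than taken from the paper.

```python
# Speculative sketch of the scheduler half of a scheduler-exploiter split: a
# small network maps a pooled sentence encoding to a per-sentence scale on a
# shared base noise schedule. The parameterization is an assumption, not
# Meta-DiffuB's actual design.
import torch
import torch.nn as nn

class NoiseScheduler(nn.Module):
    def __init__(self, enc_dim):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(enc_dim, enc_dim), nn.SiLU(), nn.Linear(enc_dim, 1))

    def forward(self, sent_enc, base_betas):
        # sent_enc: (batch, enc_dim) pooled source encoding; base_betas: (T,)
        scale = 0.5 + torch.sigmoid(self.head(sent_enc))   # in (0.5, 1.5)
        return scale * base_betas.unsqueeze(0)             # (batch, T) schedule

# The exploiter (an S2S diffusion model) would then train and generate under
# these contextualized per-sentence schedules.
```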
- Discrete Diffusion Language Model for Long Text Summarization [19.267738861590487]
We introduce a novel semantic-aware noising process that enables Transformer backbones to handle long sequences effectively.
Our approaches achieve state-of-the-art performance on three benchmark summarization datasets: Gigaword, CNN/DailyMail, and arXiv.
arXiv Detail & Related papers (2024-06-25T09:55:22Z)
- Generative Pre-training for Speech with Flow Matching [81.59952572752248]
We pre-trained a generative model, named SpeechFlow, on 60k hours of untranscribed speech with Flow Matching and masked conditions.
Experiments show that the pre-trained generative model can be fine-tuned with task-specific data to match or surpass existing expert models on speech enhancement, separation, and synthesis.
arXiv Detail & Related papers (2023-10-25T03:40:50Z)
- Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding [57.42429912884543]
We propose Diff-LM-Speech, Tetra-Diff-Speech and Tri-Diff-Speech to address the problems of high dimensionality and waveform distortion.
We also introduce a prompt encoder structure based on a variational autoencoder and a prosody bottleneck to improve prompt representation ability.
Experimental results show that our proposed methods outperform baseline methods.
arXiv Detail & Related papers (2023-07-28T11:20:23Z)
- TESS: Text-to-Text Self-Conditioned Simplex Diffusion [56.881170312435444]
Text-to-Text Self-Conditioned Simplex Diffusion (TESS) employs a new form of self-conditioning and applies the diffusion process in the logit simplex space rather than the learned embedding space.
We demonstrate that TESS outperforms state-of-the-art non-autoregressive models, requires fewer diffusion steps with minimal drop in performance, and is competitive with pretrained autoregressive sequence-to-sequence models.
arXiv Detail & Related papers (2023-05-15T06:33:45Z)
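To make "diffusion on the logit simplex" concrete, here is a hedged sketch in the spirit of simplex-based diffusion; the near-one-hot encoding and the linear noise ramp are assumptions, not TESS's exact formulation.

```python
# Hedged sketch of a logit-simplex token representation: each token becomes a
# near-one-hot logit vector (+k at its id, -k elsewhere), noise perturbs the
# logits, and a softmax keeps the model input on the probability simplex.
# The +/-k encoding and linear noise ramp are assumptions, not TESS's exact
# schedule, and self-conditioning is omitted.
import torch
import torch.nn.functional as F

def to_simplex(token_ids, vocab_size, t_frac, k=5.0):
    # token_ids: (batch, seq_len) long tensor; t_frac in [0, 1] sets noise level
    logits = torch.full((*token_ids.shape, vocab_size), -k)
    logits.scatter_(-1, token_ids.unsqueeze(-1), k)       # +k at the true token
    noisy = (1.0 - t_frac) * logits + t_frac * k * torch.randn_like(logits)
    return F.softmax(noisy, dim=-1)                       # point on the simplex
```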
- SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers [50.90457644954857]
In this work, we apply diffusion models to sequence-to-sequence text generation.
We propose SeqDiffuSeq, a text diffusion model for sequence-to-sequence generation.
Experiments demonstrate strong performance on sequence-to-sequence generation in terms of both text quality and inference time.
arXiv Detail & Related papers (2022-12-20T15:16:24Z)
- Symbolic Music Generation with Diffusion Models [4.817429789586127]
We present a technique for training diffusion models on sequential data by parameterizing the discrete domain in the continuous latent space of a pre-trained variational autoencoder.
We show strong unconditional generation and post-hoc conditional infilling results compared to autoregressive language models operating over the same continuous embeddings.
arXiv Detail & Related papers (2021-03-30T05:48:05Z)
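The recipe this entry describes, training diffusion in the continuous latent space of a pre-trained VAE, can be sketched generically as follows; the `vae` and `denoiser` interfaces are hypothetical placeholders, not the paper's models.

```python
# Generic sketch of diffusion over the continuous latents of a pre-trained
# VAE: encode the discrete sequence, add Gaussian noise to the latent, and
# train a denoiser to predict the noise. `vae` and `denoiser` are
# hypothetical placeholders.
import torch
import torch.nn.functional as F

def latent_diffusion_loss(vae, denoiser, tokens, alphas_cumprod):
    with torch.no_grad():
        z0 = vae.encode(tokens)                    # (batch, latent_dim)
    t = torch.randint(0, len(alphas_cumprod), (z0.shape[0],))
    a_bar = alphas_cumprod[t].view(-1, 1)
    noise = torch.randn_like(z0)
    zt = a_bar.sqrt() * z0 + (1.0 - a_bar).sqrt() * noise
    return F.mse_loss(denoiser(zt, t), noise)      # standard epsilon-prediction
```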
- Abstractive Summarization with Combination of Pre-trained Sequence-to-Sequence and Saliency Models [11.420640383826656]
We investigate the effectiveness of combining saliency models that identify the important parts of the source text with pre-trained seq-to-seq models.
Most of the combination models outperformed a simple fine-tuned seq-to-seq model on both the CNN/DM and XSum datasets.
arXiv Detail & Related papers (2020-03-29T14:00:25Z)