DiffusER: Discrete Diffusion via Edit-based Reconstruction
- URL: http://arxiv.org/abs/2210.16886v1
- Date: Sun, 30 Oct 2022 16:55:23 GMT
- Title: DiffusER: Discrete Diffusion via Edit-based Reconstruction
- Authors: Machel Reid, Vincent J. Hellendoorn, Graham Neubig
- Abstract summary: DiffusER is an edit-based generative model for text based on denoising diffusion models.
It can rival autoregressive models on several tasks spanning machine translation, summarization, and style transfer.
It can also perform other varieties of generation that standard autoregressive models are not well-suited for.
- Score: 88.62707047517914
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In text generation, models that generate text from scratch one token at a
time are currently the dominant paradigm. Despite being performant, these
models lack the ability to revise existing text, which limits their usability
in many practical scenarios. We look to address this, with DiffusER (Diffusion
via Edit-based Reconstruction), a new edit-based generative model for text
based on denoising diffusion models -- a class of models that use a Markov
chain of denoising steps to incrementally generate data. DiffusER is not only a
strong generative model in general, rivalling autoregressive models on several
tasks spanning machine translation, summarization, and style transfer; it can
also perform other varieties of generation that standard autoregressive models
are not well-suited for. For instance, we demonstrate that DiffusER makes it
possible for a user to condition generation on a prototype, or an incomplete
sequence, and continue revising based on previous edit steps.
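To make the edit-based reconstruction idea more concrete, here is a minimal, hypothetical sketch of a forward corruption process that a model of this kind could be trained to reverse. The specific edit operations (keep, replace, delete, insert), the `corrupt_step` function, the toy vocabulary, and the noise schedule are illustrative assumptions based only on the abstract above, not the paper's actual implementation.

```python
import random

# Toy vocabulary for illustration only.
VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "mat", "rug"]


def corrupt_step(tokens, edit_rate=0.3):
    """Apply one forward 'noising' step by randomly editing tokens.

    A reverse (denoising) model would be trained to reconstruct the
    less-corrupted sequence from the more-corrupted one, step by step.
    """
    out = []
    for tok in tokens:
        r = random.random()
        if r < edit_rate / 3:            # delete the token
            continue
        elif r < 2 * edit_rate / 3:      # replace with a random token
            out.append(random.choice(VOCAB))
        elif r < edit_rate:              # insert a random token, then keep
            out.append(random.choice(VOCAB))
            out.append(tok)
        else:                            # keep the token unchanged
            out.append(tok)
    return out


def forward_process(tokens, num_steps=4):
    """Build the Markov chain x_0 -> x_1 -> ... -> x_T of corrupted texts."""
    chain = [tokens]
    for _ in range(num_steps):
        chain.append(corrupt_step(chain[-1]))
    return chain


if __name__ == "__main__":
    for t, x_t in enumerate(forward_process(["the", "cat", "sat", "on", "the", "mat"])):
        print(t, " ".join(x_t))
```

Under these assumptions, training would pair consecutive steps (x_t, x_{t-1}) so the model learns to propose edits that move a corrupted, prototype, or incomplete sequence back toward fluent text, which is what allows generation to start from existing text rather than from scratch.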
Related papers
- Diffusion Guided Language Modeling [28.819061884362792]
For many applications it is desirable to control attributes, such as sentiment, of the generated language.
For auto-regressive language models, existing guidance methods are prone to decoding errors that cascade during generation and degrade performance.
In this paper we use a guided diffusion model to produce a latent proposal that steers an auto-regressive language model to generate text with desired properties.
arXiv Detail & Related papers (2024-08-08T05:06:22Z)
- Discrete Diffusion Language Model for Long Text Summarization [19.267738861590487]
We introduce a novel semantic-aware noising process that enables Transformer backbones to handle long sequences effectively.
Our approaches achieve state-of-the-art performance on three benchmark summarization datasets: Gigaword, CNN/DailyMail, and Arxiv.
arXiv Detail & Related papers (2024-06-25T09:55:22Z)
- PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model [37.2192243883707]
We propose PLANNER, a model that combines latent semantic diffusion with autoregressive generation to generate fluent text.
Results on semantic generation, text completion and summarization show its effectiveness in generating high-quality long-form text.
arXiv Detail & Related papers (2023-06-05T01:36:39Z)
- Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC [102.64648158034568]
Diffusion models have quickly become the prevailing approach to generative modeling in many domains.
We propose an energy-based parameterization of diffusion models which enables the use of new compositional operators.
We find these samplers lead to notable improvements in compositional generation across a wide set of problems.
arXiv Detail & Related papers (2023-02-22T18:48:46Z)
- SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers [50.90457644954857]
In this work, we apply diffusion models to approach sequence-to-sequence text generation.
We propose SeqDiffuSeq, a text diffusion model for sequence-to-sequence generation.
Experimental results show strong performance on sequence-to-sequence generation in terms of both text quality and inference time.
arXiv Detail & Related papers (2022-12-20T15:16:24Z)
- Text Generation with Text-Editing Models [78.03750739936956]
This tutorial provides a comprehensive overview of text-editing models and current state-of-the-art approaches.
We discuss challenges related to productionization and how these models can be used to mitigate hallucination and bias.
arXiv Detail & Related papers (2022-06-14T17:58:17Z)
- Learning to Model Editing Processes [98.11448946134894]
We propose modeling editing processes: the whole process of iteratively generating a sequence through successive edits.
We form a conceptual framework for the likelihood of multi-step edits, and describe neural models that can learn a generative model of sequences based on these multi-step edits; a likelihood sketch appears after this list.
arXiv Detail & Related papers (2022-05-24T21:32:52Z)
- Text Generation with Deep Variational GAN [16.3190206770276]
We propose a generic GAN-based framework that addresses the problem of mode collapse in a principled way.
We show that our model can generate realistic text with high diversity.
arXiv Detail & Related papers (2021-04-27T21:42:13Z)
- Improving Variational Autoencoder for Text Modelling with Timestep-Wise Regularisation [18.296350505386997]
The Variational Autoencoder (VAE) is a popular and powerful model applied to text modelling to generate diverse sentences.
However, an issue known as posterior collapse (or KL loss vanishing) arises when the VAE is used for text modelling.
We propose a simple, generic architecture called Timestep-Wise Regularisation VAE (TWR-VAE), which can effectively avoid posterior collapse and can be applied to any RNN-based VAE model.
arXiv Detail & Related papers (2020-11-02T17:20:56Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
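The "Learning to Model Editing Processes" entry above mentions a likelihood over multi-step edits; the following is a minimal sketch of what such a factorization can look like, assuming an initial draft x_0 rewritten by edits e_1, ..., e_T. The notation (the apply operator and the marginal over edit trajectories) is illustrative and may not match that paper's exact formulation.

```latex
% Sketch: likelihood of a final sequence x_T as a marginal over edit
% trajectories (x_0, e_1, ..., e_T), with each edit conditioned on the
% current draft. The parameterization in the paper may differ.
\begin{align}
  p(x_T) &= \sum_{x_0,\, e_{1:T}} p(x_0) \prod_{t=1}^{T} p(e_t \mid x_{t-1}),
  \qquad x_t = \mathrm{apply}(x_{t-1}, e_t).
\end{align}
```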