Step-unrolled Denoising Autoencoders for Text Generation
- URL: http://arxiv.org/abs/2112.06749v1
- Date: Mon, 13 Dec 2021 16:00:33 GMT
- Title: Step-unrolled Denoising Autoencoders for Text Generation
- Authors: Nikolay Savinov, Junyoung Chung, Mikolaj Binkowski, Erich Elsen, Aaron van den Oord
- Abstract summary: We propose a new generative model of text, Step-unrolled Denoising Autoencoder (SUNDAE).
SUNDAE is repeatedly applied on a sequence of tokens, starting from random inputs and improving them each time until convergence.
We present a simple new improvement operator that converges in fewer iterations than diffusion methods.
- Score: 17.015573262373742
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we propose a new generative model of text, Step-unrolled
Denoising Autoencoder (SUNDAE), that does not rely on autoregressive models.
Similarly to denoising diffusion techniques, SUNDAE is repeatedly applied on a
sequence of tokens, starting from random inputs and improving them each time
until convergence. We present a simple new improvement operator that converges
in fewer iterations than diffusion methods, while qualitatively producing
better samples on natural language datasets. SUNDAE achieves state-of-the-art
results (among non-autoregressive methods) on the WMT'14 English-to-German
translation task and good qualitative results on unconditional language
modeling on the Colossal Clean Common Crawl dataset and a dataset of Python
code from GitHub. The non-autoregressive nature of SUNDAE opens up
possibilities beyond left-to-right prompted generation, by filling in arbitrary
blank patterns in a template.
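The iterative procedure described in the abstract admits a compact illustration. Below is a minimal sketch of SUNDAE-style unrolled denoising at inference time, assuming a trained `denoiser` network that maps a batch of token ids to per-position logits; the function names, step budget, and convergence test are illustrative assumptions, not the paper's implementation. The second function shows how the same loop supports the template-infilling use case mentioned at the end of the abstract.

```python
import torch

def sundae_sample(denoiser, vocab_size, seq_len, num_steps=16):
    # Start from uniformly random tokens instead of decoding left to right.
    x = torch.randint(vocab_size, (1, seq_len))
    for _ in range(num_steps):
        logits = denoiser(x)  # (1, seq_len, vocab_size)
        # Re-sample every position in parallel from the denoising distribution.
        x_new = torch.distributions.Categorical(logits=logits).sample()
        if torch.equal(x_new, x):
            break  # reached a fixed point of the improvement operator
        x = x_new
    return x

def sundae_infill(denoiser, template, blank_mask, vocab_size, num_steps=16):
    # Clamp the known template tokens; only blank positions are ever
    # re-sampled, which is what enables fill-in-the-blanks generation
    # rather than purely left-to-right prompting.
    x = template.clone()
    x[blank_mask] = torch.randint(vocab_size, (int(blank_mask.sum()),))
    for _ in range(num_steps):
        logits = denoiser(x)
        proposal = torch.distributions.Categorical(logits=logits).sample()
        x = torch.where(blank_mask, proposal, x)
    return x
```

In practice a fixed step budget is usually kept even if convergence is not detected; the paper's contribution is an improvement operator that makes this loop converge in few iterations.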
Related papers
- GEC-DePenD: Non-Autoregressive Grammatical Error Correction with Decoupled Permutation and Decoding [52.14832976759585]
Grammatical error correction (GEC) is an important NLP task that is usually solved with autoregressive sequence-to-sequence models.
We propose a novel non-autoregressive approach to GEC that decouples the architecture into a permutation network and a decoder network.
We show that the resulting network improves over previously known non-autoregressive methods for GEC.
arXiv Detail & Related papers (2023-11-14T14:24:36Z)
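As a toy view of the decoupling above, the sketch below separates the two roles: a (here hand-specified) permutation over source positions reorders or drops tokens, while a separate decoder network would rewrite individual tokens. The names and example are hypothetical and only illustrate the division of labour, not the paper's networks.

```python
def apply_permutation(tokens, order):
    # `order` is the permutation network's output: source indices in target
    # order; indices left out are effectively deleted from the correction.
    return [tokens[i] for i in order]

# Toy reordering correction; a real system predicts `order` non-autoregressively.
src = ["yesterday", "he", "went", "home"]
print(apply_permutation(src, [1, 2, 3, 0]))  # ['he', 'went', 'home', 'yesterday']
```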
- Critic-Driven Decoding for Mitigating Hallucinations in Data-to-text Generation [5.304395026626743]
Hallucination of text ungrounded in the input is a well-known problem in neural data-to-text generation.
We propose a new way to mitigate hallucinations by combining the probabilistic output of a generator language model with the output of a special "text critic".
Our method does not need any changes to the underlying LM's architecture or training procedure.
arXiv Detail & Related papers (2023-10-25T20:05:07Z)
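A minimal sketch of the critic-driven idea above, assuming the generator and critic each expose per-candidate scores at a decoding step; the log-linear fusion and the `weight` knob are illustrative assumptions, not the paper's exact combination rule.

```python
import torch
import torch.nn.functional as F

def critic_guided_step(lm_logits, critic_scores, weight=1.0):
    # Fuse the generator's next-token distribution with the critic's judgment
    # of how well each candidate stays grounded in the input data. Neither
    # model is retrained; only decoding-time scores are combined.
    log_p_lm = F.log_softmax(lm_logits, dim=-1)
    log_p_critic = F.log_softmax(critic_scores, dim=-1)
    return (log_p_lm + weight * log_p_critic).argmax(dim=-1)
```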
- PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model [37.2192243883707]
We propose PLANNER, a model that combines latent semantic diffusion with autoregressive generation to produce fluent text.
Results on semantic generation, text completion and summarization show its effectiveness in generating high-quality long-form text.
arXiv Detail & Related papers (2023-06-05T01:36:39Z)
- SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers [50.90457644954857]
In this work, we apply diffusion models to approach sequence-to-sequence text generation.
We propose SeqDiffuSeq, a text diffusion model for sequence-to-sequence generation.
Experimental results show good performance on sequence-to-sequence generation in terms of both text quality and inference time.
arXiv Detail & Related papers (2022-12-20T15:16:24Z)
- DiffusER: Discrete Diffusion via Edit-based Reconstruction [88.62707047517914]
DiffusER is an edit-based generative model for text based on denoising diffusion models.
It can rival autoregressive models on several tasks spanning machine translation, summarization, and style transfer.
It can also perform other varieties of generation that standard autoregressive models are not well-suited for.
arXiv Detail & Related papers (2022-10-30T16:55:23Z)
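To make "edit-based" concrete, here is a hedged sketch of applying one round of token-level edit operations, the kind of action space such a model denoises over; the op format shown is an assumption for illustration, not DiffusER's exact representation.

```python
def apply_edits(tokens, ops):
    # ops[i] edits tokens[i]: ("KEEP",), ("DELETE",), ("REPLACE", w), or
    # ("INSERT", w), which keeps tokens[i] and appends w directly after it.
    out = []
    for tok, op in zip(tokens, ops):
        if op[0] == "KEEP":
            out.append(tok)
        elif op[0] == "REPLACE":
            out.append(op[1])
        elif op[0] == "INSERT":
            out.extend([tok, op[1]])
        # ("DELETE",) drops the token entirely
    return out

# One reconstruction step: ["a", "cat", "sat"] -> ["the", "cat", "sat", "down"]
print(apply_edits(["a", "cat", "sat"],
                  [("REPLACE", "the"), ("KEEP",), ("INSERT", "down")]))
```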
- Thutmose Tagger: Single-pass neural model for Inverse Text Normalization [76.87664008338317]
Inverse text normalization (ITN) is an essential post-processing step in automatic speech recognition.
We present a dataset preparation method based on the granular alignment of ITN examples.
One-to-one correspondence between tags and input words improves the interpretability of the model's predictions.
arXiv Detail & Related papers (2022-07-29T20:39:02Z)
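The one-tag-per-word design above can be illustrated with a small applier: each spoken-form word receives a tag that keeps it, deletes it, or replaces it with its written form. The tag names and the example are hypothetical, not the paper's tag inventory.

```python
def apply_itn_tags(words, tags):
    out = []
    for word, tag in zip(words, tags):
        if tag == "<SELF>":       # copy the spoken word unchanged
            out.append(word)
        elif tag == "<DELETE>":   # drop the word from the written form
            continue
        else:                     # the tag itself is the written replacement
            out.append(tag)
    return " ".join(out)

# "call me at nine a m" -> "call me at 9 a.m."; each prediction stays
# interpretable because it attaches to exactly one input word.
print(apply_itn_tags(["call", "me", "at", "nine", "a", "m"],
                     ["<SELF>", "<SELF>", "<SELF>", "9", "a.m.", "<DELETE>"]))
```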
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
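Of the three ingredients above, the Dynamic Blocking decoding algorithm is the easiest to sketch: once the decoder emits a token that also appears in the source, block that source token's successor at the next step, steering the output away from a verbatim copy. The function below is a simplified, deterministic rendering of that idea over raw token-id lists; the paper's actual block list is built probabilistically.

```python
import torch

def dynamic_blocking_mask(source_ids, generated_ids, vocab_size):
    # If the previous output token copied source position i, forbid the
    # source token at i + 1 at the current step.
    mask = torch.zeros(vocab_size)
    if generated_ids:
        last = generated_ids[-1]
        for i, tok in enumerate(source_ids[:-1]):
            if tok == last:
                mask[source_ids[i + 1]] = float("-inf")
    return mask  # add to the LM's next-token logits before sampling
```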
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
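The progressive insertion scheme above admits a compact sketch: at every round, the model proposes one token for each gap between current tokens (including both ends), and a special no-insertion symbol leaves a gap empty; generation stops when every gap declines. Here `propose` is an assumed model interface, not POINTER's actual API.

```python
def progressive_insert(tokens, propose, max_rounds=5, none="<none>"):
    for _ in range(max_rounds):
        # One candidate per gap: len(tokens) + 1 slots, ends included.
        candidates = propose(tokens)
        if all(c == none for c in candidates):
            break  # coarse-to-fine refinement has converged
        new_tokens = []
        for gap, tok in enumerate(tokens):
            if candidates[gap] != none:
                new_tokens.append(candidates[gap])  # insert before tokens[gap]
            new_tokens.append(tok)
        if candidates[-1] != none:
            new_tokens.append(candidates[-1])       # insert at the far end
        tokens = new_tokens
    return tokens
```

Starting `tokens` from a few hard keyword constraints and letting each round fill the gaps in parallel is what makes the coarse-to-fine generation process inspectable at every stage.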
- Contextual Text Denoising with Masked Language Models [21.923035129334373]
We propose a new contextual text denoising algorithm based on a ready-to-use masked language model.
The proposed algorithm does not require retraining of the model and can be integrated into any NLP system.
arXiv Detail & Related papers (2019-10-30T18:47:37Z)
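Since the method above builds on an off-the-shelf masked language model, a minimal sketch is possible with standard tooling: mask the word suspected to be noisy and let the MLM propose context-appropriate replacements. This shows only the contextual half; the full algorithm also scores candidates against the corrupted word itself, and the model choice and example sentence here are assumptions.

```python
from transformers import pipeline

# Ready-to-use masked LM; no retraining, as the summary notes.
fill = pipeline("fill-mask", model="bert-base-uncased")

noisy = "I bought a new laptap yesterday."
masked = noisy.replace("laptap", fill.tokenizer.mask_token)
for candidate in fill(masked, top_k=5):
    print(candidate["token_str"], round(candidate["score"], 3))
# Prints context-plausible fillers (e.g. "car", "laptop"); a complete denoiser
# would re-rank these against the corrupted word's spelling.
```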
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content above (including all information) and is not responsible for any consequences.