Diverse Text Generation via Variational Encoder-Decoder Models with
Gaussian Process Priors
- URL: http://arxiv.org/abs/2204.01227v1
- Date: Mon, 4 Apr 2022 04:09:15 GMT
- Title: Diverse Text Generation via Variational Encoder-Decoder Models with
Gaussian Process Priors
- Authors: Wanyu Du, Jianqiao Zhao, Liwei Wang, Yangfeng Ji
- Abstract summary: We present a novel latent structured variable model to generate high quality texts.
Specifically, we introduce a stochastic function to map deterministic encoder hidden states into random context variables.
To address the learning challenge of Gaussian processes, we propose an efficient variational inference approach.
- Score: 21.71928935339393
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating high quality texts with high diversity is important for many NLG
applications, but current methods mostly focus on building deterministic models
to generate higher quality texts and do not provide many options for promoting
diversity. In this work, we present a novel latent structured variable model to
generate high quality texts by enriching contextual representation learning of
encoder-decoder models. Specifically, we introduce a stochastic function to map
deterministic encoder hidden states into random context variables. The proposed
stochastic function is sampled from a Gaussian process prior so as to (1) provide
an infinite number of joint Gaussian distributions over the random context variables
(diversity-promoting) and (2) explicitly model the dependency between context
variables (accurate-encoding). To address the learning challenge of Gaussian
processes, we propose an efficient variational inference approach to
approximate the posterior distribution of random context variables. We evaluate
our method in two typical text generation tasks: paraphrase generation and text
style transfer. Experimental results on benchmark datasets demonstrate that our
method improves the generation quality and diversity compared with other
baselines.
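The construction above can be pictured with a short, self-contained sketch. This is only a minimal illustration under assumed details, not the authors' released code: the names (GPContextLayer, rbf_kernel), the RBF kernel choice, the diagonal-Gaussian posterior, and the single-sample KL estimate are all assumptions standing in for the paper's exact variational inference.

```python
import torch
import torch.nn as nn


def rbf_kernel(h, lengthscale=1.0, jitter=1e-4):
    """RBF kernel over encoder hidden states h: (T, d) -> (T, T) matrix."""
    sq_dist = torch.cdist(h, h).pow(2)
    K = torch.exp(-0.5 * sq_dist / lengthscale ** 2)
    return K + jitter * torch.eye(h.size(0))  # jitter keeps K positive definite


class GPContextLayer(nn.Module):
    """Map deterministic hidden states into random context variables z."""

    def __init__(self, d_model):
        super().__init__()
        self.q_mu = nn.Linear(d_model, d_model)      # posterior mean
        self.q_logvar = nn.Linear(d_model, d_model)  # posterior log-variance

    def forward(self, h):                            # h: (T, d_model)
        mu, logvar = self.q_mu(h), self.q_logvar(h)
        std = (0.5 * logvar).exp()
        z = mu + std * torch.randn_like(mu)          # reparameterization trick

        # GP prior: each latent dimension is jointly Gaussian over the T
        # positions, N(0, K), so context variables are explicitly coupled.
        K = rbf_kernel(h)
        prior = torch.distributions.MultivariateNormal(
            torch.zeros(h.size(0)), covariance_matrix=K)
        posterior = torch.distributions.Normal(mu, std)

        # Single-sample Monte Carlo estimate of KL(q || p); a stand-in for
        # the paper's variational bound.
        kl = posterior.log_prob(z).sum() - prior.log_prob(z.t()).sum()
        return z, kl
```

A decoder would then attend over z instead of h, and training would minimize the reconstruction loss plus this KL term; sampling different functions from the prior at generation time is what yields diverse outputs.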
Related papers
- Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation [60.493180081319785]
We propose a systematic way to estimate the intrinsic capacity of a truncation sampling method by considering the trade-off between diversity and risk at each decoding step.
Our work provides a comprehensive comparison between existing truncation sampling methods, as well as their recommended parameters as a guideline for users.
arXiv Detail & Related papers (2024-08-24T14:14:32Z)
- Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation [0.20971479389679337]
We introduce adaptive contrastive search, a novel decoding strategy extending contrastive search.
Our findings indicate performance enhancement in both aspects, across different model architectures and datasets.
arXiv Detail & Related papers (2024-07-26T12:23:54Z)
- RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE)
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z)
- Language Model Decoding as Direct Metrics Optimization [87.68281625776282]
Current decoding methods struggle to generate texts that align with human texts across different aspects.
In this work, we frame decoding from a language model as an optimization problem with the goal of strictly matching the expected performance with human texts.
We prove that this induced distribution is guaranteed to improve the perplexity on human texts, which suggests a better approximation to the underlying distribution of human texts.
arXiv Detail & Related papers (2023-10-02T09:35:27Z)
- SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers [50.90457644954857]
In this work, we apply diffusion models to approach sequence-to-sequence text generation.
We propose SeqDiffuSeq, a text diffusion model for sequence-to-sequence generation.
Experimental results illustrate strong performance on sequence-to-sequence generation in terms of both text quality and inference time.
arXiv Detail & Related papers (2022-12-20T15:16:24Z)
- A Contrastive Framework for Neural Text Generation [46.845997620234265]
We show that an underlying reason for model degeneration is the anisotropic distribution of token representations.
We present a contrastive solution: (i) SimCTG, a contrastive training objective to calibrate the model's representation space, and (ii) a decoding method -- contrastive search -- to encourage diversity while maintaining coherence in the generated text (a generic sketch of this decoding rule follows this entry).
arXiv Detail & Related papers (2022-02-13T21:46:14Z)
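Because contrastive search is a concrete decoding rule, a generic sketch of one decoding step may help. It follows the commonly described formulation, namely a degeneration penalty based on maximum cosine similarity to previously generated token representations, rather than the SimCTG authors' exact implementation; the function and argument names are illustrative.

```python
import torch.nn.functional as F


def contrastive_search_step(probs, cand_ids, cand_hidden, prev_hidden, alpha=0.6):
    """Select the next token from k candidates at one decoding step.

    probs:       (k,) model probabilities of the top-k candidate tokens
    cand_ids:    (k,) candidate token ids
    cand_hidden: (k, d) hidden state each candidate would produce if appended
    prev_hidden: (t, d) hidden states of the tokens generated so far
    """
    # Degeneration penalty: maximum cosine similarity to any previous token.
    sim = F.cosine_similarity(
        cand_hidden.unsqueeze(1), prev_hidden.unsqueeze(0), dim=-1)  # (k, t)
    penalty = sim.max(dim=1).values                                  # (k,)
    # Balance model confidence against similarity to the existing context.
    score = (1 - alpha) * probs - alpha * penalty
    return cand_ids[score.argmax()]
```

In practice the k candidates are the most probable tokens under the model, and obtaining cand_hidden requires an extra forward pass per candidate.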
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attentions for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art ROUGE F1 performance.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
- Disentangling Generative Factors in Natural Language with Discrete Variational Autoencoders [0.0]
We argue that continuous variables may not be ideal to model features of textual data, due to the fact that most generative factors in text are discrete.
We propose a Variational Autoencoder based method which models language features as discrete variables and encourages independence between variables for learning disentangled representations.
arXiv Detail & Related papers (2021-09-15T09:10:05Z)
- Generating Diverse Descriptions from Semantic Graphs [38.28044884015192]
We present a graph-to-text model, incorporating a latent variable in an encoder-decoder model, and its use in an ensemble.
We evaluate the models on WebNLG datasets in English and Russian, and show that an ensemble of models produces diverse sets of generated sentences while retaining quality similar to state-of-the-art models.
arXiv Detail & Related papers (2021-08-12T11:00:09Z)
- Informed Sampling for Diversity in Concept-to-Text NLG [8.883733362171034]
We propose an Imitation Learning approach to explore the level of diversity that a language generation model can reliably produce.
Specifically, we augment the decoding process with a meta-classifier trained to distinguish which words at any given timestep will lead to high-quality output.
arXiv Detail & Related papers (2020-04-29T17:43:24Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
VAEs tend to ignore their latent variables when paired with a strong auto-regressive decoder.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space (a generic discrete-bottleneck sketch follows this entry).
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
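As a rough illustration of what a discrete latent bottleneck looks like in code, the snippet below shows a standard vector-quantization layer with a straight-through gradient. This is a generic VQ-VAE-style sketch under assumed hyperparameters, not necessarily the formulation used in that paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DiscreteBottleneck(nn.Module):
    """Quantize a continuous latent to its nearest codebook entry."""

    def __init__(self, num_codes=512, d_latent=128):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, d_latent)

    def forward(self, z):                                 # z: (B, d_latent)
        # Nearest codebook entry for each latent vector.
        dists = torch.cdist(z, self.codebook.weight)      # (B, num_codes)
        codes = dists.argmin(dim=1)                       # (B,)
        z_q = self.codebook(codes)                        # (B, d_latent)
        # Straight-through estimator: the decoder sees z_q, but gradients
        # flow back to the continuous encoder output z.
        z_st = z + (z_q - z).detach()
        # Codebook and commitment losses pull codes and encoder outputs together.
        vq_loss = F.mse_loss(z_q, z.detach()) + 0.25 * F.mse_loss(z, z_q.detach())
        return z_st, codes, vq_loss
```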