A Contrastive Framework for Neural Text Generation
- URL: http://arxiv.org/abs/2202.06417v1
- Date: Sun, 13 Feb 2022 21:46:14 GMT
- Title: A Contrastive Framework for Neural Text Generation
- Authors: Yixuan Su, Tian Lan, Yan Wang, Dani Yogatama, Lingpeng Kong, and Nigel Collier
- Abstract summary: We show that an underlying reason for model degeneration is the anisotropic distribution of token representations.
We present a contrastive solution: (i) SimCTG, a contrastive training objective to calibrate the model's representation space, and (ii) a decoding method -- contrastive search -- to encourage diversity while maintaining coherence in the generated text.
- Score: 46.845997620234265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text generation is of great importance to many natural language processing
applications. However, maximization-based decoding methods (e.g. beam search)
of neural language models often lead to degenerate solutions -- the generated
text is unnatural and contains undesirable repetitions. Existing approaches
introduce stochasticity via sampling or modify training objectives to decrease
probabilities of certain tokens (e.g., unlikelihood training). However, they
often lead to solutions that lack coherence. In this work, we show that an
underlying reason for model degeneration is the anisotropic distribution of
token representations. We present a contrastive solution: (i) SimCTG, a
contrastive training objective to calibrate the model's representation space,
and (ii) a decoding method -- contrastive search -- to encourage diversity
while maintaining coherence in the generated text. Extensive experiments and
analyses on three benchmarks from two languages demonstrate that our proposed
approach outperforms state-of-the-art text generation methods as evaluated by
both human and automatic metrics.
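The abstract names contrastive search but does not spell out its scoring rule. The sketch below is a minimal illustration, assuming the commonly cited form in which each top-k candidate is scored by its model probability minus a penalty equal to its maximum cosine similarity to the representations of earlier tokens. The function names, the weight `alpha`, and the toy vectors are hypothetical and are not taken from the authors' code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_search_step(candidates, context_states, alpha=0.6):
    """Pick one token from a list of top-k candidates.

    candidates: list of (token, probability, hidden_state) triples
    context_states: hidden states of the tokens generated so far
    alpha: assumed weight trading model confidence against the penalty
    """
    best_token, best_score = None, -math.inf
    for token, prob, state in candidates:
        # Penalty: similarity to the closest previous token representation.
        penalty = max(cosine(state, h) for h in context_states)
        score = (1 - alpha) * prob - alpha * penalty
        if score > best_score:
            best_token, best_score = token, score
    return best_token


# Toy data (made up for illustration only).
context_states = [[1.0, 0.0], [0.9, 0.1]]
candidates = [
    ("the", 0.6, [1.0, 0.0]),    # likely, but almost identical to the context
    ("river", 0.3, [0.0, 1.0]),  # less likely, but dissimilar to the context
]

chosen = contrastive_search_step(candidates, context_states)
greedy = contrastive_search_step(candidates, context_states, alpha=0.0)
print(chosen, greedy)
```

With `alpha=0.0` the penalty vanishes and the step reduces to greedy selection, so the two calls show how the penalty can steer decoding away from a candidate whose representation repeats the context.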
Related papers
- Detecting Machine-Generated Long-Form Content with Latent-Space Variables [54.07946647012579]
Existing zero-shot detectors primarily focus on token-level distributions, which are vulnerable to real-world domain shifts.
We propose a more robust method that incorporates abstract elements, such as event transitions, as key deciding factors to detect machine versus human texts.
arXiv Detail & Related papers (2024-10-04T18:42:09Z)
- Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation [0.20971479389679337]
We introduce adaptive contrastive search, a novel decoding strategy extending contrastive search.
Our findings indicate performance enhancement in both aspects, across different model architectures and datasets.
arXiv Detail & Related papers (2024-07-26T12:23:54Z)
- Vector-Quantized Prompt Learning for Paraphrase Generation [18.40940464497253]
This paper proposes to generate diverse and high-quality paraphrases by exploiting the pre-trained models with instance-dependent prompts.
Extensive experiments demonstrate that the proposed method achieves new state-of-the-art results on three benchmark datasets.
arXiv Detail & Related papers (2023-11-25T07:13:06Z)
- Language Model Decoding as Direct Metrics Optimization [87.68281625776282]
Current decoding methods struggle to generate texts that align with human texts across different aspects.
In this work, we frame decoding from a language model as an optimization problem with the goal of strictly matching the expected performance with human texts.
We prove that this induced distribution is guaranteed to improve the perplexity on human texts, which suggests a better approximation to the underlying distribution of human texts.
arXiv Detail & Related papers (2023-10-02T09:35:27Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Language Model Evaluation in Open-ended Text Generation [0.76146285961466]
We study different evaluation metrics that have been proposed to evaluate quality, diversity and consistency of machine-generated text.
From there, we propose a practical pipeline to evaluate language models in open-ended generation tasks.
arXiv Detail & Related papers (2021-08-08T06:16:02Z)
- Neural Text Generation with Part-of-Speech Guided Softmax [82.63394952538292]
We propose using linguistic annotation, i.e., part-of-speech (POS), to guide the text generation.
We show that our proposed methods can generate more diverse text while maintaining comparable quality.
arXiv Detail & Related papers (2021-05-08T08:53:16Z)
- Informed Sampling for Diversity in Concept-to-Text NLG [8.883733362171034]
We propose an Imitation Learning approach to explore the level of diversity that a language generation model can reliably produce.
Specifically, we augment the decoding process with a meta-classifier trained to distinguish which words at any given timestep will lead to high-quality output.
arXiv Detail & Related papers (2020-04-29T17:43:24Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
VAEs tend to ignore latent variables with a strong auto-regressive decoder.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)