Which Discriminator for Cooperative Text Generation?
- URL: http://arxiv.org/abs/2204.11586v1
- Date: Mon, 25 Apr 2022 12:16:02 GMT
- Title: Which Discriminator for Cooperative Text Generation?
- Authors: Antoine Chaffin, Thomas Scialom, Sylvain Lamprier, Jacopo Staiano,
Benjamin Piwowarski, Ewa Kijak, Vincent Claveau
- Abstract summary: A growing field of interest tries to leverage external information in the decoding process so that the generated texts have desired properties.
A solution is to use a classifier at each generation step, resulting in a cooperative environment.
In this paper, we examine three families of (transformer-based) discriminators for this specific task of cooperative decoding.
- Score: 25.090455367573988
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Language models generate texts by successively predicting probability
distributions for next tokens given past ones. A growing field of interest
tries to leverage external information in the decoding process so that the
generated texts have desired properties, such as being more natural, non-toxic,
faithful, or having a specific writing style. A solution is to use a classifier
at each generation step, resulting in a cooperative environment where the
classifier guides the decoding of the language model distribution towards
relevant texts for the task at hand. In this paper, we examine three families
of (transformer-based) discriminators for this specific task of cooperative
decoding: bidirectional, left-to-right and generative ones. We evaluate the
pros and cons of these different types of discriminators for cooperative
generation, exploring their respective accuracy on classification tasks along with
their impact on the resulting sample quality and computational performance. We
also provide the code for a batched implementation of the powerful cooperative
decoding strategy used in our experiments, Monte Carlo Tree Search, working
with each discriminator for Natural Language Generation.
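To make the cooperative setup concrete, below is a minimal sketch of classifier-guided decoding: at each step the language model proposes its top-k next tokens, and an external discriminator re-scores the candidate continuations before one is kept. This is an illustrative greedy re-ranking variant, not the authors' released batched MCTS implementation; the model names, the top-k heuristic, and the alpha mixing weight are placeholder assumptions for the example.

```python
# Illustrative sketch of one step of classifier-guided ("cooperative") decoding.
# NOT the authors' released batched MCTS code: it greedily re-ranks the language
# model's top-k candidates with a discriminator. Model names, top_k and alpha are
# placeholder choices for this example.
import torch
from transformers import (AutoModelForCausalLM, AutoModelForSequenceClassification,
                          AutoTokenizer)

lm_tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
clf_name = "distilbert-base-uncased-finetuned-sst-2-english"  # stand-in discriminator
clf_tok = AutoTokenizer.from_pretrained(clf_name)
clf = AutoModelForSequenceClassification.from_pretrained(clf_name)

@torch.no_grad()
def cooperative_step(prefix: str, top_k: int = 10, alpha: float = 1.0) -> str:
    """Extend `prefix` by one token, mixing LM probability with a discriminator score."""
    ids = lm_tok(prefix, return_tensors="pt").input_ids
    next_logits = lm(ids).logits[0, -1]                   # logits for the next token
    top_p, top_i = torch.softmax(next_logits, dim=-1).topk(top_k)

    # Score each candidate continuation with the discriminator
    # (probability of class 1, i.e. "has the desired property").
    candidates = [prefix + lm_tok.decode(tok_id) for tok_id in top_i]
    enc = clf_tok(candidates, return_tensors="pt", padding=True, truncation=True)
    clf_p = torch.softmax(clf(**enc).logits, dim=-1)[:, 1]

    # Combine the two signals and keep the best-scoring token.
    scores = top_p * clf_p.pow(alpha)
    best_id = top_i[scores.argmax()]
    return prefix + lm_tok.decode(best_id)

text = "The movie was"
for _ in range(20):
    text = cooperative_step(text)
print(text)
```

In the paper itself, the discriminator's signal is instead plugged into Monte Carlo Tree Search rather than a single-step re-ranking, so the classifier can guide exploration several tokens ahead.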
Related papers
- Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation [0.20971479389679337]
We introduce adaptive contrastive search, a novel decoding strategy extending contrastive search.
Our findings indicate performance enhancement in both aspects, across different model architectures and datasets.
arXiv Detail & Related papers (2024-07-26T12:23:54Z)
- Discriminating Human-authored from ChatGPT-Generated Code Via Discernable Feature Analysis [2.9398911304923447]
This paper specifically aims to distinguish code generated by ChatGPT from that authored by humans.
We devise a dataset cleansing technique, which employs temporal and spatial segmentation, to mitigate the dearth of datasets.
To further enrich data resources, we employ "code transformation," "feature transformation," and "feature customization" techniques, generating an extensive dataset comprising 10,000 lines of ChatGPT-generated code.
arXiv Detail & Related papers (2023-06-26T03:15:06Z)
- Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
- Improving Multi-task Generalization Ability for Neural Text Matching via Prompt Learning [54.66399120084227]
Recent state-of-the-art neural text matching models based on pre-trained language models (PLMs) do not generalize well to different tasks.
We adopt a specialization-generalization training strategy and refer to it as Match-Prompt.
In the specialization stage, descriptions of different matching tasks are mapped to only a few prompt tokens.
In the generalization stage, the text matching model learns essential matching signals by being trained on diverse matching tasks.
arXiv Detail & Related papers (2022-04-06T11:01:08Z)
- On Decoding Strategies for Neural Text Generators [73.48162198041884]
We study the interaction between language generation tasks and decoding strategies.
We measure changes in attributes of generated text as a function of both decoding strategy and task.
Our results reveal both previously-observed and surprising findings.
arXiv Detail & Related papers (2022-03-29T16:25:30Z)
- Massive-scale Decoding for Text Generation using Lattices [34.2658286826597]
We present a search algorithm to construct lattices encoding a massive number of generation options.
We show that our algorithm encodes hundreds to thousands of diverse options that remain grammatical and high-quality into one linear-sized lattice.
arXiv Detail & Related papers (2021-12-14T18:56:11Z)
- Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation [85.32991360774447]
Natural language generation (NLG) spans a broad range of tasks, each serving specific objectives.
We propose a unifying perspective based on the nature of information change in NLG tasks.
We develop a family of interpretable metrics that are suitable for evaluating key aspects of different NLG tasks.
arXiv Detail & Related papers (2021-09-14T01:00:42Z)
- Lexically-constrained Text Generation through Commonsense Knowledge Extraction and Injection [62.071938098215085]
We focus on the Commongen benchmark, wherein the aim is to generate a plausible sentence for a given set of input concepts.
We propose strategies for enhancing the semantic correctness of the generated text.
arXiv Detail & Related papers (2020-12-19T23:23:40Z)
- Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder [104.25716317141321]
We propose an approach that automatically finds evidence for an event from a large text corpus, and leverages the evidence to guide the generation of inferential texts.
Our approach provides state-of-the-art performance on both Event2Mind and ATOMIC datasets.
arXiv Detail & Related papers (2020-06-15T02:59:52Z)
- Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning [18.354352985591305]
Code summarization generates a brief natural language description of a given source code snippet, while code retrieval fetches relevant source code given a natural language query.
Recent studies have combined these two tasks to improve their performance.
We propose a novel end-to-end model for the two tasks by introducing an additional code generation task.
arXiv Detail & Related papers (2020-02-24T12:26:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.