Investigating Label Bias in Beam Search for Open-ended Text Generation
- URL: http://arxiv.org/abs/2005.11009v1
- Date: Fri, 22 May 2020 05:17:53 GMT
- Title: Investigating Label Bias in Beam Search for Open-ended Text Generation
- Authors: Liang Wang, Jinlong Liu, Jingming Liu
- Abstract summary: In open-ended text generation, beam search is often found to produce repetitive and generic texts.
Standard seq2seq models suffer from label bias due to their locally normalized probability formulation.
By combining locally normalized maximum likelihood estimation and globally normalized sequence-level training, label bias can be reduced with almost no sacrifice in perplexity.
- Score: 8.331919991368366
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Beam search is an effective and widely used decoding algorithm in many
sequence-to-sequence (seq2seq) text generation tasks. However, in open-ended
text generation, beam search is often found to produce repetitive and generic
texts, so sampling-based decoding algorithms such as top-k sampling and nucleus
sampling are often preferred instead. Standard seq2seq models suffer from label
bias due to their locally normalized probability formulation. This paper
provides a body
of empirical evidence that label bias is a major reason for such degenerate
behaviors of beam search. By combining locally normalized maximum likelihood
estimation and globally normalized sequence-level training, label bias can be
reduced with almost no sacrifice in perplexity. To quantitatively measure label
bias, we test the model's ability to discriminate the groundtruth text and a
set of context-agnostic distractors. We conduct experiments on large-scale
response generation datasets. Results show that beam search can produce more
diverse and meaningful texts with our approach, in terms of both automatic and
human evaluation metrics. Our analysis also suggests several directions for
future work towards the grand challenge of open-ended text generation.
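The probe and the training recipe are only sketched in the abstract above, so the following is a minimal, hypothetical illustration in Python (interface and function names such as `token_logprob`, `discrimination_accuracy`, and `mixed_loss` are invented here, not taken from the paper): it scores candidates with a locally normalized model, checks whether the groundtruth response outranks a set of context-agnostic distractors, and mixes the token-level NLL with a globally normalized sequence-level term.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Hypothetical model interface: log p(token | context, prefix) under a
# locally normalized seq2seq model.
TokenLogProb = Callable[[str, Sequence[str], str], float]


def sequence_logprob(model: TokenLogProb, context: str, tokens: Sequence[str]) -> float:
    """Locally normalized sequence score: sum of per-token log-probabilities."""
    return sum(model(context, tokens[:i], tok) for i, tok in enumerate(tokens))


def discrimination_accuracy(
    model: TokenLogProb,
    cases: List[Tuple[str, Sequence[str], List[Sequence[str]]]],
) -> float:
    """Fraction of (context, groundtruth, distractors) cases where the
    groundtruth outscores every context-agnostic distractor; low accuracy
    is one symptom of label bias."""
    hits = 0
    for context, groundtruth, distractors in cases:
        gt = sequence_logprob(model, context, groundtruth)
        if all(gt > sequence_logprob(model, context, d) for d in distractors):
            hits += 1
    return hits / len(cases)


def mixed_loss(gt_score: float, distractor_scores: List[float],
               token_nll: float, lam: float = 1.0) -> float:
    """Token-level NLL plus a globally normalized sequence-level term: a
    softmax over the groundtruth and distractor scores that pushes the model
    to rank the groundtruth first (one plausible instantiation, not
    necessarily the paper's exact objective)."""
    log_z = math.log(sum(math.exp(s) for s in [gt_score] + distractor_scores))
    return token_nll + lam * (log_z - gt_score)
```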
Related papers
- On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approximates human-like quality, the sample size needed for reliable detection increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors, including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z)
- Hard Nominal Example-aware Template Mutual Matching for Industrial Anomaly Detection [74.9262846410559]
Hard Nominal Example-aware Template Mutual Matching (HETMM) aims to construct a robust prototype-based decision boundary, which can precisely distinguish between hard-nominal examples and anomalies.
arXiv Detail & Related papers (2023-03-28T17:54:56Z)
- Challenges in Measuring Bias via Open-Ended Language Generation [1.5552869983952944]
We analyze how specific choices of prompt sets, metrics, automatic tools and sampling strategies affect bias results.
We provide recommendations for reporting biases in open-ended language generation.
arXiv Detail & Related papers (2022-05-23T19:57:15Z)
- A Call for Clarity in Beam Search: How It Works and When It Stops [125.55175954381991]
We introduce a patience factor, a simple modification to this beam decoding implementation, that generalizes the stopping criterion and provides flexibility to the depth of search.
Empirical results demonstrate that adjusting this patience factor improves decoding performance of strong pretrained models on news text summarization and machine translation over diverse language pairs.
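As a rough illustration of what such a patience factor might look like (a sketch based only on the summary above; the paper's actual implementation may differ), the common "stop once beam_size hypotheses are finished" rule can be generalized as follows:

```python
def should_stop(num_finished: int, beam_size: int, patience: float = 1.0) -> bool:
    """Generalized beam-search stopping criterion with a patience factor.

    patience = 1.0 recovers the usual rule of stopping as soon as
    `beam_size` finished hypotheses have been collected; patience > 1.0
    lets the search run deeper before committing.
    """
    return num_finished >= patience * beam_size


# e.g. with beam_size=5 and patience=2.0, decoding continues until
# 10 finished hypotheses have been collected (or a length limit is hit).
```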
arXiv Detail & Related papers (2022-04-11T22:03:44Z)
- Massive-scale Decoding for Text Generation using Lattices [34.2658286826597]
We present a search algorithm to construct lattices encoding a massive number of generation options.
We show that our algorithm encodes hundreds to thousands of diverse options that remain grammatical and high-quality into one linear-sized lattice.
arXiv Detail & Related papers (2021-12-14T18:56:11Z)
- Determinantal Beam Search [75.84501052642361]
Beam search is a go-to strategy for decoding neural sequence models.
In use-cases that call for multiple solutions, a diverse or representative set is often desired.
By posing iterations in beam search as a series of subdeterminant problems, we can turn the algorithm into a diverse subset selection process.
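The summary above only gestures at the mechanism; below is a toy sketch of the generic subset-selection step it alludes to (greedy determinant maximization over a quality-times-similarity kernel), offered as an illustration of the DPP-style idea rather than the paper's exact algorithm:

```python
import numpy as np


def greedy_subdeterminant_select(quality, similarity, k):
    """Greedily pick k candidate indices that approximately maximize the
    determinant of the kernel L = diag(q) @ S @ diag(q), which rewards
    candidates that are both high-quality and mutually dissimilar."""
    q = np.asarray(quality, dtype=float)
    S = np.asarray(similarity, dtype=float)
    L = np.outer(q, q) * S
    selected, remaining = [], list(range(len(q)))
    for _ in range(min(k, len(q))):
        # add the candidate whose inclusion yields the largest subdeterminant
        best = max(remaining,
                   key=lambda i: np.linalg.det(L[np.ix_(selected + [i], selected + [i])]))
        selected.append(best)
        remaining.remove(best)
    return selected


# e.g. 4 candidate continuations: the two near-duplicates (indices 0 and 1)
# are unlikely to be selected together, so the result is [0, 2].
print(greedy_subdeterminant_select(
    quality=[0.9, 0.85, 0.5, 0.4],
    similarity=np.array([[1.0, 0.95, 0.1, 0.2],
                         [0.95, 1.0, 0.1, 0.2],
                         [0.1, 0.1, 1.0, 0.3],
                         [0.2, 0.2, 0.3, 1.0]]),
    k=2))
```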
arXiv Detail & Related papers (2021-06-14T13:01:46Z)
- A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation [50.55448707570669]
We propose a novel token-level, reference-free hallucination detection task and an associated annotated dataset named HaDes.
To create this dataset, we first perturb a large number of text segments extracted from English language Wikipedia, and then verify these with crowd-sourced annotations.
arXiv Detail & Related papers (2021-04-18T04:09:48Z)
- Controlling Hallucinations at Word Level in Data-to-Text Generation [10.59137381324694]
State-of-the-art neural models include misleading statements in their outputs.
We propose a Multi-Branch Decoder which is able to leverage word-level labels to learn the relevant parts of each training instance.
Our model is able to reduce and control hallucinations, while keeping fluency and coherence in generated texts.
arXiv Detail & Related papers (2021-02-04T18:58:28Z)
- If beam search is the answer, what was the question? [78.71330480725668]
We find that beam search enforces uniform information density in text, a property motivated by cognitive science.
We suggest a set of decoding objectives that explicitly enforce this property and find that exact decoding with these objectives alleviates the problems encountered when decoding poorly calibrated language generation models.
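One concrete, hypothetical instance of such an objective is to score hypotheses by log-probability minus a penalty on the variance of per-token surprisals, a simple way to encourage uniform information density (a sketch, not necessarily the exact regularizer used in the paper):

```python
from typing import List


def uid_regularized_score(token_logprobs: List[float], lam: float = 1.0) -> float:
    """Sequence log-probability minus a uniform-information-density penalty
    (the variance of per-token surprisals)."""
    surprisals = [-lp for lp in token_logprobs]
    mean = sum(surprisals) / len(surprisals)
    variance = sum((s - mean) ** 2 for s in surprisals) / len(surprisals)
    return sum(token_logprobs) - lam * variance


# A hypothesis with evenly spread surprisal beats an equally likely one
# whose surprisal is concentrated on a few tokens.
print(uid_regularized_score([-1.0, -1.0, -1.0]))   # -3.0 (zero variance)
print(uid_regularized_score([-0.1, -0.1, -2.8]))   # -4.62 (penalized)
```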
arXiv Detail & Related papers (2020-10-06T11:57:03Z)
- Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity [22.15683400807154]
We use a theoretical analysis of perplexity in top-k, top-p, and temperature sampling to design a feedback-based adaptive top-k text decoding algorithm called mirostat.
Experiments show that for low values of k and p in top-k and top-p sampling, perplexity drops significantly with generated text length.
For large values of k and p, perplexity increases with generated text length, which is correlated with incoherence in the text.
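A heavily simplified feedback loop in the spirit of this idea is sketched below (helper names are hypothetical; the published mirostat algorithm additionally derives an adaptive k from Zipf statistics of the distribution): after each token, the truncation threshold is nudged so that the observed surprise tracks a target value.

```python
import math
import random
from typing import Callable, Dict, List


def sample_below_surprise(probs: Dict[str, float], mu: float):
    """Sample among tokens whose surprise (-log2 p) is below the threshold mu,
    falling back to the single most probable token if none qualifies.
    Returns the token and its observed surprise."""
    allowed = {t: p for t, p in probs.items() if -math.log2(p) < mu}
    if not allowed:
        tok = max(probs, key=probs.get)
        return tok, -math.log2(probs[tok])
    r, acc = random.random() * sum(allowed.values()), 0.0
    for tok, p in allowed.items():
        acc += p
        if acc >= r:
            return tok, -math.log2(p)
    return tok, -math.log2(p)  # numerical edge case: last allowed token


def feedback_decode(next_probs: Callable[[List[str]], Dict[str, float]],
                    steps: int, target_surprise: float, lr: float = 0.1) -> List[str]:
    """Feedback-controlled decoding: after each token, adjust the surprise
    threshold mu so that observed surprise tracks the target (keeping the
    perplexity of the generated text roughly constant)."""
    mu, out = 2.0 * target_surprise, []
    for _ in range(steps):
        tok, surprise = sample_below_surprise(next_probs(out), mu)
        mu -= lr * (surprise - target_surprise)  # feedback update
        out.append(tok)
    return out
```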
arXiv Detail & Related papers (2020-07-29T17:22:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.