Critic-Driven Decoding for Mitigating Hallucinations in Data-to-text
Generation
- URL: http://arxiv.org/abs/2310.16964v1
- Date: Wed, 25 Oct 2023 20:05:07 GMT
- Title: Critic-Driven Decoding for Mitigating Hallucinations in Data-to-text
Generation
- Authors: Mateusz Lango and Ondřej Dušek
- Abstract summary: Hallucination of text ungrounded in the input is a well-known problem in neural data-to-text generation.
We propose a new way to mitigate hallucinations by combining the probabilistic output of a generator language model with the output of a special "text critic" classifier.
Our method does not need any changes to the underlying LM's architecture or training procedure.
- Score: 5.304395026626743
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hallucination of text ungrounded in the input is a well-known problem in
neural data-to-text generation. Many methods have been proposed to mitigate it,
but they typically require altering model architecture or collecting additional
data, and thus cannot be easily applied to an existing model. In this paper, we
explore a new way to mitigate hallucinations by combining the probabilistic
output of a generator language model (LM) with the output of a special "text
critic" classifier, which guides the generation by assessing the match between
the input data and the text generated so far. Our method does not need any
changes to the underlying LM's architecture or training procedure and can thus
be combined with any model and decoding operating on word probabilities. The
critic does not need any additional training data, using the base LM's training
data and synthetic negative examples. Our experimental results show that our
method improves over the baseline on the WebNLG and OpenDialKG benchmarks.
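As a concrete illustration of the decoding rule described in the abstract, here is a minimal Python sketch: at each step, the generator LM's next-token log-probabilities are re-ranked by the log-probability the critic assigns to the extended prefix matching the input data. The uniform "LM", the keyword-overlap "critic", the toy vocabulary, and the combination weight are all illustrative assumptions, not the paper's trained components.

```python
import math

# Minimal sketch of critic-driven decoding. The "LM" and "critic"
# below are toy placeholders standing in for a real generator LM and
# a trained binary match classifier.

VOCAB = ["Paris", "Berlin", "is", "the", "capital", ".", "<eos>"]

def lm_next_token_logprobs(prefix):
    # Placeholder generator LM: uniform over the toy vocabulary.
    return {w: math.log(1.0 / len(VOCAB)) for w in VOCAB}

def critic_match_prob(data, prefix_plus_token):
    # Placeholder critic: probability that the text generated so far
    # matches the input data; here it simply rewards tokens that
    # appear in the data.
    return 0.9 if prefix_plus_token[-1] in data else 0.1

def critic_driven_step(data, prefix, weight=1.0):
    """Pick the next token by re-ranking the LM distribution:
    score(w) = log p_LM(w | prefix)
             + weight * log p_critic(match | data, prefix + w)
    """
    scores = {
        w: lp + weight * math.log(critic_match_prob(data, prefix + [w]))
        for w, lp in lm_next_token_logprobs(prefix).items()
    }
    return max(scores, key=scores.get)

# Usage: with input data mentioning "Paris", the critic steers the
# (otherwise indifferent) LM toward a grounded continuation.
data = {"Paris", "capital", "France"}
print(critic_driven_step(data, ["The", "capital", "of", "France", "is"]))  # Paris
```

Because the combination operates purely on word probabilities, the same re-ranking can wrap greedy search, beam search, or sampling without modifying the LM itself.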
Related papers
- RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder
for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z)
- Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models [39.37532848489779]
We propose Error Norm Truncation (ENT), a robust enhancement to the standard training objective that truncates noisy data.
We show that ENT improves generation quality over standard training and previous soft and hard truncation methods (a minimal sketch of the truncation idea appears after this list).
arXiv Detail & Related papers (2023-10-02T01:30:27Z)
- LeTI: Learning to Generate from Textual Interactions [60.425769582343506]
We explore LMs' potential to learn from textual interactions (LETI) that not only check their correctness with binary labels but also pinpoint and explain errors in their outputs through textual feedback.
Our focus is the code generation task, where the model produces code based on natural language instructions.
LETI iteratively fine-tunes the model, using the LM objective, on a concatenation of natural language instructions, LM-generated programs, and textual feedback.
arXiv Detail & Related papers (2023-05-17T15:53:31Z)
- CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Data
Limitation With Contrastive Learning [14.637303913878435]
We present a coherence-based contrastive learning model named CoCo to detect possible machine-generated text (MGT) under low-resource scenarios.
To exploit linguistic features, we encode coherence information, in the form of a graph, into the text representation.
Experimental results on two public datasets and two self-constructed datasets show that our approach significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-12-20T15:26:19Z)
- On the Blind Spots of Model-Based Evaluation Metrics for Text Generation [79.01422521024834]
We explore a useful but often neglected methodology for robustness analysis of text generation evaluation metrics.
We design and synthesize a wide range of potential errors and check whether they result in a commensurate drop in the metric scores.
Our experiments reveal interesting insensitivities, biases, or even loopholes in existing metrics.
arXiv Detail & Related papers (2022-12-20T06:24:25Z)
- Understanding Model Robustness to User-generated Noisy Texts [2.958690090551675]
In NLP, model performance often deteriorates with naturally occurring noise, such as spelling errors.
We propose to model the errors statistically from grammatical-error-correction corpora.
arXiv Detail & Related papers (2021-10-14T14:54:52Z)
- A Token-level Reference-free Hallucination Detection Benchmark for
Free-form Text Generation [50.55448707570669]
We propose a novel token-level, reference-free hallucination detection task and an associated annotated dataset named HaDes.
To create this dataset, we first perturb a large number of text segments extracted from English-language Wikipedia, and then verify these with crowd-sourced annotations.
arXiv Detail & Related papers (2021-04-18T04:09:48Z)
- Controlling Hallucinations at Word Level in Data-to-Text Generation [10.59137381324694]
State-of-the-art neural models include misleading statements in their outputs.
We propose a Multi-Branch Decoder which is able to leverage word-level labels to learn the relevant parts of each training instance.
Our model is able to reduce and control hallucinations, while keeping fluency and coherence in generated texts.
arXiv Detail & Related papers (2021-02-04T18:58:28Z)
- Detecting Hallucinated Content in Conditional Neural Sequence Generation [165.68948078624499]
We propose a task to predict whether each token in the output sequence is hallucinated (not contained in the input).
We also introduce a method for learning to detect hallucinations using pretrained language models fine-tuned on synthetic data.
arXiv Detail & Related papers (2020-11-05T00:18:53Z)
- Controlled Hallucinations: Learning to Generate Faithfully from Noisy
Data [1.0914300987810126]
We present a technique to treat such hallucinations as a controllable aspect of the generated text.
On the WikiBio corpus, a particularly noisy dataset, we demonstrate the efficacy of the technique both in an automatic and in a human evaluation.
arXiv Detail & Related papers (2020-10-12T17:25:02Z)
- POINTER: Constrained Progressive Text Generation via Insertion-based
Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
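To make the insertion-based procedure in the POINTER summary above concrete, here is a toy Python sketch of coarse-to-fine generation by parallel insertion. The hand-written INSERTIONS table is a hypothetical stand-in for POINTER's learned insertion model, which predicts insertions with a pre-trained Transformer.

```python
# Toy sketch of insertion-based progressive generation in the spirit
# of POINTER. INSERTIONS is a hypothetical stand-in for the learned
# model that decides which token (if any) to insert between each pair
# of existing tokens.

INSERTIONS = {
    ("noisy", "sources"): "data",
    ("sources", "hallucinations"): "cause",
}

def refine(tokens):
    # One coarse-to-fine pass: propose an insertion between every
    # adjacent pair "in parallel" (each decision looks only at the
    # current sequence, not at insertions made in the same pass).
    out = []
    for left, right in zip(tokens, tokens[1:]):
        out.append(left)
        filler = INSERTIONS.get((left, right))
        if filler is not None:
            out.append(filler)
    out.append(tokens[-1])
    return out

# Start from hard lexical constraints and refine until a fixed point.
seq = ["noisy", "sources", "hallucinations"]
while True:
    refined = refine(seq)
    if refined == seq:
        break
    seq = refined
print(" ".join(seq))  # noisy data sources cause hallucinations
```

Each pass makes the sequence finer, which is what makes the generation process intuitive and interpretable.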
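Similarly, for the Error Norm Truncation entry above, a minimal PyTorch sketch of the truncation idea as summarized there: tokens whose predicted distribution lies too far (in L2 norm) from the one-hot target are dropped from the loss. The threshold value and masking details are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def ent_loss(logits, targets, threshold=1.2):
    """Cross-entropy that skips likely-noisy tokens.

    A token is truncated when the L2 distance between the model's
    predicted distribution and the one-hot target exceeds `threshold`
    (the value 1.2 is an arbitrary illustrative choice).
    """
    # logits: (batch, seq, vocab); targets: (batch, seq)
    probs = logits.softmax(dim=-1)
    one_hot = F.one_hot(targets, num_classes=logits.size(-1)).float()
    error_norm = (probs - one_hot).norm(p=2, dim=-1)   # (batch, seq)
    keep = (error_norm <= threshold).float()

    token_loss = F.cross_entropy(
        logits.transpose(1, 2),  # (batch, vocab, seq), as cross_entropy expects
        targets,
        reduction="none",
    )  # -> (batch, seq)
    return (token_loss * keep).sum() / keep.sum().clamp(min=1.0)

# Usage with random toy data:
logits = torch.randn(2, 5, 10)
targets = torch.randint(0, 10, (2, 5))
print(ent_loss(logits, targets))
```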