Factuality Enhanced Language Models for Open-Ended Text Generation
- URL: http://arxiv.org/abs/2206.04624v2
- Date: Sat, 22 Oct 2022 06:33:06 GMT
- Title: Factuality Enhanced Language Models for Open-Ended Text Generation
- Authors: Nayeon Lee, Wei Ping, Peng Xu, Mostofa Patwary, Pascale Fung, Mohammad
Shoeybi, Bryan Catanzaro
- Abstract summary: We design the FactualityPrompts test set and metrics to measure the factuality of LM generations.
We find that larger LMs are more factual than smaller ones, although a previous study suggests that larger LMs can be less truthful in terms of misconceptions.
We propose a factuality-enhanced training method that uses TopicPrefix for better awareness of facts, together with sentence completion as the training objective.
- Score: 60.27166549575472
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pretrained language models (LMs) are susceptible to generating text
with nonfactual information. In this work, we measure and improve the factual
accuracy of large-scale LMs for open-ended text generation. We design the
FactualityPrompts test set and metrics to measure the factuality of LM
generations. Based on that, we study the factual accuracy of LMs with parameter
sizes ranging from 126M to 530B. Interestingly, we find that larger LMs are
more factual than smaller ones, although a previous study suggests that larger
LMs can be less truthful in terms of misconceptions. In addition, popular
sampling algorithms (e.g., top-p) in open-ended text generation can harm the
factuality due to the "uniform randomness" introduced at every sampling step.
We propose the factual-nucleus sampling algorithm that dynamically adapts the
randomness to improve the factuality of generation while maintaining quality.
Furthermore, we analyze the inefficiencies of the standard training method in
learning correct associations between entities from a factual text corpus (e.g.,
Wikipedia). We propose a factuality-enhanced training method that uses
TopicPrefix for better awareness of facts and sentence completion as the
training objective, which can vastly reduce the factual errors.
Related papers
- FactAlign: Long-form Factuality Alignment of Large Language Models [35.067998820937284]
Large language models have demonstrated significant potential as next-generation information access engines.
We propose FactAlign, a novel alignment framework designed to enhance the factuality of long-form responses.
Our experiments on open-domain prompts and information-seeking questions demonstrate that FactAlign significantly improves the factual accuracy of LLM responses.
arXiv Detail & Related papers (2024-10-02T16:03:13Z) - Know When To Stop: A Study of Semantic Drift in Text Generation [9.76171773410722]
We show that modern LLMs tend to generate correct facts first, then "drift away" and generate incorrect facts later.
This correct-then-incorrect generation pattern suggests that factual accuracy can be improved by knowing when to stop generation.
arXiv Detail & Related papers (2024-04-08T11:25:30Z) - Fine-tuning Language Models for Factuality [96.5203774943198]
The capabilities of large pre-trained language models (LLMs) have led to their widespread use, sometimes even as a replacement for traditional search engines.
Yet language models are prone to making convincing but factually inaccurate claims, often referred to as 'hallucinations'.
In this work, we fine-tune language models to be more factual, without human labeling.
arXiv Detail & Related papers (2023-11-14T18:59:15Z) - Improving Factual Consistency of Text Summarization by Adversarially
Decoupling Comprehension and Embellishment Abilities of LLMs [67.56087611675606]
Large language models (LLMs) can generate summaries that are factually inconsistent with the original articles.
These hallucinations are challenging to detect through traditional methods.
We propose DECENT, an adversarial DEcoupling method that disentangles the comprehension and embellishment abilities of LLMs.
arXiv Detail & Related papers (2023-10-30T08:40:16Z) - RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder
for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z) - LeTI: Learning to Generate from Textual Interactions [60.425769582343506]
We explore LMs' potential to learn from textual interactions (LETI) that not only check their correctness with binary labels but also pinpoint and explain errors in their outputs through textual feedback.
Our focus is the code generation task, where the model produces code based on natural language instructions.
LETI iteratively fine-tunes the model with the LM objective on a concatenation of natural language instructions, LM-generated programs, and textual feedback.
arXiv Detail & Related papers (2023-05-17T15:53:31Z) - An Interpretability Evaluation Benchmark for Pre-trained Language Models [37.16893581395874]
We propose a novel evaluation benchmark providing both English and Chinese annotated data.
It tests LMs' abilities in multiple dimensions: grammar, semantics, knowledge, reasoning, and computation.
It also contains perturbed instances for each original instance, so that rationale consistency under perturbation can serve as the metric for faithfulness.
arXiv Detail & Related papers (2022-07-28T08:28:09Z) - Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z)