Context-aware Decoding Reduces Hallucination in Query-focused
Summarization
- URL: http://arxiv.org/abs/2312.14335v2
- Date: Sun, 31 Dec 2023 22:31:48 GMT
- Title: Context-aware Decoding Reduces Hallucination in Query-focused
Summarization
- Authors: Zhichao Xu
- Abstract summary: We conduct a large-scale study on one recently proposed decoding method -- Context-aware Decoding (CAD).
Experiments with eight different language models show that, performance-wise, CAD improves QFS quality by reducing factuality errors/hallucinations.
The code implementation, based on the Hugging Face library, is made available.
- Score: 2.8554857235549753
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Query-focused summarization (QFS) aims to provide a summary of a
single document or multiple documents that can satisfy the information needs
of a given query. It is useful for various real-world applications, such as
abstractive snippet generation or the more recent retrieval-augmented
generation (RAG). A prototypical QFS pipeline consists of a retriever (sparse
or dense retrieval) and a generator (usually a large language model). However,
applying large language models (LLMs) potentially leads to hallucinations,
especially when the evidence contradicts the LLM's prior beliefs. There has
been growing interest in developing new decoding methods to improve generation
quality and reduce hallucination. In this work, we conduct a large-scale
reproducibility study of one recently proposed decoding method --
Context-aware Decoding (CAD). In addition to replicating CAD's experiments on
news summarization datasets, we include experiments on QFS datasets and
conduct a more rigorous analysis of computational complexity and
hyperparameter sensitivity. Experiments with eight different language models
show that, performance-wise, CAD improves QFS quality by (1) reducing
factuality errors/hallucinations while (2) mostly retaining the match of
lexical patterns as measured by ROUGE scores, albeit at the cost of increased
inference-time FLOPs and reduced decoding speed. The code implementation,
based on the Hugging Face library, is available at
https://github.com/zhichaoxu-shufe/context-aware-decoding-qfs
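For intuition, the following is a minimal sketch of CAD-style greedy decoding
on top of a Hugging Face causal LM. It is an illustrative re-implementation
under stated assumptions, not the authors' released code: the model name
("gpt2"), the prompt format, and the adjustment strength alpha are placeholder
choices. CAD decodes from softmax[(1 + alpha) * logits(y | context, query) -
alpha * logits(y | query)], amplifying the shift that the evidence induces on
the next-token distribution.

# Minimal sketch of CAD greedy decoding (illustrative, not the paper's release).
# "gpt2" and alpha=0.5 are placeholder assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def cad_generate(context: str, query: str, alpha: float = 0.5,
                 max_new_tokens: int = 64) -> str:
    # Two parallel input streams: one that sees the evidence, one that does not.
    ids_ctx = tok(context + "\n" + query, return_tensors="pt").input_ids
    ids_noctx = tok(query, return_tensors="pt").input_ids
    out = []
    with torch.no_grad():
        for _ in range(max_new_tokens):
            logits_ctx = model(ids_ctx).logits[:, -1, :]
            logits_noctx = model(ids_noctx).logits[:, -1, :]
            # Contrastive adjustment: amplify what the context changes.
            adjusted = (1 + alpha) * logits_ctx - alpha * logits_noctx
            next_id = adjusted.argmax(dim=-1, keepdim=True)
            if next_id.item() == tok.eos_token_id:
                break
            out.append(next_id.item())
            # Append the chosen token to both streams before the next step.
            ids_ctx = torch.cat([ids_ctx, next_id], dim=-1)
            ids_noctx = torch.cat([ids_noctx, next_id], dim=-1)
    return tok.decode(out)

The two forward passes per decoded token are exactly where the increased
inference-time FLOPs and reduced decoding speed come from, and alpha is the
hyperparameter whose sensitivity the study analyzes.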
Related papers
- BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression [91.23933111083389]
BRIEF (Bridging Retrieval and Inference through Evidence Fusion) is a lightweight approach that performs query-aware multi-hop reasoning.
Based on our synthetic data built entirely by open-source models, BRIEF generates more concise summaries.
arXiv Detail & Related papers (2024-10-20T04:24:16Z)
- LargePiG: Your Large Language Model is Secretly a Pointer Generator [15.248956952849259]
We introduce relevance hallucination and factuality hallucination as a new typology for the hallucination problems brought by query generation based on Large Language Models (LLMs).
We propose an effective way to separate content from form in LLM-generated queries: the factual knowledge extracted and integrated from the inputs is preserved, while the syntactic structure, including function words, is composed using the LLM's powerful linguistic capabilities.
arXiv Detail & Related papers (2024-10-15T07:41:40Z)
- LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models [96.64960606650115]
LongHalQA is an LLM-free hallucination benchmark that comprises 6K long and complex hallucination texts.
It features GPT4V-generated hallucinatory data that are well aligned with real-world scenarios.
arXiv Detail & Related papers (2024-10-13T18:59:58Z)
- Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph [83.90988015005934]
Uncertainty quantification (UQ) is a critical component of machine learning (ML) applications.
We introduce a novel benchmark that implements a collection of state-of-the-art UQ baselines.
We conduct a large-scale empirical investigation of UQ and normalization techniques across nine tasks, and identify the most promising approaches.
arXiv Detail & Related papers (2024-06-21T20:06:31Z)
- RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation [42.82192656794179]
Large Language Models (LLMs) exhibit remarkable capabilities but are prone to generating inaccurate or hallucinatory responses.
This limitation stems from their reliance on vast pretraining datasets, making them susceptible to errors in unseen scenarios.
Retrieval-Augmented Generation (RAG) addresses this by incorporating external, relevant documents into the response generation process.
arXiv Detail & Related papers (2024-03-31T08:58:54Z)
- RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z)
- In-context Autoencoder for Context Compression in a Large Language Model [70.7621953091318]
We propose the In-context Autoencoder (ICAE) to compress a long context into short compact memory slots.
ICAE is first pretrained using both autoencoding and language modeling objectives on massive text data.
arXiv Detail & Related papers (2023-07-13T17:59:21Z)
- RLTF: Reinforcement Learning from Unit Test Feedback [17.35361167578498]
Reinforcement Learning from Unit Test Feedback (RLTF) is a novel online RL framework that uses multi-granularity unit test feedback to refine code LLMs.
Our approach generates data in real time during training and simultaneously utilizes fine-grained feedback signals to guide the model towards producing higher-quality code.
arXiv Detail & Related papers (2023-07-10T05:18:18Z)
- UnifieR: A Unified Retriever for Large-Scale Retrieval [84.61239936314597]
Large-scale retrieval aims to recall relevant documents from a huge collection given a query.
Recent retrieval methods based on pre-trained language models (PLM) can be coarsely categorized into either dense-vector or lexicon-based paradigms.
We propose a new learning framework, UnifieR, which unifies dense-vector and lexicon-based retrieval in one model with a dual-representing capability (a generic sketch contrasting the two paradigms follows this list).
arXiv Detail & Related papers (2022-05-23T11:01:59Z)
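To make the dense-vector vs. lexicon-based distinction in the UnifieR entry
concrete, here is a generic, self-contained sketch that scores documents under
both paradigms and linearly fuses the scores. It illustrates the two paradigms
in general, not UnifieR's unified dual-representation model; the toy corpus,
the hashed stand-in "encoder", and the fusion weight are all illustrative
assumptions.

# Generic sketch: lexicon-based vs. dense-vector scoring with linear fusion.
# Illustrative only; not UnifieR's actual model.
import math
from collections import Counter

docs = ["context-aware decoding reduces hallucination",
        "retrieval augmented generation with large language models",
        "unit test feedback for code generation"]
query = "reduce hallucination in generation"

def lexical_score(query, doc, docs, k1=1.5, b=0.75):
    """BM25-style lexical score (simplified; k1 and b are assumed defaults)."""
    q_terms, d_terms = query.split(), doc.split()
    tf = Counter(d_terms)
    avgdl = sum(len(d.split()) for d in docs) / len(docs)
    score = 0.0
    for t in q_terms:
        df = sum(t in d.split() for d in docs)  # document frequency of term t
        if df == 0:
            continue
        idf = math.log((len(docs) - df + 0.5) / (df + 0.5) + 1.0)
        denom = tf[t] + k1 * (1 - b + b * len(d_terms) / avgdl)
        score += idf * tf[t] * (k1 + 1) / denom
    return score

def toy_embed(text, dim=64):
    """Stand-in for a learned dense encoder: hashed, L2-normalized bag of words."""
    v = [0.0] * dim
    for t in text.split():
        v[hash(t) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dense_score(query, doc):
    q, d = toy_embed(query), toy_embed(doc)
    return sum(a * b for a, b in zip(q, d))  # cosine similarity of unit vectors

# Linear fusion of the two paradigms (the 0.5/0.5 weighting is an assumption).
for doc in docs:
    s = 0.5 * lexical_score(query, doc, docs) + 0.5 * dense_score(query, doc)
    print(f"{s:.3f}  {doc}")

In practice the dense side would use a learned PLM encoder rather than the
hashed stand-in, and the fusion weight would be tuned on held-out queries.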
This list is automatically generated from the titles and abstracts of the papers in this site.