Related papers: Learning to Generate Answers with Citations via Factual Consistency Models

URL: http://arxiv.org/abs/2406.13124v2
Date: Mon, 15 Jul 2024 16:04:05 GMT
Title: Learning to Generate Answers with Citations via Factual Consistency Models
Authors: Rami Aly, Zhiqiang Tang, Samson Tan, George Karypis
Abstract summary: Large Language Models (LLMs) frequently hallucinate, impeding their reliability in mission-critical situations. This paper proposes a weakly-supervised fine-tuning method leveraging factual consistency models (FCMs). Focused learning is integrated into the objective, directing the fine-tuning process to emphasise the factual unit tokens.
Score: 28.716998866121923
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) frequently hallucinate, impeding their reliability in mission-critical situations. One approach to address this issue is to provide citations to relevant sources alongside generated content, enhancing the verifiability of generations. However, citing passages accurately in answers remains a substantial challenge. This paper proposes a weakly-supervised fine-tuning method leveraging factual consistency models (FCMs). Our approach alternates between generating texts with citations and supervised fine-tuning with FCM-filtered citation data. Focused learning is integrated into the objective, directing the fine-tuning process to emphasise the factual unit tokens, as measured by an FCM. Results on the ALCE few-shot citation benchmark with various instruction-tuned LLMs demonstrate superior performance compared to in-context learning, vanilla supervised fine-tuning, and state-of-the-art methods, with an average improvement of $34.1$, $15.5$, and $10.5$ citation F$_1$ points, respectively. Moreover, in a domain transfer setting we show that the obtained citation generation ability robustly transfers to unseen datasets. Notably, our citation improvements contribute to the lowest factual error rate across baselines.
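
The two ideas named in the abstract, FCM-based filtering of generated citation data and a token-weighted ("focused") training objective, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the threshold, the toy word-overlap scorer standing in for a real FCM, and the normalisation of the weighted loss are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical types: a sample is (generated_answer, cited_passage).
Sample = Tuple[str, str]


def filter_by_fcm(
    samples: List[Sample],
    fcm_score: Callable[[str, str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Sample]:
    """Keep only samples whose answer the scorer rates as supported by its passage."""
    return [s for s in samples if fcm_score(s[0], s[1]) >= threshold]


def focused_nll(token_logprobs: List[float], token_weights: List[float]) -> float:
    """Weighted mean negative log-likelihood.

    Tokens with larger weights (e.g. tokens an FCM considers factual)
    contribute more to the loss. Dividing by the weight sum is an
    assumption of this sketch.
    """
    total = sum(token_weights)
    if total == 0:
        return 0.0
    weighted = sum(w * lp for w, lp in zip(token_weights, token_logprobs))
    return -weighted / total


def toy_overlap_score(answer: str, passage: str) -> float:
    """Stand-in for an FCM: fraction of answer words that appear in the passage."""
    answer_words = answer.lower().split()
    passage_words = set(passage.lower().split())
    if not answer_words:
        return 0.0
    return sum(w in passage_words for w in answer_words) / len(answer_words)


if __name__ == "__main__":
    passage = "paris is the capital of france"
    samples = [
        ("paris is the capital", passage),
        ("berlin is the capital", passage),
    ]
    kept = filter_by_fcm(samples, toy_overlap_score, threshold=0.9)
    print(kept)
    print(focused_nll([math.log(0.5), math.log(0.25)], [1.0, 3.0]))
```

In a real pipeline the toy scorer would be replaced by a trained factual consistency model, and the log-probabilities would come from the LLM being fine-tuned.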

Related papers

Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models [60.00178316095646]
Sentence embedding is essential for many NLP tasks, with contrastive learning methods achieving strong performance using datasets like NLI. Recent studies leverage large language models (LLMs) to generate sentence pairs, reducing annotation dependency. We propose a method for controlling the generation direction of LLMs in the latent space. Unlike unconstrained generation, the controlled approach ensures meaningful semantic divergence. Experiments on multiple benchmarks demonstrate that our method achieves new SOTA performance with a modest cost in ranking sentence synthesis.
arXiv Detail & Related papers (2025-02-19T12:07:53Z)
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models [51.90867482317985]
SelfCite is a self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for statements in generated responses. Instead of relying on costly and labor-intensive annotations, SelfCite leverages a reward signal provided by the LLM itself through context ablation. The effectiveness of SelfCite is demonstrated by increasing citation F1 up to 5.3 points on the LongBench-Cite benchmark across five long-form question answering tasks.
arXiv Detail & Related papers (2025-02-13T18:55:13Z)
On the Capacity of Citation Generation by Large Language Models [38.47160164251295]
Retrieval-augmented generation (RAG) appears as a promising method to alleviate the "hallucination" problem in large language models (LLMs).
arXiv Detail & Related papers (2024-10-15T03:04:26Z)
Localizing Factual Inconsistencies in Attributable Text Generation [91.981439746404]
We introduce QASemConsistency, a new formalism for localizing factual inconsistencies in attributable text generation. We first demonstrate the effectiveness of the QASemConsistency methodology for human annotation. We then implement several methods for automatically detecting localized factual inconsistencies.
arXiv Detail & Related papers (2024-10-09T22:53:48Z)
Learning Fine-Grained Grounded Citations for Attributed Large Language Models [44.79328335487421]
FRONT is a training framework designed to teach large language models (LLMs) to generate Fine-Grained Grounded Citations. Experiments on the ALCE benchmark demonstrate the efficacy of FRONT in generating superior grounded responses and highly supportive citations.
arXiv Detail & Related papers (2024-08-08T16:28:22Z)
Citekit: A Modular Toolkit for Large Language Model Citation Generation [20.509394248001723]
Large Language Models (LLMs) generate citations in Question-Answering (QA) tasks. There is currently no unified framework to standardize and fairly compare different citation generation methods. We introduce Citekit, an open-source and modular toolkit designed to facilitate the implementation and evaluation of existing citation generation methods.
arXiv Detail & Related papers (2024-08-06T02:13:15Z)
Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models [63.36637269634553]
We present a novel method of further improving performance by requiring models to compare multiple reasoning chains. We find that instruction tuning on DCoT datasets boosts the performance of even smaller, and therefore more accessible, language models.
arXiv Detail & Related papers (2024-07-03T15:01:18Z)
Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation [51.8188846284153]
RAG has been widely adopted to enhance Large Language Models (LLMs). Attributed Text Generation (ATG), which provides citations to support the model's responses in RAG, has attracted growing attention. This paper proposes a fine-grained ATG method called ReClaim (Refer & Claim), which alternates the generation of references and answers step by step.
arXiv Detail & Related papers (2024-07-01T20:47:47Z)
ALiiCE: Evaluating Positional Fine-grained Citation Generation [54.19617927314975]
We propose ALiiCE, the first automatic evaluation framework for fine-grained citation generation. Our framework first parses the sentence claim into atomic claims via dependency analysis and then calculates citation quality at the atomic claim level. We evaluate the positional fine-grained citation generation performance of several Large Language Models on two long-form QA datasets.
arXiv Detail & Related papers (2024-06-19T09:16:14Z)
Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data [48.409306245463]
We develop models that quote verbatim statements from trusted sources in their pre-training data. The core of Quote-Tuning is a fast membership inference function that efficiently verifies text against trusted corpora. Experiments show that Quote-Tuning significantly increases verbatim quotes from high-quality documents by up to 130% relative to base models.
arXiv Detail & Related papers (2024-04-05T02:27:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.