From Parameters to Prompts: Understanding and Mitigating the Factuality Gap between Fine-Tuned LLMs
- URL: http://arxiv.org/abs/2505.23410v1
- Date: Thu, 29 May 2025 12:59:30 GMT
- Title: From Parameters to Prompts: Understanding and Mitigating the Factuality Gap between Fine-Tuned LLMs
- Authors: Xuan Gong, Hanbo Huang, Shiyu Liang
- Abstract summary: We study the factuality gap that arises when fine-tuning on known versus unknown knowledge. Our results shed light on the interaction between fine-tuning data and test-time prompts.
- Score: 4.447729258258283
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Factual knowledge extraction aims to explicitly extract knowledge parameterized in pre-trained language models for application in downstream tasks. While prior work has investigated the impact of supervised fine-tuning data on the factuality of large language models (LLMs), the underlying mechanism remains poorly understood. We revisit this impact through systematic experiments, with a particular focus on the factuality gap that arises when fine-tuning on known versus unknown knowledge. Our findings show that this gap can be mitigated at the inference stage, either under out-of-distribution (OOD) settings or by using appropriate in-context learning (ICL) prompts (i.e., few-shot learning and Chain of Thought (CoT)). We prove this phenomenon theoretically from the perspective of knowledge graphs, showing that the test-time prompt may diminish or even overshadow the impact of fine-tuning data and play a dominant role in knowledge extraction. Ultimately, our results shed light on the interaction between fine-tuning data and test-time prompts, demonstrating that ICL can effectively compensate for shortcomings in fine-tuning data, and highlighting the need to reconsider the use of ICL prompting as a means to evaluate the effectiveness of fine-tuning data selection methods.
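To make the ICL mitigation concrete, here is a minimal sketch (not from the paper) of how few-shot and CoT prompts might be assembled for factual knowledge extraction at inference time; the demonstration pairs and the query are hypothetical placeholders:

```python
# Minimal sketch: few-shot and CoT prompt construction for factual
# knowledge extraction. Demonstrations and queries are hypothetical.

FEW_SHOT_DEMOS = [
    ("Who wrote 'Pride and Prejudice'?", "Jane Austen"),
    ("What is the capital of Australia?", "Canberra"),
]

def build_few_shot_prompt(question: str) -> str:
    """Prepend known (question, answer) demonstrations to the test query."""
    demos = "\n".join(f"Q: {q}\nA: {a}" for q, a in FEW_SHOT_DEMOS)
    return f"{demos}\nQ: {question}\nA:"

def build_cot_prompt(question: str) -> str:
    """Append a Chain-of-Thought trigger so the model reasons before answering."""
    return f"Q: {question}\nA: Let's think step by step."

print(build_few_shot_prompt("Who discovered penicillin?"))
```

The paper's claim, in these terms, is that a prompt like the one above can dominate knowledge extraction behavior regardless of whether the fine-tuning set contained known or unknown facts.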
Related papers
- Do-PFN: In-Context Learning for Causal Effect Estimation [75.62771416172109]
We show that Prior-data fitted networks (PFNs) can be pre-trained on synthetic data to predict outcomes. Our approach allows for the accurate estimation of causal effects without knowledge of the underlying causal graph.
arXiv Detail & Related papers (2025-06-06T12:43:57Z)
- Data Fusion for Partial Identification of Causal Effects [62.56890808004615]
We propose a novel partial identification framework that enables researchers to answer key questions: is the causal effect positive or negative, and how severe must assumption violations be to overturn this conclusion? We apply our framework to the Project STAR study, which investigates the effect of classroom size on students' third-grade standardized test performance.
arXiv Detail & Related papers (2025-05-30T07:13:01Z)
- UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets [41.0340052199534]
Large Language Models (LLMs) inevitably acquire harmful information during training on massive datasets. Existing unlearning methods focus on forgetting target data while overlooking the crucial impact of logically related knowledge on the effectiveness of unlearning. We propose Unlearning Improvement via Extrapolation (UIPE), a method that removes knowledge highly correlated with the forgetting targets.
arXiv Detail & Related papers (2025-03-06T18:40:00Z)
- Curriculum-style Data Augmentation for LLM-based Metaphor Detection [7.4594050203808395]
We propose a method for metaphor detection by fine-tuning open-source LLMs. Our method achieves state-of-the-art performance across all baselines.
arXiv Detail & Related papers (2024-12-04T02:05:21Z)
- On the Loss of Context-awareness in General Instruction Fine-tuning [101.03941308894191]
We investigate the loss of context awareness after supervised fine-tuning. We find that the performance decline is associated with a bias toward different roles learned during conversational instruction fine-tuning. We propose a metric to identify context-dependent examples from general instruction fine-tuning datasets.
arXiv Detail & Related papers (2024-11-05T00:16:01Z)
- Semantic are Beacons: A Semantic Perspective for Unveiling Parameter-Efficient Fine-Tuning in Knowledge Learning [30.831866499812925]
We propose a semantic perspective to investigate the reasons behind PEFT's limitations in knowledge learning tasks.
PEFT presents a notable risk of pushing the model away from the intended knowledge target.
We introduce a data filtering strategy to exclude data that is detrimental to knowledge learning and a re-weighted learning strategy to make the model attentive to semantic distance.
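As a rough illustration of the re-weighting idea (my own sketch, not the paper's code), per-example losses can be scaled down as the semantic distance from the intended knowledge target grows; the distance scores below are placeholders that would come from an embedding model:

```python
# Hedged sketch: down-weight training examples whose outputs drift far
# from the knowledge target. Distances and losses are placeholder values.

import torch

def reweighted_loss(per_example_loss: torch.Tensor,
                    semantic_distance: torch.Tensor,
                    temperature: float = 1.0) -> torch.Tensor:
    """Smaller semantic distance -> larger weight on that example's loss."""
    weights = torch.softmax(-semantic_distance / temperature, dim=0)
    return (weights * per_example_loss).sum()

losses = torch.tensor([0.9, 1.2, 0.4])      # hypothetical per-example losses
distances = torch.tensor([0.1, 2.0, 0.3])   # hypothetical semantic distances
print(reweighted_loss(losses, distances))
```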
arXiv Detail & Related papers (2024-05-28T15:47:11Z)
- C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
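A hedged sketch of the contrastive-demonstration idea, with hypothetical examples rather than the paper's actual prompts: each demonstration pairs a correct extraction with an explicitly marked incorrect one before the test query:

```python
# Illustrative sketch of contrastive ICL for information extraction.
# The sentences and extraction triples below are hypothetical.

def build_contrastive_demo(sentence: str, correct: str, incorrect: str) -> str:
    return (
        f"Sentence: {sentence}\n"
        f"Correct extraction: {correct}\n"
        f"Incorrect extraction (do not produce): {incorrect}\n"
    )

demo = build_contrastive_demo(
    sentence="Marie Curie was born in Warsaw.",
    correct="(Marie Curie, born_in, Warsaw)",
    incorrect="(Marie Curie, born_in, Paris)",
)

def build_prompt(demos: list[str], test_sentence: str) -> str:
    """Concatenate demonstrations, then pose the test sentence."""
    return "\n".join(demos) + f"\nSentence: {test_sentence}\nCorrect extraction:"

print(build_prompt([demo], "Alan Turing was born in London."))
```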
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
- Knowledge Verification to Nip Hallucination in the Bud [69.79051730580014]
We demonstrate the feasibility of mitigating hallucinations by verifying and minimizing the inconsistency between external knowledge present in the alignment data and the intrinsic knowledge embedded within foundation LLMs.
We propose a novel approach called Knowledge Consistent Alignment (KCA), which employs a well-aligned LLM to automatically formulate assessments based on external knowledge.
We demonstrate the superior efficacy of KCA in reducing hallucinations across six benchmarks, utilizing foundation LLMs of varying backbones and scales.
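One plausible reading of the verification step, sketched under my own assumptions: probe the foundation model with each alignment example's question and keep only examples whose reference answer the model already agrees with (`ask_model` is a hypothetical stand-in for an LLM call):

```python
# Speculative sketch of consistency filtering between alignment data and
# a model's intrinsic knowledge. `ask_model` is a hypothetical placeholder.

def ask_model(question: str) -> str:
    # Placeholder: in practice this would query the foundation LLM.
    return "Canberra"

def filter_consistent(examples: list[dict]) -> list[dict]:
    """Drop examples whose external knowledge conflicts with the model's own."""
    kept = []
    for ex in examples:
        if ask_model(ex["question"]).strip().lower() == ex["answer"].strip().lower():
            kept.append(ex)
    return kept

data = [{"question": "What is the capital of Australia?", "answer": "Canberra"}]
print(filter_consistent(data))
```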
arXiv Detail & Related papers (2024-01-19T15:39:49Z)
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate offline observational data, which is often abundantly available in practice, to improve sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)