Context-faithful Prompting for Large Language Models
- URL: http://arxiv.org/abs/2303.11315v2
- Date: Mon, 23 Oct 2023 03:25:13 GMT
- Title: Context-faithful Prompting for Large Language Models
- Authors: Wenxuan Zhou, Sheng Zhang, Hoifung Poon, Muhao Chen
- Abstract summary: Large language models (LLMs) encode parametric knowledge about world facts.
Their reliance on parametric knowledge may cause them to overlook contextual cues, leading to incorrect predictions in context-sensitive NLP tasks.
We assess and enhance LLMs' contextual faithfulness in two aspects: knowledge conflict and prediction with abstention.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) encode parametric knowledge about world facts
and have shown remarkable performance in knowledge-driven NLP tasks. However,
their reliance on parametric knowledge may cause them to overlook contextual
cues, leading to incorrect predictions in context-sensitive NLP tasks (e.g.,
knowledge acquisition tasks). In this paper, we seek to assess and enhance
LLMs' contextual faithfulness in two aspects: knowledge conflict and prediction
with abstention. We demonstrate that LLMs' faithfulness can be significantly
improved using carefully designed prompting strategies. In particular, we
identify opinion-based prompts and counterfactual demonstrations as the most
effective methods. Opinion-based prompts reframe the context as a narrator's
statement and inquire about the narrator's opinions, while counterfactual
demonstrations use instances containing false facts to improve faithfulness in
knowledge conflict situations. Neither technique requires additional training.
We conduct experiments on three datasets of two standard NLP tasks, machine
reading comprehension and relation extraction, and the results demonstrate
significant improvement in faithfulness to contexts. Code and data are released
at https://github.com/wzhouad/context-faithful-llm.
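To make the two strategies concrete, here is a minimal Python sketch of how such prompts could be assembled. The templates, helper names, and examples are illustrative assumptions based on the abstract; the authors' actual prompts are in the linked repository.

```python
# Sketch of the two prompting strategies from the abstract. Templates are
# assumptions; see https://github.com/wzhouad/context-faithful-llm for the
# authors' actual prompts.

def opinion_based_prompt(context: str, question: str) -> str:
    """Reframe the context as a narrator's statement and ask for the
    narrator's opinion, steering the model to answer from the context
    rather than from its parametric knowledge."""
    return (
        f'Bob said, "{context}"\n'
        f"Q: {question} in Bob's opinion?\n"
        "A:"
    )

def counterfactual_demonstration(context: str, question: str, answer: str) -> str:
    """An in-context example whose answer is supported only by a false
    (counterfactual) context, signaling that answers must follow the
    given context even when it conflicts with world knowledge."""
    return f"Context: {context}\nQ: {question}\nA: {answer}"

# Hypothetical usage: the demonstration's context deliberately contradicts
# a well-known fact, and its answer follows the context anyway.
demo = counterfactual_demonstration(
    context="The capital of France is Marseille.",
    question="What is the capital of France?",
    answer="Marseille",
)
query = opinion_based_prompt(
    context="The 2022 Winter Olympics were held in Beijing.",
    question="Where were the 2022 Winter Olympics held",
)
print(demo + "\n\n" + query)
```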
Related papers
- On the loss of context-awareness in general instruction fine-tuning
Post-training methods such as supervised fine-tuning (SFT) on instruction-response pairs can harm existing capabilities learned during pretraining.
We propose two methods to mitigate the loss of context awareness in instruct models: post-hoc attention steering on user prompts and conditional instruction fine-tuning with a context-dependency indicator.
arXiv Detail & Related papers (2024-11-05T00:16:01Z)
- Recording for Eyes, Not Echoing to Ears: Contextualized Spoken-to-Written Conversion of ASR Transcripts
We propose a Contextualized Spoken-to-Written conversion (CoS2W) task to address ASR and grammar errors.
This task naturally matches the in-context learning capabilities of Large Language Models (LLMs).
arXiv Detail & Related papers (2024-08-19T03:53:48Z)
- Explainable Few-shot Knowledge Tracing
We propose a cognition-guided framework that can track the student knowledge from a few student records while providing natural language explanations.
Experimental results from three widely used datasets show that LLMs can perform comparable or superior to competitive deep knowledge tracing methods.
arXiv Detail & Related papers (2024-05-23T10:07:21Z)
- Enhancing Contextual Understanding in Large Language Models through Contrastive Decoding
Large language models (LLMs) tend to inadequately integrate input context during text generation.
We introduce a novel approach integrating contrastive decoding with adversarial irrelevant passages as negative samples.
arXiv Detail & Related papers (2024-05-04T20:38:41Z)
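The summary above names the ingredients (contrastive decoding, adversarial irrelevant passages as negatives) but not the formula. Below is a minimal sketch of a generic contrastive-decoding score under that setup; the paper's exact scoring rule and hyperparameters may differ.

```python
import numpy as np

def contrastive_next_token_scores(
    logits_with_context: np.ndarray,     # next-token logits given the relevant passage
    logits_with_distractor: np.ndarray,  # logits given an adversarial irrelevant passage
    alpha: float = 1.0,
) -> np.ndarray:
    """Reward tokens whose score rises when the relevant context is present,
    and penalize tokens that are just as likely under an irrelevant passage."""
    return (1 + alpha) * logits_with_context - alpha * logits_with_distractor

# Hypothetical usage with a toy 5-token vocabulary.
ctx = np.array([2.0, 0.5, 0.1, -1.0, 0.0])
distractor = np.array([1.9, 0.6, 0.1, -1.0, 0.0])
next_token = int(np.argmax(contrastive_next_token_scores(ctx, distractor)))
```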
- C-ICL: Contrastive In-context Learning for Information Extraction
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
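A minimal sketch of what contrastive (correct plus incorrect) demonstrations could look like for an extraction task; the templates and the triple format are illustrative assumptions, not the paper's prompts.

```python
# Pair each correct demonstration with a plausible-but-wrong one so the
# model also sees what *not* to extract. All examples are hypothetical.

def contrastive_demo(text: str, good: str, bad: str) -> str:
    return (
        f"Text: {text}\n"
        f"Correct extraction: {good}\n"
        f"Incorrect extraction: {bad}\n"
    )

demos = [
    contrastive_demo(
        text="Marie Curie was born in Warsaw.",
        good="(Marie Curie, born_in, Warsaw)",
        bad="(Marie Curie, born_in, Paris)",
    ),
]
prompt = "\n".join(demos) + "\nText: Ada Lovelace was born in London.\nCorrect extraction:"
```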
- Uncertainty Quantification for In-Context Learning of Large Language Models
In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs).
We propose a novel formulation and corresponding estimation method to quantify both types of uncertainties.
The proposed method offers an unsupervised way to understand the prediction of in-context learning in a plug-and-play fashion.
arXiv Detail & Related papers (2024-02-15T18:46:24Z)
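The summary does not name the two uncertainty types; a common decomposition, assumed here, splits total predictive entropy into aleatoric and epistemic parts by resampling the in-context demonstrations, and may differ from the paper's formulation.

```python
import numpy as np

def decompose_uncertainty(prob_matrix: np.ndarray):
    """prob_matrix has shape (num_demo_sets, num_classes): each row is the
    model's predictive distribution under a different sampled set of
    in-context demonstrations.

    total     = entropy of the averaged distribution
    aleatoric = average per-set entropy (data noise)
    epistemic = total - aleatoric (sensitivity to the demonstrations)
    """
    eps = 1e-12
    mean_p = prob_matrix.mean(axis=0)
    total = -np.sum(mean_p * np.log(mean_p + eps))
    aleatoric = -np.mean(np.sum(prob_matrix * np.log(prob_matrix + eps), axis=1))
    return total, aleatoric, total - aleatoric

# Hypothetical usage: 3 demonstration sets, 2 classes.
p = np.array([[0.9, 0.1], [0.6, 0.4], [0.8, 0.2]])
total, aleatoric, epistemic = decompose_uncertainty(p)
```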
- Blending Reward Functions via Few Expert Demonstrations for Faithful and Accurate Knowledge-Grounded Dialogue Generation
We leverage reinforcement learning algorithms to overcome these challenges by introducing a novel reward function.
Our reward function combines an accuracy metric and a faithfulness metric to provide a balanced quality judgment of generated responses.
arXiv Detail & Related papers (2023-11-02T02:42:41Z)
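A minimal sketch of blending the two metrics into a single scalar reward; the convex combination and the weight are assumptions, not the paper's formula.

```python
def blended_reward(accuracy: float, faithfulness: float, lam: float = 0.5) -> float:
    """Combine an accuracy metric and a faithfulness metric into one reward.
    The convex mix and lam=0.5 are placeholder choices."""
    return lam * accuracy + (1.0 - lam) * faithfulness

# Hypothetical scores for one generated response.
reward = blended_reward(accuracy=0.8, faithfulness=0.6)
```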
- ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction
Large language models (LLMs) have demonstrated remarkable results in various natural language processing (NLP) tasks with in-context learning.
We propose a simple but effective in-context learning framework called ICL-D3IE.
Specifically, we extract the most difficult and distinct segments from hard training documents as hard demonstrations.
arXiv Detail & Related papers (2023-03-09T06:24:50Z)
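A minimal sketch of selecting "difficult and distinct" segments as hard demonstrations; the error and distance functions are placeholders, not the paper's method.

```python
def select_hard_demos(segments, error_fn, distance_fn, k=4, min_dist=0.5):
    """Rank segments by difficulty, then greedily keep the hardest ones that
    are sufficiently distinct from those already chosen."""
    ranked = sorted(segments, key=error_fn, reverse=True)  # hardest first
    chosen = []
    for seg in ranked:
        if all(distance_fn(seg, c) > min_dist for c in chosen):
            chosen.append(seg)
        if len(chosen) == k:
            break
    return chosen

# Toy usage with placeholder difficulty (length) and distance (length gap).
segments = ["total due 45.00", "invoice no: 123", "qty 2"]
hard = select_hard_demos(segments, error_fn=len,
                         distance_fn=lambda a, b: abs(len(a) - len(b)) / 10)
```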