Establishing Knowledge Preference in Language Models
- URL: http://arxiv.org/abs/2407.13048v1
- Date: Wed, 17 Jul 2024 23:16:11 GMT
- Title: Establishing Knowledge Preference in Language Models
- Authors: Sizhe Zhou, Sha Li, Yu Meng, Yizhu Jiao, Heng Ji, Jiawei Han
- Abstract summary: Language models are known to encode a great amount of factual knowledge through pretraining.
Such knowledge might be insufficient to cater to user requests.
When answering questions about ongoing events, the model should use recent news articles to update its response.
When some facts are edited in the model, the updated facts should override all prior knowledge learned by the model.
- Score: 80.70632813935644
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language models are known to encode a great amount of factual knowledge through pretraining. However, such knowledge might be insufficient to cater to user requests, requiring the model to integrate external knowledge sources and adhere to user-provided specifications. When answering questions about ongoing events, the model should use recent news articles to update its response; when asked to provide recommendations, the model should prioritize user specifications over retrieved product reviews; when some facts are edited in the model, the updated facts should override all prior knowledge learned by the model even if they conflict. In all of the cases above, the model faces a decision between its own parametric knowledge, (retrieved) contextual knowledge, and user instruction knowledge. In this paper, we (1) unify such settings into the problem of knowledge preference and define a three-level preference hierarchy over these knowledge sources; (2) compile a collection of existing datasets (IfQA, MQuAKE, and MRQA) covering a combination of settings (with/without user specifications, with/without context documents) to systematically evaluate how well models obey the intended knowledge preference; and (3) propose a dataset synthesis method that composes diverse question-answer pairs with user assumptions and related context to directly fine-tune LMs for instilling the hierarchy of knowledge. We demonstrate that a 7B model, fine-tuned on only a few thousand examples automatically generated by our proposed method, achieves superior performance (more than 18% improvement across all evaluation benchmarks) in adhering to the desired knowledge preference hierarchy.
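To make the hierarchy concrete, here is a minimal Python sketch of how the three levels might be encoded at inference time. The names (`Query`, `build_prompt`) and the prompt wording are our own illustrations; the paper itself instills the preference via fine-tuning rather than prompting alone.

```python
# Illustrative sketch (not the paper's code): composing an input that encodes
# the three-level knowledge preference hierarchy:
#   user instruction > retrieved context > parametric knowledge.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Query:
    question: str
    user_assumption: Optional[str] = None   # e.g. "Assume the CEO of X is Y."
    context_passage: Optional[str] = None   # e.g. a retrieved news article

def build_prompt(q: Query) -> str:
    """Order the sections to mirror the preference hierarchy: a stated user
    assumption overrides the retrieved passage, which in turn overrides
    whatever the model memorized during pretraining."""
    parts = []
    if q.user_assumption:
        parts.append(f"User assumption (highest priority): {q.user_assumption}")
    if q.context_passage:
        parts.append(f"Retrieved context: {q.context_passage}")
    parts.append(
        "Answer the question. Prefer the user assumption over the context, "
        "and prefer the context over your own memorized knowledge."
    )
    parts.append(f"Question: {q.question}")
    return "\n\n".join(parts)

# Example: the user assumption should win over both the passage and the
# model's parametric memory (all names here are hypothetical).
prompt = build_prompt(Query(
    question="Who is the CEO of Acme Corp?",
    user_assumption="Assume Jane Doe became CEO of Acme Corp last week.",
    context_passage="Acme Corp's long-time CEO is John Smith.",
))
print(prompt)
```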
Related papers
- Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models [51.20499954955646]
Large language models (LLMs) acquire vast amounts of knowledge from extensive text corpora during the pretraining phase.
In later stages such as fine-tuning and inference, the model may encounter knowledge not covered in the initial training.
We propose a two-stage fine-tuning strategy to improve the model's overall test accuracy and knowledge retention.
arXiv Detail & Related papers (2024-10-08T08:35:16Z)
- Towards Better Generalization in Open-Domain Question Answering by Mitigating Context Memorization [67.92796510359595]
Open-domain Question Answering (OpenQA) aims at answering factual questions with an external large-scale knowledge corpus.
It is still unclear how well an OpenQA model can transfer to completely new knowledge domains.
We introduce Corpus-Invariant Tuning (CIT), a simple but effective training strategy, to mitigate knowledge over-memorization.
arXiv Detail & Related papers (2024-04-02T05:44:50Z)
- Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
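A hedged sketch of the "edits as notes" idea this summary describes: edits are stored as natural-language notes, only notes judged relevant to the question may override the answer, and several notes can combine. All names here (`EditMemory`, `relevance`, `answer`) are illustrative, not EREN's actual API.

```python
from typing import Callable

def relevance(question: str, note: str) -> float:
    # Toy relevance scorer: word overlap. A real system would use a trained model.
    q, n = set(question.lower().split()), set(note.lower().split())
    return len(q & n) / max(len(q | n), 1)

class EditMemory:
    def __init__(self, threshold: float = 0.2):
        self.notes: list[str] = []
        self.threshold = threshold

    def add_edit(self, note: str) -> None:
        """Record an edit as plain text, e.g. 'The capital of Foo is now Bar.'"""
        self.notes.append(note)

    def relevant_notes(self, question: str) -> list[str]:
        """Filtering by relevance is what keeps syntactically similar but
        semantically unrelated questions from triggering an edit."""
        return [n for n in self.notes if relevance(question, n) >= self.threshold]

def answer(question: str, memory: EditMemory,
           base_lm: Callable[[str], str],
           edited_lm: Callable[[str, list[str]], str]) -> str:
    notes = memory.relevant_notes(question)
    if notes:
        return edited_lm(question, notes)   # condition on all applicable edits
    return base_lm(question)                # fall back to parametric knowledge
```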
arXiv Detail & Related papers (2024-03-26T06:57:23Z)
- Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering [26.34649731975005]
Retriever-augmented instruction-following models are attractive alternatives to fine-tuned approaches for question answering (QA).
While the model responses tend to be natural and fluent, the additional verbosity makes traditional QA evaluation metrics unreliable for accurately quantifying model performance.
We use both automatic and human evaluation to assess these models along two dimensions: 1) how well they satisfy the user's information need (correctness), and 2) whether they produce a response based on the provided knowledge (faithfulness).
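As a toy illustration of the two dimensions (not the paper's actual metrics), simple token-overlap scores make the distinction concrete; real evaluations use stronger matchers and human judgment.

```python
def _tokens(text: str) -> set[str]:
    return set(text.lower().split())

def correctness(response: str, gold_answer: str) -> float:
    """Does the response contain the information the user asked for?"""
    gold = _tokens(gold_answer)
    return len(gold & _tokens(response)) / max(len(gold), 1)

def faithfulness(response: str, provided_knowledge: str) -> float:
    """Is the response grounded in the knowledge the model was given?"""
    resp = _tokens(response)
    return len(resp & _tokens(provided_knowledge)) / max(len(resp), 1)
```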
arXiv Detail & Related papers (2023-07-31T17:41:00Z)
- Task Oriented Conversational Modelling With Subjective Knowledge [0.0]
DSTC-11 proposes a three-stage pipeline consisting of knowledge-seeking turn detection, knowledge selection, and response generation.
We propose entity retrieval methods that yield more accurate and faster knowledge search.
Preliminary results show a 4% improvement in exact match score on the knowledge selection task.
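A minimal sketch of such a three-stage pipeline, with deliberately naive stand-ins for each stage (none of this is the authors' implementation):

```python
# Stage functions are toy placeholders that only illustrate the data flow.

def detect_knowledge_seeking(turn: str) -> bool:
    """Stage 1: flag turns that ask about subjective knowledge (toy heuristic)."""
    return "?" in turn

def select_knowledge(turn: str, kb: dict[str, list[str]]) -> list[str]:
    """Stage 2: entity retrieval first narrows the candidate set (the speed-up
    the summary credits), then candidates are ranked by overlap with the turn."""
    entities = [e for e in kb if e.lower() in turn.lower()]
    candidates = [s for e in entities for s in kb[e]]
    words = set(turn.lower().split())
    return sorted(candidates,
                  key=lambda s: -len(words & set(s.lower().split())))

def respond(turn: str, kb: dict[str, list[str]]) -> str:
    """Stage 3: generate a response; here just a template over the top snippet."""
    if detect_knowledge_seeking(turn):
        snippets = select_knowledge(turn, kb)
        if snippets:
            return f"Based on guest feedback: {snippets[0]}"
    return "Sure, let me help with that."
```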
arXiv Detail & Related papers (2023-03-30T20:23:49Z)
- The KITMUS Test: Evaluating Knowledge Integration from Multiple Sources in Natural Language Understanding Systems [87.3207729953778]
We evaluate state-of-the-art coreference resolution models on our dataset.
Several models struggle to reason on the fly over knowledge observed both at pretraining time and at inference time.
Still, even the best-performing models seem to have difficulty reliably integrating knowledge presented only at inference time.
arXiv Detail & Related papers (2022-12-15T23:26:54Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Joint Reasoning on Hybrid-knowledge sources for Task-Oriented Dialog [12.081212540168055]
We present a modified version of the MultiWOZ-based dataset prepared by SeKnow to demonstrate how current methods suffer significant degradation in performance.
In line with recent work exploiting pre-trained language models, we fine-tune a BART-based model using prompts for the task of querying knowledge sources.
We demonstrate that our model is robust to perturbations to knowledge modality (source of information) and that it can fuse information from structured as well as unstructured knowledge to generate responses.
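A small sketch of how structured and unstructured knowledge might be fused into a single input for a seq2seq model such as BART; the serialization format and example data below are our assumptions, not the paper's exact scheme.

```python
def serialize_structured(record: dict[str, str]) -> str:
    # Flatten a DB row into "key: value" pairs.
    return "; ".join(f"{k}: {v}" for k, v in record.items())

def build_dialog_input(history: list[str], record: dict[str, str], passage: str) -> str:
    # Prefix each knowledge modality so the model can tell the sources apart.
    return (
        "dialog: " + " | ".join(history)
        + " structured: " + serialize_structured(record)
        + " unstructured: " + passage
    )

example = build_dialog_input(
    history=["Is the Alpha Hotel pet friendly?"],           # hypothetical turn
    record={"name": "Alpha Hotel", "area": "centre"},       # hypothetical DB row
    passage="Guests report that small pets are welcome.",   # hypothetical FAQ snippet
)
print(example)
```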
arXiv Detail & Related papers (2022-10-13T18:49:59Z)
- Variational Learning for Unsupervised Knowledge Grounded Dialogs [6.761874595503588]
Recent methods for knowledge grounded dialogs generate responses by incorporating information from an external textual document.
We develop a variational approach to the above technique wherein we instead maximize the Evidence Lower Bound (ELBO).
To the best of our knowledge, we are the first to apply variational training for open-scale unsupervised knowledge grounded dialog systems.
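For reference, with a latent knowledge-selection variable $z$ (the choice of grounding document), the ELBO takes the standard form below; the notation is ours, not necessarily the paper's.

```latex
% Standard ELBO maximized in place of the intractable marginal log-likelihood:
\log p(y \mid x)
  \;\ge\;
  \mathbb{E}_{q(z \mid x, y)}\!\left[ \log p(y \mid x, z) \right]
  \;-\;
  \mathrm{KL}\!\left( q(z \mid x, y) \,\|\, p(z \mid x) \right)
```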
arXiv Detail & Related papers (2021-11-23T13:41:03Z)
- REALM: Retrieval-Augmented Language Model Pre-Training [37.3178586179607]
We augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia.
For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner.
We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA).
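REALM's retrieve-then-predict factorization marginalizes the output over retrieved documents $z$ from the corpus $\mathcal{Z}$ (in practice only the top-k documents are summed over), with the retriever scoring documents by an embedding inner product:

```latex
p(y \mid x) \;=\; \sum_{z \in \mathcal{Z}} p(y \mid z, x)\, p(z \mid x),
\qquad
p(z \mid x) \;\propto\;
  \exp\!\big( \mathrm{Embed}_{\text{input}}(x)^{\top} \mathrm{Embed}_{\text{doc}}(z) \big)
```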
arXiv Detail & Related papers (2020-02-10T18:40:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.