Thrust: Adaptively Propels Large Language Models with External Knowledge
- URL: http://arxiv.org/abs/2307.10442v1
- Date: Wed, 19 Jul 2023 20:16:46 GMT
- Title: Thrust: Adaptively Propels Large Language Models with External Knowledge
- Authors: Xinran Zhao, Hongming Zhang, Xiaoman Pan, Wenlin Yao, Dong Yu, Jianshu
Chen
- Abstract summary: Large-scale pre-trained language models (PTLMs) are shown to encode rich knowledge in their model parameters.
The inherent knowledge in PTLMs can be opaque or static, making external knowledge necessary.
We propose the instance-level adaptive propulsion of external knowledge (IAPEK), where we only conduct the retrieval when necessary.
- Score: 58.72867916604562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although large-scale pre-trained language models (PTLMs) are shown to encode
rich knowledge in their model parameters, the inherent knowledge in PTLMs can
be opaque or static, making external knowledge necessary. However, the existing
information retrieval techniques could be costly and may even introduce noisy
and sometimes misleading knowledge. To address these challenges, we propose the
instance-level adaptive propulsion of external knowledge (IAPEK), where we only
conduct the retrieval when necessary. To achieve this goal, we propose
measuring whether a PTLM contains enough knowledge to solve an instance with a
novel metric, Thrust, which leverages the representation distribution of a
small number of seen instances. Extensive experiments demonstrate that thrust
is a good measurement of PTLM models' instance-level knowledgeability.
Moreover, we can achieve significantly higher cost-efficiency with the Thrust
score as the retrieval indicator than the naive usage of external knowledge on
88% of the evaluated tasks with 26% average performance improvement. Such
findings shed light on the real-world practice of knowledge-enhanced LMs with a
limited knowledge-seeking budget due to computation latency or costs.
Related papers
- Evaluating the External and Parametric Knowledge Fusion of Large Language Models [72.40026897037814]
We develop a systematic pipeline for data construction and knowledge infusion to simulate knowledge fusion scenarios.
Our investigation reveals that enhancing parametric knowledge within LLMs can significantly bolster their capability for knowledge integration.
Our findings aim to steer future explorations on harmonizing external and parametric knowledge within LLMs.
arXiv Detail & Related papers (2024-05-29T11:48:27Z) - RECALL: A Benchmark for LLMs Robustness against External Counterfactual
Knowledge [69.79676144482792]
This study aims to evaluate the ability of LLMs to distinguish reliable information from external knowledge.
Our benchmark consists of two tasks, Question Answering and Text Generation, and for each task, we provide models with a context containing counterfactual information.
arXiv Detail & Related papers (2023-11-14T13:24:19Z) - Self-Knowledge Guided Retrieval Augmentation for Large Language Models [59.771098292611846]
Large language models (LLMs) have shown superior performance without task-specific fine-tuning.
Retrieval-based methods can offer non-parametric world knowledge and improve the performance on tasks such as question answering.
Self-Knowledge guided Retrieval augmentation (SKR) is a simple yet effective method which can let LLMs refer to the questions they have previously encountered.
arXiv Detail & Related papers (2023-10-08T04:22:33Z) - Augmenting LLMs with Knowledge: A survey on hallucination prevention [0.0]
This survey delves into the realm of language models (LMs) augmented with the ability to tap into external knowledge sources.
While adhering to the standard objective of predicting missing tokens, these augmented LMs leverage diverse, possibly non-parametric external modules.
arXiv Detail & Related papers (2023-09-28T14:09:58Z) - Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from the external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z) - When Not to Trust Language Models: Investigating Effectiveness of
Parametric and Non-Parametric Memories [58.3421305091187]
This paper aims to understand LMs' strengths and limitations in memorizing factual knowledge.
We find that LMs struggle with less popular factual knowledge, and that scaling fails to appreciably improve memorization of factual knowledge in the long tail.
We devise a simple, yet effective, method for powerful and efficient retrieval-augmented LMs, which retrieves non-parametric memories only when necessary.
arXiv Detail & Related papers (2022-12-20T18:30:15Z) - LM-CORE: Language Models with Contextually Relevant External Knowledge [13.451001884972033]
We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements.
We present LM-CORE -- a general framework to achieve this -- that allows textitdecoupling of the language model training from the external knowledge source.
Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
arXiv Detail & Related papers (2022-08-12T18:59:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.