LM-CORE: Language Models with Contextually Relevant External Knowledge
- URL: http://arxiv.org/abs/2208.06458v1
- Date: Fri, 12 Aug 2022 18:59:37 GMT
- Title: LM-CORE: Language Models with Contextually Relevant External Knowledge
- Authors: Jivat Neet Kaur and Sumit Bhatia and Milan Aggarwal and Rachit Bansal
and Balaji Krishnamurthy
- Abstract summary: We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements.
We present LM-CORE -- a general framework to achieve this -- that allows textitdecoupling of the language model training from the external knowledge source.
Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
- Score: 13.451001884972033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large transformer-based pre-trained language models have achieved impressive
performance on a variety of knowledge-intensive tasks and can capture factual
knowledge in their parameters. We argue that storing large amounts of knowledge
in the model parameters is sub-optimal given the ever-growing amounts of
knowledge and resource requirements. We posit that a more efficient alternative
is to provide explicit access to contextually relevant structured knowledge to
the model and train it to use that knowledge. We present LM-CORE -- a general
framework to achieve this -- that allows \textit{decoupling} of the language
model training from the external knowledge source and allows the latter to be
updated without affecting the already trained model. Experimental results show
that LM-CORE, having access to external knowledge, achieves significant and
robust outperformance over state-of-the-art knowledge-enhanced language models
on knowledge probing tasks; can effectively handle knowledge updates; and
performs well on two downstream tasks. We also present a thorough error
analysis highlighting the successes and failures of LM-CORE.
Related papers
- KEHRL: Learning Knowledge-Enhanced Language Representations with Hierarchical Reinforcement Learning [32.086825891769585]
Knowledge-enhanced pre-trained language models (KEPLMs) leverage relation triples from knowledge graphs (KGs)
Previous works treat knowledge enhancement as two independent operations, i.e., knowledge injection and knowledge integration.
This paper jointly addresses the problems of detecting positions for knowledge injection and integrating external knowledge into the model in order to avoid injecting inaccurate or irrelevant knowledge.
arXiv Detail & Related papers (2024-06-24T07:32:35Z) - Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which is to combine multiple knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z) - TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models [31.209774088374374]
This paper introduces TRELM, a Robust and Efficient Pre-training framework for Knowledge-Enhanced Language Models.
We employ a robust approach to inject knowledge triples and employ a knowledge-augmented memory bank to capture valuable information.
We show that TRELM reduces pre-training time by at least 50% and outperforms other KEPLMs in knowledge probing tasks and multiple knowledge-aware language understanding tasks.
arXiv Detail & Related papers (2024-03-17T13:04:35Z) - Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from the external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z) - UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language
Models [100.4659557650775]
We propose a UNified knowledge inTERface, UNTER, to provide a unified perspective to exploit both structured knowledge and unstructured knowledge.
With both forms of knowledge injected, UNTER gains continuous improvements on a series of knowledge-driven NLP tasks.
arXiv Detail & Related papers (2023-05-02T17:33:28Z) - Kformer: Knowledge Injection in Transformer Feed-Forward Layers [107.71576133833148]
We propose a novel knowledge fusion model, namely Kformer, which incorporates external knowledge through the feed-forward layer in Transformer.
We empirically find that simply injecting knowledge into FFN can facilitate the pre-trained language model's ability and facilitate current knowledge fusion methods.
arXiv Detail & Related papers (2022-01-15T03:00:27Z) - DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for
Natural Language Understanding [19.478288026844893]
Knowledge-Enhanced Pre-trained Language Models (KEPLMs) are pre-trained models with relation triples injecting from knowledge graphs to improve language understanding abilities.
Previous studies integrate models with knowledge encoders for representing knowledge retrieved from knowledge graphs.
We propose a novel KEPLM named DKPLM that Decomposes Knowledge injection process of the Pre-trained Language Models in pre-training, fine-tuning and inference stages.
arXiv Detail & Related papers (2021-12-02T08:19:42Z) - Generated Knowledge Prompting for Commonsense Reasoning [53.88983683513114]
We propose generating knowledge statements directly from a language model with a generic prompt format.
This approach improves performance of both off-the-shelf and finetuned language models on four commonsense reasoning tasks.
Notably, we find that a model's predictions can improve when using its own generated knowledge.
arXiv Detail & Related papers (2021-10-15T21:58:03Z) - Language Models as a Knowledge Source for Cognitive Agents [9.061356032792954]
Language models (LMs) are sentence-completion engines trained on massive corpora.
This paper outlines the challenges and opportunities for using language models as a new knowledge source for cognitive systems.
It also identifies possible ways to improve knowledge extraction from language models using the capabilities provided by cognitive systems.
arXiv Detail & Related papers (2021-09-17T01:12:34Z) - CoLAKE: Contextualized Language and Knowledge Embedding [81.90416952762803]
We propose the Contextualized Language and Knowledge Embedding (CoLAKE)
CoLAKE jointly learns contextualized representation for both language and knowledge with the extended objective.
We conduct experiments on knowledge-driven tasks, knowledge probing tasks, and language understanding tasks.
arXiv Detail & Related papers (2020-10-01T11:39:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.