Understanding Knowledge Integration in Language Models with Graph
Convolutions
- URL: http://arxiv.org/abs/2202.00964v1
- Date: Wed, 2 Feb 2022 11:23:36 GMT
- Title: Understanding Knowledge Integration in Language Models with Graph
Convolutions
- Authors: Yifan Hou, Guoji Fu, Mrinmaya Sachan
- Abstract summary: Knowledge integration (KI) methods aim to incorporate external knowledge into pretrained language models (LMs).
This paper revisits the KI process in these models with an information-theoretic view and shows that KI can be interpreted using a graph convolution operation.
We analyze two well-known knowledge-enhanced LMs: ERNIE and K-Adapter, and find that only a small amount of factual knowledge is integrated in them.
- Score: 28.306949176011763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pretrained language models (LMs) do not capture factual knowledge very well.
This has led to the development of a number of knowledge integration (KI)
methods which aim to incorporate external knowledge into pretrained LMs. Even
though KI methods show some performance gains over vanilla LMs, the
inner workings of these methods are not well understood. For instance, it is
unclear how and what kind of knowledge is effectively integrated into these
models and if such integration may lead to catastrophic forgetting of already
learned knowledge. This paper revisits the KI process in these models with an
information-theoretic view and shows that KI can be interpreted using a graph
convolution operation. We propose a probe model called Graph Convolution
Simulator (GCS) for interpreting knowledge-enhanced LMs and
exposing what kind of knowledge is integrated into these models. We conduct
experiments to verify that our GCS can indeed be used to correctly interpret
the KI process, and we use it to analyze two well-known knowledge-enhanced LMs:
ERNIE and K-Adapter, and find that only a small amount of factual knowledge is
integrated in them. We stratify knowledge in terms of various relation types
and find that ERNIE and K-Adapter integrate different kinds of knowledge to
different extents. Our analysis also shows that simply increasing the size of
the KI corpus may not lead to better KI; fundamental advances may be needed.
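As a point of reference for the graph-convolution view above, the sketch below shows a single standard GCN propagation step, H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W). This is a generic illustration rather than the authors' GCS probe; the adjacency matrix, entity embeddings, and weight shapes are illustrative assumptions.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolution step: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W).

    A: (n, n) adjacency matrix over entities (illustrative knowledge graph).
    H: (n, d_in) entity representations (e.g. LM embeddings).
    W: (d_in, d_out) weight matrix.
    """
    n = A.shape[0]
    A_hat = A + np.eye(n)                      # add self-loops
    deg = A_hat.sum(axis=1)                    # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))   # D^-1/2
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt   # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)     # aggregate neighbors, then ReLU

# Toy example: 4 entities, 8-dim input embeddings, 8-dim output.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
H = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 8))
print(gcn_layer(A, H, W).shape)  # (4, 8)
```

In the paper's framing, the nodes would roughly correspond to entities in the KI corpus and H to their LM representations; the probe then asks how far the knowledge-enhanced LM behaves like such a convolution over them.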
Related papers
- Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs [55.317267269115845]
Chain-of-Knowledge (CoK) is a comprehensive framework for knowledge reasoning.
CoK includes methodologies for both dataset construction and model learning.
We conduct extensive experiments with KnowReason.
arXiv Detail & Related papers (2024-06-30T10:49:32Z) - Leveraging Pedagogical Theories to Understand Student Learning Process with Graph-based Reasonable Knowledge Tracing [11.082908318943248]
We introduce GRKT, a graph-based reasonable knowledge tracing method to address these issues.
We propose a fine-grained, psychologically grounded three-stage modeling process comprising knowledge retrieval, memory strengthening, and knowledge learning/forgetting.
arXiv Detail & Related papers (2024-06-07T10:14:30Z) - Knowledge Circuits in Pretrained Transformers [47.342682123081204]
The inner workings of how modern large language models store knowledge have long been a subject of intense interest and investigation among researchers.
In this paper, we delve into the computation graph of the language model to uncover the knowledge circuits that are instrumental in articulating specific knowledge.
We evaluate the impact of current knowledge editing techniques on these knowledge circuits, providing deeper insights into the functioning and constraints of these editing methodologies.
arXiv Detail & Related papers (2024-05-28T08:56:33Z) - Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph
Propagation [68.13453771001522]
We propose a multimodal intensive zero-shot learning (ZSL) framework that matches regions of images with corresponding semantic embeddings.
We conduct extensive experiments and evaluate our model on large-scale real-world data.
arXiv Detail & Related papers (2023-06-14T13:07:48Z) - Structured Knowledge Grounding for Question Answering [0.23068481501673416]
We propose to leverage both language and knowledge for knowledge-based question answering with flexibility, breadth of coverage, and structured reasoning.
Specifically, we devise a knowledge construction method that retrieves the relevant context with a dynamic hop.
We also devise a deep fusion mechanism to further bridge the information exchange bottleneck between the language and the knowledge.
arXiv Detail & Related papers (2022-09-17T08:48:50Z) - LM-CORE: Language Models with Contextually Relevant External Knowledge [13.451001884972033]
We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements.
We present LM-CORE -- a general framework to achieve this -- that allows decoupling of the language model training from the external knowledge source.
Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
arXiv Detail & Related papers (2022-08-12T18:59:37Z) - Enhancing Language Models with Plug-and-Play Large-Scale Commonsense [2.1248439796866228]
We study how to enhance language models (LMs) with textual commonsense knowledge.
We propose a plug-and-play method for large-scale commonsense integration without pre-training.
arXiv Detail & Related papers (2021-09-06T16:16:10Z) - Towards a Universal Continuous Knowledge Base [49.95342223987143]
We propose a method for building a continuous knowledge base that can store knowledge imported from multiple neural networks.
Experiments on text classification show promising results.
We import the knowledge from multiple models to the knowledge base, from which the fused knowledge is exported back to a single model.
arXiv Detail & Related papers (2020-12-25T12:27:44Z) - CoLAKE: Contextualized Language and Knowledge Embedding [81.90416952762803]
We propose the Contextualized Language and Knowledge Embedding (CoLAKE).
CoLAKE jointly learns contextualized representation for both language and knowledge with the extended objective.
We conduct experiments on knowledge-driven tasks, knowledge probing tasks, and language understanding tasks.
arXiv Detail & Related papers (2020-10-01T11:39:32Z) - Common Sense or World Knowledge? Investigating Adapter-Based Knowledge
Injection into Pretrained Transformers [54.417299589288184]
We investigate models for complementing the distributional knowledge of BERT with conceptual knowledge from ConceptNet and its corresponding Open Mind Common Sense (OMCS) corpus.
Our adapter-based models substantially outperform BERT on inference tasks that require the type of conceptual knowledge explicitly present in ConceptNet and OMCS.
arXiv Detail & Related papers (2020-05-24T15:49:57Z)
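Since both K-Adapter (analyzed in the main paper) and the last entry above rely on adapter modules for knowledge injection, a minimal bottleneck-adapter sketch may help make the idea concrete. It follows the common down-project / nonlinearity / up-project / residual pattern; the hidden size, bottleneck width, and class name are illustrative assumptions, not the papers' exact architectures.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual."""

    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual connection keeps the pretrained representation intact;
        # only the small adapter weights are trained on the knowledge source.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Toy usage: a batch of 2 sequences, 16 tokens, 768-dim hidden states.
adapter = BottleneckAdapter()
x = torch.randn(2, 16, 768)
print(adapter(x).shape)  # torch.Size([2, 16, 768])
```

Freezing the pretrained transformer and training only such adapter blocks is what lets these methods add knowledge while limiting catastrophic forgetting of what the LM already encodes.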
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.