Understanding Knowledge Integration in Language Models with Graph
Convolutions
- URL: http://arxiv.org/abs/2202.00964v1
- Date: Wed, 2 Feb 2022 11:23:36 GMT
- Title: Understanding Knowledge Integration in Language Models with Graph
Convolutions
- Authors: Yifan Hou, Guoji Fu, Mrinmaya Sachan
- Abstract summary: Knowledge integration (KI) methods aim to incorporate external knowledge into pretrained language models (LMs).
This paper revisits the KI process in these models with an information-theoretic view and shows that KI can be interpreted using a graph convolution operation.
We analyze two well-known knowledge-enhanced LMs: ERNIE and K-Adapter, and find that only a small amount of factual knowledge is integrated in them.
- Score: 28.306949176011763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pretrained language models (LMs) do not capture factual knowledge very well.
This has led to the development of a number of knowledge integration (KI)
methods which aim to incorporate external knowledge into pretrained LMs. Even
though KI methods show some performance gains over vanilla LMs, the
inner workings of these methods are not well understood. For instance, it is
unclear how and what kind of knowledge is effectively integrated into these
models and if such integration may lead to catastrophic forgetting of already
learned knowledge. This paper revisits the KI process in these models with an
information-theoretic view and shows that KI can be interpreted using a graph
convolution operation. We propose a probe model called Graph Convolution
Simulator (GCS) for interpreting knowledge-enhanced LMs and
exposing what kind of knowledge is integrated into these models. We conduct
experiments to verify that our GCS can indeed be used to correctly interpret
the KI process, and we use it to analyze two well-known knowledge-enhanced LMs:
ERNIE and K-Adapter, and find that only a small amount of factual knowledge is
integrated in them. We stratify knowledge in terms of various relation types
and find that ERNIE and K-Adapter integrate different kinds of knowledge to
different extents. Our analysis also shows that simply increasing the size of
the KI corpus may not lead to better KI; fundamental advances may be needed.
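As a point of reference for the graph-convolution view above, the sketch below shows a single standard GCN propagation step, H' = ReLU(D^{-1/2}(A+I)D^{-1/2} H W). This is a generic illustration rather than the authors' GCS probe; the adjacency matrix, entity embeddings, and weight shapes are illustrative assumptions.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolution step: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W).

    A: (n, n) adjacency matrix over entities (illustrative knowledge graph).
    H: (n, d_in) entity representations (e.g. LM embeddings).
    W: (d_in, d_out) weight matrix.
    """
    n = A.shape[0]
    A_hat = A + np.eye(n)                      # add self-loops
    deg = A_hat.sum(axis=1)                    # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))   # D^-1/2
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt   # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)     # aggregate neighbors, then ReLU

# Toy example: 4 entities, 8-dim input embeddings, 8-dim output.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
H = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 8))
print(gcn_layer(A, H, W).shape)  # (4, 8)
```

In the paper's framing, the nodes would roughly correspond to entities in the KI corpus and H to their LM representations; the probe then asks how far the knowledge-enhanced LM behaves like such a convolution over them.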
Related papers
- Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs [55.317267269115845]
Chain-of-Knowledge (CoK) is a comprehensive framework for knowledge reasoning.
CoK includes methodologies for both dataset construction and model learning.
We conduct extensive experiments with KnowReason.
arXiv Detail & Related papers (2024-06-30T10:49:32Z) - Leveraging Pedagogical Theories to Understand Student Learning Process with Graph-based Reasonable Knowledge Tracing [11.082908318943248]
We introduce GRKT, a graph-based reasonable knowledge tracing method to address these issues.
We propose a fine-grained, psychologically grounded three-stage modeling process comprising knowledge retrieval, memory strengthening, and knowledge learning/forgetting.
arXiv Detail & Related papers (2024-06-07T10:14:30Z) - Knowledge Circuits in Pretrained Transformers [47.342682123081204]
The inner workings of how modern large language models store knowledge have long been a subject of intense interest and investigation among researchers.
In this paper, we delve into the computation graph of the language model to uncover the knowledge circuits that are instrumental in articulating specific knowledge.
We evaluate the impact of current knowledge editing techniques on these knowledge circuits, providing deeper insights into the functioning and constraints of these editing methodologies.
arXiv Detail & Related papers (2024-05-28T08:56:33Z) - Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph
Propagation [68.13453771001522]
We propose a multimodal intensive zero-shot learning (ZSL) framework that matches regions of images with corresponding semantic embeddings.
We conduct extensive experiments and evaluate our model on large-scale real-world data.
arXiv Detail & Related papers (2023-06-14T13:07:48Z) - Structured Knowledge Grounding for Question Answering [0.23068481501673416]
We propose to leverage both language and knowledge for knowledge-based question answering with flexibility, breadth of coverage, and structured reasoning.
Specifically, we devise a knowledge construction method that retrieves the relevant context with a dynamic hop.
We also devise a deep fusion mechanism to further bridge the information exchange bottleneck between the language and the knowledge.
arXiv Detail & Related papers (2022-09-17T08:48:50Z) - LM-CORE: Language Models with Contextually Relevant External Knowledge [13.451001884972033]
We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements.
We present LM-CORE -- a general framework to achieve this -- that allows decoupling of the language model training from the external knowledge source.
Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
arXiv Detail & Related papers (2022-08-12T18:59:37Z) - Enhancing Language Models with Plug-and-Play Large-Scale Commonsense [2.1248439796866228]
We study how to enhance language models (LMs) with textual commonsense knowledge.
We propose a plug-and-play method for large-scale commonsense integration without pre-training.
arXiv Detail & Related papers (2021-09-06T16:16:10Z) - Towards a Universal Continuous Knowledge Base [49.95342223987143]
We propose a method for building a continuous knowledge base that can store knowledge imported from multiple neural networks.
Experiments on text classification show promising results.
We import the knowledge from multiple models to the knowledge base, from which the fused knowledge is exported back to a single model.
arXiv Detail & Related papers (2020-12-25T12:27:44Z) - CoLAKE: Contextualized Language and Knowledge Embedding [81.90416952762803]
We propose the Contextualized Language and Knowledge Embedding (CoLAKE).
CoLAKE jointly learns contextualized representation for both language and knowledge with the extended objective.
We conduct experiments on knowledge-driven tasks, knowledge probing tasks, and language understanding tasks.
arXiv Detail & Related papers (2020-10-01T11:39:32Z) - Common Sense or World Knowledge? Investigating Adapter-Based Knowledge
Injection into Pretrained Transformers [54.417299589288184]
We investigate models for complementing the distributional knowledge of BERT with conceptual knowledge from ConceptNet and its corresponding Open Mind Common Sense (OMCS) corpus.
Our adapter-based models substantially outperform BERT on inference tasks that require the type of conceptual knowledge explicitly present in ConceptNet and OMCS.
arXiv Detail & Related papers (2020-05-24T15:49:57Z)
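Since both K-Adapter (analyzed in the main paper) and the last entry above rely on adapter modules for knowledge injection, a minimal bottleneck-adapter sketch may help make the idea concrete. It follows the common down-project / nonlinearity / up-project / residual pattern; the hidden size, bottleneck width, and class name are illustrative assumptions, not the papers' exact architectures.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual."""

    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual connection keeps the pretrained representation intact;
        # only the small adapter weights are trained on the knowledge source.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Toy usage: a batch of 2 sequences, 16 tokens, 768-dim hidden states.
adapter = BottleneckAdapter()
x = torch.randn(2, 16, 768)
print(adapter(x).shape)  # torch.Size([2, 16, 768])
```

Freezing the pretrained transformer and training only such adapter blocks is what lets these methods add knowledge while limiting catastrophic forgetting of what the LM already encodes.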
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.