Knowledge Graph Fusion for Language Model Fine-tuning
- URL: http://arxiv.org/abs/2206.14574v1
- Date: Tue, 21 Jun 2022 08:06:22 GMT
- Title: Knowledge Graph Fusion for Language Model Fine-tuning
- Authors: Nimesh Bhana and Terence L. van Zyl
- Abstract summary: We investigate the benefits of knowledge incorporation into the fine-tuning stages of BERT.
An existing K-BERT model, which enriches sentences with triplets from a Knowledge Graph, is adapted for the English language.
Changes made to K-BERT for accommodating the English language also extend to other word-based languages.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language Models such as BERT have grown in popularity due to their ability to
be pre-trained and perform robustly on a wide range of Natural Language
Processing tasks. Often seen as an evolution over traditional word embedding
techniques, they can produce semantic representations of text, useful for tasks
such as semantic similarity. However, state-of-the-art models often have high
computational requirements and lack global context or domain knowledge which is
required for complete language understanding. To address these limitations, we
investigate the benefits of knowledge incorporation into the fine-tuning stages
of BERT. An existing K-BERT model, which enriches sentences with triplets from
a Knowledge Graph, is adapted for the English language and extended to inject
contextually relevant information into sentences. As a side-effect, changes
made to K-BERT for accommodating the English language also extend to other
word-based languages. Experiments conducted indicate that injected knowledge
introduces noise. We see statistically significant improvements for
knowledge-driven tasks when this noise is minimised. We show evidence that,
given the appropriate task, modest injection with relevant, high-quality
knowledge is most performant.
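As a rough, hypothetical sketch of the kind of knowledge injection the abstract describes, the snippet below enriches a sentence with triplets from a toy knowledge graph before it is handed to a fine-tuning pipeline. The graph contents, entity-matching rule, and function names are illustrative assumptions, and the soft-position and visible-matrix machinery of the actual K-BERT adaptation is omitted.

```python
# Simplified, hypothetical sketch of K-BERT-style sentence enrichment.
# A toy knowledge graph maps entity strings to (relation, object) triplets;
# matched entities in the input sentence get their triplets injected inline.
# The soft-position and visible-matrix machinery of the real model is omitted.

from typing import Dict, List, Tuple

ToyKG = Dict[str, List[Tuple[str, str]]]  # entity -> [(relation, object), ...]

def enrich_sentence(sentence: str, kg: ToyKG, max_triplets: int = 1) -> str:
    """Append up to `max_triplets` knowledge triplets after each matched entity."""
    enriched: List[str] = []
    for token in sentence.split():
        enriched.append(token)
        entity = token.lower().strip(".,!?")
        for relation, obj in kg.get(entity, [])[:max_triplets]:
            enriched.append(f"({relation} {obj})")  # inject triplet as plain text
    return " ".join(enriched)

if __name__ == "__main__":
    kg: ToyKG = {
        "paris": [("capital_of", "France")],
        "bert": [("is_a", "language model")],
    }
    print(enrich_sentence("BERT knows that Paris is a city.", kg))
    # BERT (is_a language model) knows that Paris (capital_of France) is a city.
```

In the abstract's terms, keeping `max_triplets` small and filtering for relevant, high-quality triplets is what prevents the injected knowledge from acting as noise.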
Related papers
- Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
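As a loose illustration of the mask-infilling idea (not the paper's implementation), the sketch below verbalises a commonsense triple with a template, masks its tail, and yields a masked-LM training pair; the relation templates and the mask-token convention are assumptions made for the demo.

```python
# Hypothetical construction of a commonsense mask-infilling example:
# a (head, relation, tail) triple is verbalised with a template, the tail is
# replaced by mask tokens, and the model is trained to recover the tail.
# Templates and the [MASK] convention are illustrative assumptions.

RELATION_TEMPLATES = {
    "xNeed": "{head}. Before that, PersonX needed {tail}.",
    "Causes": "{head} causes {tail}.",
}

def mask_infilling_example(head: str, relation: str, tail: str,
                           mask_token: str = "[MASK]") -> tuple:
    """Return (masked_text, target_tail) for a masked-LM style objective."""
    template = RELATION_TEMPLATES[relation]
    masked_tail = " ".join([mask_token] * len(tail.split()))
    masked_text = template.format(head=head, tail=masked_tail)
    return masked_text, tail

if __name__ == "__main__":
    text, target = mask_infilling_example("PersonX goes for a run", "xNeed", "running shoes")
    print(text)    # PersonX goes for a run. Before that, PersonX needed [MASK] [MASK].
    print(target)  # running shoes
```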
arXiv Detail & Related papers (2023-06-04T15:44:51Z)
- An Empirical Revisiting of Linguistic Knowledge Fusion in Language Understanding Tasks [33.765874588342285]
Infusing language models with syntactic or semantic knowledge from structural linguistic priors has shown improvements on many language understanding tasks.
We conduct an empirical study of replacing parsed graphs or trees with trivial ones for tasks in the GLUE benchmark.
It reveals that the gains might be attributable not to explicit linguistic priors but to the additional feature interactions brought by the fusion layers.
arXiv Detail & Related papers (2022-10-24T07:47:32Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- DictBERT: Dictionary Description Knowledge Enhanced Language Model Pre-training via Contrastive Learning [18.838291575019504]
Pre-trained language models (PLMs) are shown to be lacking in knowledge when dealing with knowledge-driven tasks.
We propose DictBERT, a novel approach that enhances PLMs with dictionary knowledge.
We evaluate our approach on a variety of knowledge-driven and language understanding tasks, including NER, relation extraction, CommonsenseQA, OpenBookQA and GLUE.
arXiv Detail & Related papers (2022-08-01T06:43:19Z)
- Generated Knowledge Prompting for Commonsense Reasoning [53.88983683513114]
We propose generating knowledge statements directly from a language model with a generic prompt format.
This approach improves the performance of both off-the-shelf and fine-tuned language models on four commonsense reasoning tasks.
Notably, we find that a model's predictions can improve when using its own generated knowledge.
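A minimal sketch of this prompting scheme, under the assumption of a generic text-generation callable `generate_fn` and an illustrative few-shot demonstration (not the paper's exact prompt), is:

```python
# Sketch of generated knowledge prompting: a generic few-shot prompt elicits
# knowledge statements from a language model, and each statement is prepended
# to the question before it goes to the answering model. The demonstration
# and prompt wording here are illustrative placeholders.

from typing import Callable, List

KNOWLEDGE_PROMPT = (
    "Generate some knowledge about the input.\n\n"
    "Input: Greenhouses are used to grow plants out of season.\n"
    "Knowledge: A greenhouse traps heat, so plants can grow when it is cold outside.\n\n"
    "Input: {question}\n"
    "Knowledge:"
)

def knowledge_augmented_inputs(question: str,
                               generate_fn: Callable[[str], str],
                               num_statements: int = 3) -> List[str]:
    """Sample knowledge statements and prepend each one to the question."""
    prompt = KNOWLEDGE_PROMPT.format(question=question)
    return [f"{generate_fn(prompt).strip()} {question}" for _ in range(num_statements)]
```

One simple way to integrate the outputs is to score each augmented input with the answering model and keep the highest-confidence prediction.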
arXiv Detail & Related papers (2021-10-15T21:58:03Z)
- Distilling Linguistic Context for Language Model Compression [27.538080564616703]
A computationally expensive and memory-intensive neural network lies behind the recent success of language representation learning.
We present a new knowledge distillation objective for language representation learning that transfers the contextual knowledge via two types of relationships.
We validate the effectiveness of our method on challenging benchmarks of language understanding tasks.
arXiv Detail & Related papers (2021-09-17T05:51:45Z)
- A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English [17.993417004424078]
Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on.
We evaluate three models (BERT, RoBERTa, and ALBERT) testing their grammatical and semantic knowledge by sentence-level probing, diagnostic cases, and masked prediction tasks.
arXiv Detail & Related papers (2020-11-02T13:25:39Z)
- CoLAKE: Contextualized Language and Knowledge Embedding [81.90416952762803]
We propose the Contextualized Language and Knowledge Embedding (CoLAKE), which jointly learns contextualized representations for both language and knowledge with an extended training objective.
We conduct experiments on knowledge-driven tasks, knowledge probing tasks, and language understanding tasks.
arXiv Detail & Related papers (2020-10-01T11:39:32Z)
- E-BERT: A Phrase and Product Knowledge Enhanced Language Model for E-commerce [63.333860695727424]
E-commerce tasks require accurate understanding of domain phrases, whereas such fine-grained phrase-level knowledge is not explicitly modeled by BERT's training objective.
To tackle the problem, we propose a unified pre-training framework, namely, E-BERT.
Specifically, to preserve phrase-level knowledge, we introduce Adaptive Hybrid Masking, which allows the model to adaptively switch from learning preliminary word knowledge to learning complex phrases.
To utilize product-level knowledge, we introduce Neighbor Product Reconstruction, which trains E-BERT to predict a product's associated neighbors with a denoising cross-attention layer.
arXiv Detail & Related papers (2020-09-07T00:15:36Z)
- Data Annealing for Informal Language Understanding Tasks [66.2988222278475]
We propose a data annealing transfer learning procedure to bridge the performance gap on informal language tasks.
It successfully utilizes a pre-trained model such as BERT on informal language.
arXiv Detail & Related papers (2020-04-24T09:27:09Z)