Knowledge Graph Fusion for Language Model Fine-tuning
- URL: http://arxiv.org/abs/2206.14574v1
- Date: Tue, 21 Jun 2022 08:06:22 GMT
- Title: Knowledge Graph Fusion for Language Model Fine-tuning
- Authors: Nimesh Bhana and Terence L. van Zyl
- Abstract summary: We investigate the benefits of knowledge incorporation into the fine-tuning stages of BERT.
An existing K-BERT model, which enriches sentences with triplets from a Knowledge Graph, is adapted for the English language.
Changes made to K-BERT for accommodating the English language also extend to other word-based languages.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language Models such as BERT have grown in popularity due to their ability to
be pre-trained and perform robustly on a wide range of Natural Language
Processing tasks. Often seen as an evolution over traditional word embedding
techniques, they can produce semantic representations of text, useful for tasks
such as semantic similarity. However, state-of-the-art models often have high
computational requirements and lack global context or domain knowledge which is
required for complete language understanding. To address these limitations, we
investigate the benefits of knowledge incorporation into the fine-tuning stages
of BERT. An existing K-BERT model, which enriches sentences with triplets from
a Knowledge Graph, is adapted for the English language and extended to inject
contextually relevant information into sentences. As a side-effect, changes
made to K-BERT for accommodating the English language also extend to other
word-based languages. Experiments conducted indicate that injected knowledge
introduces noise. We see statistically significant improvements for
knowledge-driven tasks when this noise is minimised. We show evidence that,
given the appropriate task, modest injection with relevant, high-quality
knowledge is most performant.
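As a rough, hypothetical sketch of the kind of knowledge injection the abstract describes, the snippet below enriches a sentence with triplets from a toy knowledge graph before it is handed to a fine-tuning pipeline. The graph contents, entity-matching rule, and function names are illustrative assumptions, and the soft-position and visible-matrix machinery of the actual K-BERT adaptation is omitted.

```python
# Simplified, hypothetical sketch of K-BERT-style sentence enrichment.
# A toy knowledge graph maps entity strings to (relation, object) triplets;
# matched entities in the input sentence get their triplets injected inline.
# The soft-position and visible-matrix machinery of the real model is omitted.

from typing import Dict, List, Tuple

ToyKG = Dict[str, List[Tuple[str, str]]]  # entity -> [(relation, object), ...]

def enrich_sentence(sentence: str, kg: ToyKG, max_triplets: int = 1) -> str:
    """Append up to `max_triplets` knowledge triplets after each matched entity."""
    enriched: List[str] = []
    for token in sentence.split():
        enriched.append(token)
        entity = token.lower().strip(".,!?")
        for relation, obj in kg.get(entity, [])[:max_triplets]:
            enriched.append(f"({relation} {obj})")  # inject triplet as plain text
    return " ".join(enriched)

if __name__ == "__main__":
    kg: ToyKG = {
        "paris": [("capital_of", "France")],
        "bert": [("is_a", "language model")],
    }
    print(enrich_sentence("BERT knows that Paris is a city.", kg))
    # BERT (is_a language model) knows that Paris (capital_of France) is a city.
```

In the abstract's terms, keeping `max_triplets` small and filtering for relevant, high-quality triplets is what prevents the injected knowledge from acting as noise.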
Related papers
- Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
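As a loose illustration of the mask-infilling idea (not the paper's implementation), the sketch below verbalises a commonsense triple with a template, masks its tail, and yields a masked-LM training pair; the relation templates and the mask-token convention are assumptions made for the demo.

```python
# Hypothetical construction of a commonsense mask-infilling example:
# a (head, relation, tail) triple is verbalised with a template, the tail is
# replaced by mask tokens, and the model is trained to recover the tail.
# Templates and the [MASK] convention are illustrative assumptions.

RELATION_TEMPLATES = {
    "xNeed": "{head}. Before that, PersonX needed {tail}.",
    "Causes": "{head} causes {tail}.",
}

def mask_infilling_example(head: str, relation: str, tail: str,
                           mask_token: str = "[MASK]") -> tuple:
    """Return (masked_text, target_tail) for a masked-LM style objective."""
    template = RELATION_TEMPLATES[relation]
    masked_tail = " ".join([mask_token] * len(tail.split()))
    masked_text = template.format(head=head, tail=masked_tail)
    return masked_text, tail

if __name__ == "__main__":
    text, target = mask_infilling_example("PersonX goes for a run", "xNeed", "running shoes")
    print(text)    # PersonX goes for a run. Before that, PersonX needed [MASK] [MASK].
    print(target)  # running shoes
```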
arXiv Detail & Related papers (2023-06-04T15:44:51Z)
- An Empirical Revisiting of Linguistic Knowledge Fusion in Language Understanding Tasks [33.765874588342285]
Infusing language models with syntactic or semantic knowledge from structural linguistic priors has shown improvements on many language understanding tasks.
We conduct an empirical study of replacing parsed graphs or trees with trivial ones for tasks in the GLUE benchmark.
It reveals that the gains might be attributable not to explicit linguistic priors but to the additional feature interactions brought by the fusion layers.
arXiv Detail & Related papers (2022-10-24T07:47:32Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- DictBERT: Dictionary Description Knowledge Enhanced Language Model Pre-training via Contrastive Learning [18.838291575019504]
Pre-trained language models (PLMs) are shown to be lacking in knowledge when dealing with knowledge-driven tasks.
We propose DictBERT, a novel approach that enhances PLMs with dictionary knowledge.
We evaluate our approach on a variety of knowledge-driven and language understanding tasks, including NER, relation extraction, CommonsenseQA, OpenBookQA and GLUE.
arXiv Detail & Related papers (2022-08-01T06:43:19Z)
- Generated Knowledge Prompting for Commonsense Reasoning [53.88983683513114]
We propose generating knowledge statements directly from a language model with a generic prompt format.
This approach improves the performance of both off-the-shelf and fine-tuned language models on four commonsense reasoning tasks.
Notably, we find that a model's predictions can improve when using its own generated knowledge.
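A minimal sketch of this prompting scheme, under the assumption of a generic text-generation callable `generate_fn` and an illustrative few-shot demonstration (not the paper's exact prompt), is:

```python
# Sketch of generated knowledge prompting: a generic few-shot prompt elicits
# knowledge statements from a language model, and each statement is prepended
# to the question before it goes to the answering model. The demonstration
# and prompt wording here are illustrative placeholders.

from typing import Callable, List

KNOWLEDGE_PROMPT = (
    "Generate some knowledge about the input.\n\n"
    "Input: Greenhouses are used to grow plants out of season.\n"
    "Knowledge: A greenhouse traps heat, so plants can grow when it is cold outside.\n\n"
    "Input: {question}\n"
    "Knowledge:"
)

def knowledge_augmented_inputs(question: str,
                               generate_fn: Callable[[str], str],
                               num_statements: int = 3) -> List[str]:
    """Sample knowledge statements and prepend each one to the question."""
    prompt = KNOWLEDGE_PROMPT.format(question=question)
    return [f"{generate_fn(prompt).strip()} {question}" for _ in range(num_statements)]
```

One simple way to integrate the outputs is to score each augmented input with the answering model and keep the highest-confidence prediction.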
arXiv Detail & Related papers (2021-10-15T21:58:03Z)
- Distilling Linguistic Context for Language Model Compression [27.538080564616703]
A computationally expensive and memory-intensive neural network lies behind the recent success of language representation learning.
We present a new knowledge distillation objective for language representation learning that transfers the contextual knowledge via two types of relationships.
We validate the effectiveness of our method on challenging benchmarks of language understanding tasks.
arXiv Detail & Related papers (2021-09-17T05:51:45Z)
- A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English [17.993417004424078]
Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on.
We evaluate three models (BERT, RoBERTa, and ALBERT) testing their grammatical and semantic knowledge by sentence-level probing, diagnostic cases, and masked prediction tasks.
arXiv Detail & Related papers (2020-11-02T13:25:39Z)
- CoLAKE: Contextualized Language and Knowledge Embedding [81.90416952762803]
We propose the Contextualized Language and Knowledge Embedding (CoLAKE), which jointly learns contextualized representations for both language and knowledge with an extended training objective.
We conduct experiments on knowledge-driven tasks, knowledge probing tasks, and language understanding tasks.
arXiv Detail & Related papers (2020-10-01T11:39:32Z)
- E-BERT: A Phrase and Product Knowledge Enhanced Language Model for E-commerce [63.333860695727424]
E-commerce tasks require accurate understanding of domain phrases, whereas such fine-grained phrase-level knowledge is not explicitly modeled by BERT's training objective.
To tackle the problem, we propose a unified pre-training framework, namely, E-BERT.
Specifically, to preserve phrase-level knowledge, we introduce Adaptive Hybrid Masking, which allows the model to adaptively switch from learning preliminary word knowledge to learning complex phrases.
To utilize product-level knowledge, we introduce Neighbor Product Reconstruction, which trains E-BERT to predict a product's associated neighbors with a denoising cross-attention layer.
arXiv Detail & Related papers (2020-09-07T00:15:36Z)
- Data Annealing for Informal Language Understanding Tasks [66.2988222278475]
We propose a data annealing transfer learning procedure to bridge the performance gap on informal language tasks.
It successfully utilizes a pre-trained model such as BERT on informal language.
arXiv Detail & Related papers (2020-04-24T09:27:09Z)