Distilling Relation Embeddings from Pre-trained Language Models
- URL: http://arxiv.org/abs/2110.15705v1
- Date: Tue, 21 Sep 2021 15:05:27 GMT
- Title: Distilling Relation Embeddings from Pre-trained Language Models
- Authors: Asahi Ushio and Jose Camacho-Collados and Steven Schockaert
- Abstract summary: We show that it is possible to distill relation embeddings from pre-trained language models.
We encode word pairs using a (manually or automatically generated) prompt, and we fine-tune the language model so that relationally similar word pairs yield similar output vectors.
The resulting relation embeddings are highly competitive on analogy (unsupervised) and relation classification (supervised) benchmarks.
- Score: 35.718167335989854
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained language models have been found to capture a surprisingly rich
amount of lexical knowledge, ranging from commonsense properties of everyday
concepts to detailed factual knowledge about named entities. Among others, this
makes it possible to distill high-quality word vectors from pre-trained
language models. However, it is currently unclear to what extent it is possible
to distill relation embeddings, i.e. vectors that characterize the relationship
between two words. Such relation embeddings are appealing because they can, in
principle, encode relational knowledge in a more fine-grained way than is
possible with knowledge graphs. To obtain relation embeddings from a
pre-trained language model, we encode word pairs using a (manually or
automatically generated) prompt, and we fine-tune the language model such that
relationally similar word pairs yield similar output vectors. We find that the
resulting relation embeddings are highly competitive on analogy (unsupervised)
and relation classification (supervised) benchmarks, even without any
task-specific fine-tuning. Source code to reproduce our experimental results
and the model checkpoints are available in the following repository:
https://github.com/asahi417/relbert
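To make the pipeline described in the abstract concrete, the following is a minimal sketch, not the authors' implementation: it inserts a word pair into a prompt, mean-pools a RoBERTa model's hidden states into a relation vector, applies a triplet-style fine-tuning step so that relationally similar pairs move closer together, and uses cosine similarity between relation vectors to pick an analogy candidate. The prompt wording, pooling strategy, loss, and example pairs are illustrative assumptions; the exact prompts, training objective, and checkpoints are in the repository linked above.
```python
# Minimal sketch (assumptions noted in the text above); not the paper's exact code.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "roberta-base"  # the paper builds on RoBERTa; "base" keeps the sketch light
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)


def relation_embedding(head: str, tail: str) -> torch.Tensor:
    """Encode a word pair with a prompt and mean-pool the hidden states."""
    prompt = f"I finally discovered the relation between {head} and {tail}."
    inputs = tokenizer(prompt, return_tensors="pt")
    hidden = model(**inputs).last_hidden_state      # (1, seq_len, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)     # (1, dim)


# Fine-tuning signal (sketch): a triplet-style objective pulls pairs that share
# a relation together and pushes unrelated pairs apart.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loss = torch.nn.TripletMarginLoss(margin=1.0)(
    relation_embedding("paris", "france"),  # anchor
    relation_embedding("tokyo", "japan"),   # positive: same relation (capital of)
    relation_embedding("paris", "lyon"),    # negative: a different relation
)
loss.backward()
optimizer.step()

# Unsupervised analogy solving: choose the candidate pair whose relation vector
# is most similar to the query pair's.
with torch.no_grad():
    query = relation_embedding("word", "language")
    candidates = [("note", "music"), ("tea", "coffee"), ("sound", "speaker")]
    best = max(
        candidates,
        key=lambda pair: torch.cosine_similarity(query, relation_embedding(*pair)).item(),
    )
print(best)  # ideally ("note", "music"): a word is part of a language as a note is part of music
```
In practice, the fine-tuned checkpoints from the repository would replace the off-the-shelf roberta-base weights used here.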
Related papers
- Transparency at the Source: Evaluating and Interpreting Language Models With Access to the True Distribution [4.01799362940916]
We present a setup for training, evaluating and interpreting neural language models that uses artificial, language-like data.
The data is generated using a massive probabilistic grammar, which is itself derived from a large natural language corpus.
With access to the underlying true source, our results show striking differences in learning dynamics and outcomes between different classes of words.
arXiv Detail & Related papers (2023-10-23T12:03:01Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm for further exploring the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- Modelling Commonsense Properties using Pre-Trained Bi-Encoders [40.327695801431375]
We study the possibility of fine-tuning language models to explicitly model concepts and their properties.
Our experimental results show that the resulting encoders allow us to predict commonsense properties with much higher accuracy than was previously possible.
arXiv Detail & Related papers (2022-10-06T09:17:34Z)
- Towards a Theoretical Understanding of Word and Relation Representation [8.020742121274418]
Representing words by vectors, or embeddings, enables computational reasoning.
We focus on word embeddings learned from text corpora and knowledge graphs.
arXiv Detail & Related papers (2022-02-01T15:34:58Z)
- BERT is to NLP what AlexNet is to CV: Can Pre-Trained Language Models Identify Analogies? [35.381345454627]
We analyze the capabilities of transformer-based language models on an unsupervised task of identifying analogies.
Off-the-shelf language models can identify analogies to a certain extent, but struggle with abstract and complex relations.
Our results raise important questions for future work about how, and to what extent, pre-trained language models capture knowledge about abstract semantic relations.
arXiv Detail & Related papers (2021-05-11T11:38:49Z)
- Paraphrastic Representations at Scale [134.41025103489224]
We release trained models for English, Arabic, German, French, Spanish, Russian, Turkish, and Chinese.
We train these models on large amounts of data, achieving significantly improved performance over the original papers.
arXiv Detail & Related papers (2021-04-30T16:55:28Z)
- Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
arXiv Detail & Related papers (2021-03-22T08:11:43Z)
- Unnatural Language Inference [48.45003475966808]
We find that state-of-the-art NLI models, such as RoBERTa and BART, are invariant to, and sometimes even perform better on, examples with randomly reordered words.
Our findings call into question the idea that our natural language understanding models, and the tasks used for measuring their progress, genuinely require a human-like understanding of syntax.
arXiv Detail & Related papers (2020-12-30T20:40:48Z)
- Fusing Context Into Knowledge Graph for Commonsense Reasoning [21.33294077354958]
We propose to utilize external entity descriptions to provide contextual information for graph entities.
For the CommonsenseQA task, our model first extracts concepts from the question and answer choices, and then finds a related triple between these concepts.
We achieve state-of-the-art results on the CommonsenseQA dataset, with an accuracy of 80.7% (single model) and 83.3% (ensemble model) on the official leaderboard.
arXiv Detail & Related papers (2020-12-09T00:57:49Z)
- Learning Relation Prototype from Unlabeled Texts for Long-tail Relation Extraction [84.64435075778988]
We propose a general approach to learn relation prototypes from unlabeled texts.
We learn relation prototypes as an implicit factor between entities.
We conduct experiments on two publicly available datasets: New York Times and Google Distant Supervision.
arXiv Detail & Related papers (2020-11-27T06:21:12Z)