Modelling General Properties of Nouns by Selectively Averaging
Contextualised Embeddings
- URL: http://arxiv.org/abs/2012.07580v2
- Date: Mon, 17 May 2021 15:00:19 GMT
- Title: Modelling General Properties of Nouns by Selectively Averaging
Contextualised Embeddings
- Authors: Na Li, Zied Bouraoui, Jose Camacho Collados, Luis Espinosa-Anke, Qing
Gu, Steven Schockaert
- Abstract summary: We show how the contextualised embeddings predicted by BERT can be used to produce high-quality word vectors.
We find that a simple strategy of averaging the contextualised embeddings of masked word mentions leads to vectors that outperform the static word vectors learned by BERT.
- Score: 46.49372320363155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While the success of pre-trained language models has largely eliminated the
need for high-quality static word vectors in many NLP applications, such
vectors continue to play an important role in tasks where words need to be
modelled in the absence of linguistic context. In this paper, we explore how
the contextualised embeddings predicted by BERT can be used to produce
high-quality word vectors for such domains, in particular related to knowledge
base completion, where our focus is on capturing the semantic properties of
nouns. We find that a simple strategy of averaging the contextualised
embeddings of masked word mentions leads to vectors that outperform the static
word vectors learned by BERT, as well as those from standard word embedding
models, in property induction tasks. We notice in particular that masking
target words is critical to achieve this strong performance, as the resulting
vectors focus less on idiosyncratic properties and more on general semantic
properties. Inspired by this view, we propose a filtering strategy which is
aimed at removing the most idiosyncratic mention vectors, allowing us to obtain
further performance gains in property induction.
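As a rough illustration of this pipeline, the sketch below masks each mention of a target noun, collects BERT's contextualised embedding at the [MASK] position, averages the mention vectors, and then drops the mentions least similar to the mean. It assumes the HuggingFace transformers library with bert-base-uncased; the helper names and the `keep` ratio are illustrative, and the similarity-to-mean filter is only one plausible reading of the paper's idiosyncrasy filtering, not the authors' exact method.

```python
# Minimal sketch (not the authors' released code) of masked-mention
# averaging: mask each mention of a target noun, take BERT's contextualised
# embedding at the [MASK] position, and average. The `keep` ratio and the
# similarity-to-mean filter are illustrative assumptions standing in for
# the paper's filtering of idiosyncratic mention vectors.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def mention_vectors(word, sentences):
    """One contextualised [MASK] embedding per sentence mentioning `word`."""
    vecs = []
    for sent in sentences:
        # Naive replacement; a real pipeline would match whole tokens only.
        masked = sent.replace(word, tokenizer.mask_token)
        inputs = tokenizer(masked, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = model(**inputs).last_hidden_state[0]  # (seq_len, dim)
        mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id)
        vecs.append(hidden[mask_pos].mean(dim=0))
    return torch.stack(vecs)  # (n_mentions, dim)

def word_vector(word, sentences, keep=0.8):
    """Average the mention vectors, discarding the (1 - keep) fraction of
    mentions least similar to the mean, as a proxy for 'idiosyncratic'."""
    vecs = mention_vectors(word, sentences)
    sims = torch.nn.functional.cosine_similarity(
        vecs, vecs.mean(dim=0, keepdim=True))
    k = max(1, int(keep * len(vecs)))
    return vecs[sims.topk(k).indices].mean(dim=0)

sentences = [
    "I drank a cup of coffee this morning.",
    "She ordered a black coffee to go.",
    "The coffee was too hot to drink.",
]
print(word_vector("coffee", sentences).shape)  # torch.Size([768])
```

Masking at the mention site forces BERT to infer the word from its context alone, which is why, per the abstract, the resulting vectors emphasise general semantic properties over idiosyncratic ones.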
Related papers
- Local Topology Measures of Contextual Language Model Latent Spaces With Applications to Dialogue Term Extraction [4.887047578768969]
We introduce complexity measures of the local topology of the latent space of a contextual language model.
Our work continues a line of research that explores the manifold hypothesis for word embeddings.
arXiv Detail & Related papers (2024-08-07T11:44:32Z)
- Exploring State Space and Reasoning by Elimination in Tsetlin Machines [14.150011713654331]
The Tsetlin Machine (TM) has gained significant attention in machine learning.
The TM is used to construct word embeddings and to describe target words using clauses.
To enhance the descriptive capacity of these clauses, we study the concept of Reasoning by Elimination (RbE) in clause formulation.
arXiv Detail & Related papers (2024-07-12T10:58:01Z)
- Tsetlin Machine Embedding: Representing Words Using Logical Expressions [10.825099126920028]
We introduce a Tsetlin Machine-based autoencoder that learns logical clauses in a self-supervised manner.
The clauses consist of contextual words like "black," "cup," and "hot" that define other words like "coffee".
We evaluate our embedding approach on several intrinsic and extrinsic benchmarks, outperforming GloVe on six classification tasks.
arXiv Detail & Related papers (2023-01-02T15:02:45Z)
- Context-aware Fine-tuning of Self-supervised Speech Models [56.95389222319555]
We study the use of context, i.e., surrounding segments, during fine-tuning.
We propose a new approach called context-aware fine-tuning.
We evaluate the proposed approach using the SLUE and Libri-light benchmarks for several downstream tasks.
arXiv Detail & Related papers (2022-12-16T15:46:15Z)
- Searching for Discriminative Words in Multidimensional Continuous Feature Space [0.0]
We propose a novel method to extract discriminative keywords from documents.
We show how different discriminative metrics influence the overall results.
We conclude that word feature vectors can substantially improve the topical inference of documents' meaning.
arXiv Detail & Related papers (2022-11-26T18:05:11Z)
- Compositional Generalization in Grounded Language Learning via Induced Model Sparsity [81.38804205212425]
We consider simple language-conditioned navigation problems in a grid world environment with disentangled observations.
We design an agent that encourages sparse correlations between words in the instruction and attributes of objects, composing them together to find the goal.
Our agent maintains a high level of performance on goals containing novel combinations of properties even when learning from a handful of demonstrations.
arXiv Detail & Related papers (2022-07-06T08:46:27Z)
- Graph Adaptive Semantic Transfer for Cross-domain Sentiment Classification [68.06496970320595]
Cross-domain sentiment classification (CDSC) aims to use the transferable semantics learned from the source domain to predict the sentiment of reviews in the unlabeled target domain.
We present the Graph Adaptive Semantic Transfer (GAST) model, an adaptive syntactic graph embedding method that is able to learn domain-invariant semantics from both word sequences and syntactic graphs.
arXiv Detail & Related papers (2022-05-18T07:47:01Z)
- Deriving Word Vectors from Contextualized Language Models using Topic-Aware Mention Selection [46.97185212695267]
We propose a method for learning word representations that follows the basic strategy of deriving word vectors from contextualised representations of word mentions.
We take advantage of contextualized language models (CLMs) rather than bags of word vectors to encode contexts.
We show that this simple strategy leads to high-quality word vectors, which are more predictive of semantic properties than word embeddings and existing CLM-based strategies.
arXiv Detail & Related papers (2021-06-15T08:02:42Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)