Deriving Contextualised Semantic Features from BERT (and Other
Transformer Model) Embeddings
- URL: http://arxiv.org/abs/2012.15353v1
- Date: Wed, 30 Dec 2020 22:52:29 GMT
- Title: Deriving Contextualised Semantic Features from BERT (and Other
Transformer Model) Embeddings
- Authors: Jacob Turton, David Vinson, Robert Elliott Smith
- Abstract summary: This paper demonstrates that Binder features can be derived from the BERT embedding space.
It provides contextualised Binder embeddings, which can aid in understanding semantic differences between words in context.
It additionally provides insights into how semantic features are represented across the different layers of the BERT model.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Models based on the transformer architecture, such as BERT, have marked a
crucial step forward in the field of Natural Language Processing. Importantly,
they allow the creation of word embeddings that capture important semantic
information about words in context. However, as single entities, these
embeddings are difficult to interpret and the models used to create them have
been described as opaque. Binder and colleagues proposed an intuitive embedding
space where each dimension is based on one of 65 core semantic features.
Unfortunately, the space only exists for a small dataset of 535 words, limiting
its uses. Previous work (Utsumi, 2018, 2020; Turton, Vinson & Smith, 2020) has
shown that Binder features can be derived from static embeddings and
successfully extrapolated to a large new vocabulary. Taking the next step, this
paper demonstrates that Binder features can be derived from the BERT embedding
space. This provides contextualised Binder embeddings, which can aid in
understanding semantic differences between words in context. It additionally
provides insights into how semantic features are represented across the
different layers of the BERT model.
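As a rough illustration of the general idea (not the authors' exact pipeline), the sketch below fits a regression from BERT token embeddings to the 65 Binder feature dimensions using a small labelled word set, then applies the fitted mapping to the same word in two different contexts. The model name, the choice of layer, the use of ridge regression, and the placeholder word list, ratings and example sentences are all assumptions made purely for illustration.

    # Minimal sketch, assuming Hugging Face transformers and scikit-learn.
    # Placeholder data stands in for the 535 Binder words and their 65 ratings.
    import numpy as np
    import torch
    from sklearn.linear_model import Ridge
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
    model.eval()

    def word_embedding(sentence: str, word: str, layer: int = 8) -> np.ndarray:
        """Mean-pool the hidden states of the word-pieces of `word` at one BERT layer."""
        enc = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).hidden_states[layer][0]          # (seq_len, 768)
        word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
        ids = enc["input_ids"][0].tolist()
        for i in range(len(ids) - len(word_ids) + 1):              # first match of the word-piece span
            if ids[i:i + len(word_ids)] == word_ids:
                return hidden[i:i + len(word_ids)].mean(dim=0).numpy()
        raise ValueError(f"'{word}' not found in sentence")

    # Fit: placeholder stand-ins for the Binder words and their 65 feature ratings.
    binder_words = ["apple", "justice", "run"]                     # placeholder subset
    binder_ratings = np.random.rand(len(binder_words), 65)         # placeholder ratings

    X = np.stack([word_embedding(f"The word {w} appears here.", w) for w in binder_words])
    reg = Ridge(alpha=1.0).fit(X, binder_ratings)

    # Apply: contextualised Binder features for one word in two contexts.
    bank_river = reg.predict(word_embedding("He sat on the bank of the river.", "bank")[None])
    bank_money = reg.predict(word_embedding("She deposited cash at the bank.", "bank")[None])
    print(np.abs(bank_river - bank_money).argsort()[0][::-1][:5])  # most divergent feature indices

In the paper's setting the placeholder arrays would be replaced by the actual Binder norms, and the layer index could be varied to examine how the semantic features are represented across BERT's layers.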
Related papers
- Making Pre-trained Language Models Great on Tabular Prediction [50.70574370855663]
Transfer learning with deep neural networks (DNNs) has driven significant progress in image and language processing.
We present TP-BERTa, a specifically pre-trained LM for tabular data prediction.
A novel relative magnitude tokenization converts scalar numerical feature values to finely discrete, high-dimensional tokens, and an intra-feature attention approach integrates feature values with the corresponding feature names.
arXiv Detail & Related papers (2024-03-04T08:38:56Z)
- Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing interactions at the syntax-semantics interface.
The results suggest that LMs may serve as useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
- MarkBERT: Marking Word Boundaries Improves Chinese BERT [67.53732128091747]
MarkBERT keeps the vocabulary as Chinese characters and inserts boundary markers between contiguous words.
Compared to previous word-based BERT models, MarkBERT achieves better accuracy on text classification, keyword recognition, and semantic similarity tasks.
arXiv Detail & Related papers (2022-03-12T08:43:06Z)
- Low-Resource Task-Oriented Semantic Parsing via Intrinsic Modeling [65.51280121472146]
We exploit what we intrinsically know about ontology labels to build efficient semantic parsing models.
Our model proves highly efficient on a low-resource benchmark derived from TOPv2.
arXiv Detail & Related papers (2021-04-15T04:01:02Z)
- SemGloVe: Semantic Co-occurrences for GloVe from BERT [55.420035541274444]
GloVe learns word embeddings by leveraging statistical information from word co-occurrence matrices.
We propose SemGloVe, which distills semantic co-occurrences from BERT into static GloVe word embeddings.
arXiv Detail & Related papers (2020-12-30T15:38:26Z)
- Improved Biomedical Word Embeddings in the Transformer Era [2.978663539080876]
We learn word and concept embeddings by first using the skip-gram method and further fine-tuning them with correlational information.
We evaluate these tuned static embeddings on multiple word-relatedness datasets developed in previous work.
arXiv Detail & Related papers (2020-12-22T03:03:50Z)
- Does BERT Understand Sentiment? Leveraging Comparisons Between Contextual and Non-Contextual Embeddings to Improve Aspect-Based Sentiment Models [0.0]
We show that training a model to compare a contextual embedding from BERT with a generic (non-contextual) word embedding can be used to infer sentiment.
We also show that fine-tuning a subset of the weights of this comparison model achieves state-of-the-art results for polarity detection on Aspect-Based Sentiment Classification datasets.
arXiv Detail & Related papers (2020-11-23T19:12:31Z)
- Semantic Labeling Using a Deep Contextualized Language Model [9.719972529205101]
We propose a context-aware semantic labeling method using both the column values and context.
Our method is based on a new setting for semantic labeling, where we sequentially predict labels for an input table with missing headers.
To our knowledge, we are the first to successfully apply BERT to solve the semantic labeling task.
arXiv Detail & Related papers (2020-10-30T03:04:22Z)
- CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters [14.956626084281638]
We propose a new variant of BERT that drops the wordpiece system altogether and uses a Character-CNN module instead to represent entire words by consulting their characters.
We show that this new model improves the performance of BERT on a variety of medical domain tasks while at the same time producing robust, word-level and open-vocabulary representations.
arXiv Detail & Related papers (2020-10-20T15:58:53Z)
- LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention [37.111204321059084]
We propose new pretrained contextualized representations of words and entities based on the bidirectional transformer.
Our model is trained using a new pretraining task based on the masked language model of BERT.
We also propose an entity-aware self-attention mechanism that is an extension of the self-attention mechanism of the transformer.
arXiv Detail & Related papers (2020-10-02T15:38:03Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.