A Comparative Study on Structural and Semantic Properties of Sentence
Embeddings
- URL: http://arxiv.org/abs/2009.11226v1
- Date: Wed, 23 Sep 2020 15:45:32 GMT
- Title: A Comparative Study on Structural and Semantic Properties of Sentence
Embeddings
- Authors: Alexander Kalinowski and Yuan An
- Abstract summary: We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
- Score: 77.34726150561087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sentence embeddings encode natural language sentences as low-dimensional
dense vectors. A great deal of effort has been put into using sentence
embeddings to improve several important natural language processing tasks.
Relation extraction is one such NLP task: it aims to identify structured
relations defined in a knowledge base from unstructured text. A promising and
more efficient approach would be to embed both the text and structured
knowledge in low-dimensional spaces and discover semantic alignments or
mappings between them. Although a number of techniques have been proposed in
the literature for embedding both sentences and knowledge graphs, little is
known about the structural and semantic properties of these embedding spaces in
terms of relation extraction. In this paper, we investigate the aforementioned
properties by evaluating the extent to which sentences carrying similar senses
are embedded in nearby sub-spaces, and whether we can exploit that
structure to align sentences to a knowledge graph. We propose a set of
experiments using a widely-used large-scale data set for relation extraction
and focusing on a set of key sentence embedding methods. We additionally
provide the code for reproducing these experiments at
https://github.com/akalino/semantic-structural-sentences. These embedding
methods cover a wide variety of techniques, ranging from simple word-embedding
combinations to transformer-based BERT-style models. Our experimental results
show that different embedding spaces have different degrees of strength for the
structural and semantic properties. These results provide useful information
for developing embedding-based relation extraction methods.
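The abstract's central question, whether sentences carrying similar senses land in nearby sub-spaces, can be illustrated with a minimal cosine-similarity probe. This is a hedged sketch: the toy vectors and example sentences below are illustrative stand-ins, not the paper's embedding methods or data.

```python
# Minimal sketch: probing whether sentences with similar senses lie close
# together in an embedding space, measured by cosine similarity.
# The 4-d vectors are toy stand-ins for real sentence embeddings.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Toy embeddings: the first two sentences share a "capital-of" relation sense.
emb = {
    "Paris is the capital of France.":   [0.90, 0.10, 0.20, 0.00],
    "Berlin is the capital of Germany.": [0.85, 0.15, 0.25, 0.05],
    "The cat sat on the mat.":           [0.00, 0.90, 0.10, 0.40],
}

sentences = list(emb)
for i, a in enumerate(sentences):
    for b in sentences[i + 1:]:
        print(f"{cosine(emb[a], emb[b]):.3f}  {a!r} vs {b!r}")
```

Under this toy setup, the two relation-sharing sentences score markedly higher than either does against the unrelated one, which is the kind of structural regularity the paper's experiments test at scale.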
Related papers
- Bridging Continuous and Discrete Spaces: Interpretable Sentence
Representation Learning via Compositional Operations [80.45474362071236]
It is unclear whether the compositional semantics of sentences can be directly reflected as compositional operations in the embedding space.
We propose InterSent, an end-to-end framework for learning interpretable sentence embeddings.
arXiv Detail & Related papers (2023-05-24T00:44:49Z)
- Word Sense Induction with Knowledge Distillation from BERT [6.88247391730482]
This paper proposes a method to distill multiple word senses from a pre-trained language model (BERT) by using attention over the senses of a word in a context.
Experiments on the contextual word similarity and sense induction tasks show that this method is superior to or competitive with state-of-the-art multi-sense embeddings.
arXiv Detail & Related papers (2023-04-20T21:05:35Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm that further explores the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- Textual Entailment Recognition with Semantic Features from Empirical
Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair.
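The element-wise Manhattan-distance feature described above can be sketched as follows. This is a hedged illustration: `manhattan_feature` and the toy vectors are assumptions for demonstration, not the paper's actual implementation or encoder.

```python
# Sketch: an element-wise Manhattan-distance feature for a text/hypothesis
# embedding pair. The absolute differences are kept as a vector (not summed)
# so they can feed a downstream entailment classifier as features.
def manhattan_feature(text_emb, hyp_emb):
    return [abs(t - h) for t, h in zip(text_emb, hyp_emb)]

# Toy embeddings; a real system would obtain these from a sentence encoder.
text_emb = [0.25, 0.75, -0.5]
hyp_emb = [0.50, 1.00, -0.5]
print(manhattan_feature(text_emb, hyp_emb))  # -> [0.25, 0.25, 0.0]
```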
arXiv Detail & Related papers (2022-10-18T10:03:51Z)
- Clustering and Network Analysis for the Embedding Spaces of Sentences
and Sub-Sentences [69.3939291118954]
This paper reports research on a set of comprehensive clustering and network analyses targeting sentence and sub-sentence embedding spaces.
Results show that one method generates the most clusterable embeddings.
In general, the embeddings of span sub-sentences have better clustering properties than those of the original sentences.
arXiv Detail & Related papers (2021-10-02T00:47:35Z)
- Imposing Relation Structure in Language-Model Embeddings Using
Contrastive Learning [30.00047118880045]
We propose a novel contrastive learning framework that trains sentence embeddings to encode the relations in a graph structure.
The resulting relation-aware sentence embeddings achieve state-of-the-art results on the relation extraction task.
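A contrastive objective of the kind summarized above can be sketched with a softmax-based (InfoNCE-style) loss that pulls a sentence embedding toward a related embedding and pushes it away from negatives. This is a generic sketch under stated assumptions; `info_nce`, the temperature value, and the toy vectors are illustrative, not the framework from the paper.

```python
# Sketch: an InfoNCE-style contrastive loss over cosine similarities.
# Lower loss means the anchor is closer to its positive than to negatives.
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    # Scaled similarities: positive first, then all negatives.
    logits = [cos(anchor, positive) / temperature] + [
        cos(anchor, n) / temperature for n in negatives
    ]
    # Numerically stable negative log-probability of the positive.
    m = max(logits)
    denom = sum(math.exp(l - m) for l in logits)
    return -(logits[0] - m - math.log(denom))

anchor = [1.0, 0.0]
positive = [0.9, 0.1]       # related sentence/entity embedding
negatives = [[0.0, 1.0], [-1.0, 0.0]]
print(f"{info_nce(anchor, positive, negatives):.4f}")
```

Minimizing this loss over many (anchor, positive, negatives) triples is what shapes the embedding space to reflect the graph's relation structure.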
arXiv Detail & Related papers (2021-09-02T10:58:27Z)
- A Self-supervised Representation Learning of Sentence Structure for
Authorship Attribution [3.5991811164452923]
We propose a self-supervised framework for learning structural representations of sentences.
We evaluate the learned structural representations of sentences using different probing tasks, and subsequently utilize them in the authorship attribution task.
arXiv Detail & Related papers (2020-10-14T02:57:10Z)
- Intrinsic Probing through Dimension Selection [69.52439198455438]
Most modern NLP systems make use of pre-trained contextual representations that attain astonishingly high performance on a variety of tasks.
Such high performance should not be possible unless some form of linguistic structure inheres in these representations, and a wealth of research has sprung up on probing for it.
In this paper, we draw a distinction between intrinsic probing, which examines how linguistic information is structured within a representation, and the extrinsic probing popular in prior work, which only argues for the presence of such information by showing that it can be successfully extracted.
arXiv Detail & Related papers (2020-10-06T15:21:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.