Are there identifiable structural parts in the sentence embedding whole?
- URL: http://arxiv.org/abs/2406.16563v2
- Date: Tue, 2 Jul 2024 14:14:48 GMT
- Title: Are there identifiable structural parts in the sentence embedding whole?
- Authors: Vivi Nastase, Paola Merlo
- Abstract summary: Sentence embeddings from transformer models encode a great deal of linguistic information in a fixed-length vector.
We explore the hypothesis that these embeddings consist of overlapping layers of information that can be separated.
We show that this is the case using a dataset consisting of sentences with known chunk structure, and two linguistic intelligence datasets.
- Score: 1.6021932740447968
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Sentence embeddings from transformer models encode a great deal of linguistic information in a fixed-length vector. We explore the hypothesis that these embeddings consist of overlapping layers of information that can be separated, and on which specific types of information -- such as information about chunks and their structural and semantic properties -- can be detected. We show that this is the case using a dataset of sentences with known chunk structure and two linguistic intelligence datasets, whose solution relies on detecting chunks and their grammatical number or their semantic roles, respectively. We support this through analyses of task performance and of the internal representations built during learning.
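The abstract's core claim, that specific types of information can be detected in sentence embeddings, is commonly tested with diagnostic probes. Below is a minimal, self-contained sketch of that methodology, not the paper's exact setup: random vectors with an injected signal stand in for real transformer sentence embeddings, and a linear probe checks whether a toy chunk property (here, a grammatical-number label) is recoverable above chance.

```python
# Probing sketch: can a linear classifier recover a chunk property from
# sentence embeddings? Placeholder data: random vectors with a weak signal
# planted in a few dimensions stand in for real transformer embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, dim = 2000, 384
labels = rng.integers(0, 2, size=n)       # toy property: 0=singular, 1=plural
emb = rng.normal(size=(n, dim))           # placeholder "sentence embeddings"
emb[:, :8] += labels[:, None] * 0.5       # plant a weak signal in 8 dimensions

X_tr, X_te, y_tr, y_te = train_test_split(emb, labels, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("probe accuracy:", probe.score(X_te, y_te))

# Control: training on shuffled labels should fall to chance (~0.5),
# so above-chance probe accuracy reflects information encoded in the vectors.
control = LogisticRegression(max_iter=1000).fit(X_tr, rng.permutation(y_tr))
print("control accuracy:", control.score(X_te, y_te))
```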
Related papers
- Tracking linguistic information in transformer-based sentence embeddings through targeted sparsification [1.6021932740447968]
Analyses of transformer-based models have shown that they encode a variety of linguistic information from their textual input.
We test to what degree information about chunks (in particular noun, verb or prepositional phrases) can be localized in sentence embeddings.
Our results show that such information is not distributed over the entire sentence embedding, but is instead encoded in specific regions.
arXiv Detail & Related papers (2024-07-25T15:27:08Z)
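The localization claim in the entry above, that chunk information concentrates in specific regions of the embedding, can be illustrated with a region-ablation probe. The sketch below is a stand-in for the paper's targeted sparsification, not a reimplementation of it: a signal is planted in one slice of synthetic embeddings, and per-region probes reveal where it lives.

```python
# Region-ablation sketch: if chunk information lives in a specific region,
# a probe trained on that region alone should beat probes on other regions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n, dim, region = 2000, 384, 64
labels = rng.integers(0, 2, size=n)
emb = rng.normal(size=(n, dim))
emb[:, 128:160] += labels[:, None] * 0.6  # plant the signal in dims 128-159

for start in range(0, dim, region):
    block = emb[:, start:start + region]
    acc = cross_val_score(LogisticRegression(max_iter=1000),
                          block, labels, cv=3).mean()
    print(f"dims {start:3d}-{start + region - 1:3d}: mean accuracy {acc:.2f}")
# Only the 128-191 slice, which contains the planted dims, scores above chance.
```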
- Putting Context in Context: the Impact of Discussion Structure on Text Classification [13.15873889847739]
We propose a series of experiments on a large dataset for stance detection in English.
We evaluate the contribution of different types of contextual information.
We show that structural information can be highly beneficial to text classification but only under certain circumstances.
arXiv Detail & Related papers (2024-02-05T12:56:22Z)
- How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding [56.222097640468306]
We provide a mechanistic understanding of how transformers learn "semantic structure".
We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z)
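A quick way to illustrate the topical-structure claim in the entry above: if an embedding layer encodes topics, same-topic vectors should be systematically more similar than cross-topic ones. The sketch below uses Gaussian clusters as stand-ins for learned embeddings; it is a toy check, not the paper's mathematical analysis.

```python
# Toy topical-structure check: compare mean cosine similarity within topics
# against across topics, using Gaussian clusters as placeholder embeddings.
import numpy as np

rng = np.random.default_rng(2)
n_topics, per_topic, dim = 4, 50, 64
centers = rng.normal(size=(n_topics, dim)) * 2.0
vecs = np.concatenate([c + rng.normal(size=(per_topic, dim)) for c in centers])
topic = np.repeat(np.arange(n_topics), per_topic)

unit = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)
sim = unit @ unit.T                          # pairwise cosine similarities
same = topic[:, None] == topic[None, :]
off_diag = ~np.eye(len(vecs), dtype=bool)    # exclude self-similarity
print("mean same-topic cosine :", sim[same & off_diag].mean())
print("mean cross-topic cosine:", sim[~same].mean())
```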
- Linear Spaces of Meanings: Compositional Structures in Vision-Language Models [110.00434385712786]
We investigate compositional structures in data embeddings from pre-trained vision-language models (VLMs).
We first present a framework for understanding compositional structures from a geometric perspective.
We then explain what these structures entail probabilistically in the case of VLM embeddings, providing intuitions for why they arise in practice.
arXiv Detail & Related papers (2023-02-28T08:11:56Z)
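One common operationalization of compositional structure in embedding spaces is approximate vector addition: emb("blue car") lands close to emb("blue") + emb("car"). The sketch below builds synthetic attribute and object vectors that compose additively by construction and checks retrieval with a composed query; real VLM text embeddings (e.g., from CLIP) would replace these toys, and the paper's geometric framework is richer than this.

```python
# Toy compositionality check: a query composed by vector addition should
# retrieve the phrase embedding for the matching attribute-object pair.
import numpy as np

rng = np.random.default_rng(3)
dim = 128
attrs = {a: rng.normal(size=dim) for a in ["red", "blue", "green"]}
objs = {o: rng.normal(size=dim) for o in ["car", "bird", "cup"]}
# Phrase embeddings composed additively by construction, plus a little noise.
phrases = {f"{a} {o}": va + vo + 0.1 * rng.normal(size=dim)
           for a, va in attrs.items() for o, vo in objs.items()}

def nearest(query):
    cos = lambda u, v: u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return max(phrases, key=lambda p: cos(query, phrases[p]))

print(nearest(attrs["blue"] + objs["car"]))   # expected: "blue car"
```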
- A Knowledge-Enhanced Adversarial Model for Cross-lingual Structured Sentiment Analysis [31.05169054736711]
The cross-lingual structured sentiment analysis task aims to transfer knowledge from a source language to a target one.
We propose a Knowledge-Enhanced Adversarial Model (KEAM) with both implicit distributed and explicit structural knowledge.
We conduct experiments on five datasets and compare KEAM with both supervised and unsupervised methods.
arXiv Detail & Related papers (2022-05-31T03:07:51Z)
- Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings.
We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI.
This suggests that the embedding captures some part of the brain's natural language representation structure.
arXiv Detail & Related papers (2021-06-09T22:59:12Z)
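The brain-mapping result in the entry above follows the standard encoding-model recipe: regress voxel responses on a feature space and score held-out predictions, so that feature spaces matching the brain better earn higher scores. The sketch below reproduces only the recipe's shape, with simulated voxels and ridge regression; it is not the authors' pipeline.

```python
# Encoding-model sketch: a feature space that actually drives the (simulated)
# voxel responses predicts them on held-out data; an unrelated space does not.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
n_stim, n_feat, n_vox = 500, 50, 200
related = rng.normal(size=(n_stim, n_feat))    # feature space driving voxels
unrelated = rng.normal(size=(n_stim, n_feat))  # feature space with no relation
voxels = (related @ rng.normal(size=(n_feat, n_vox))
          + rng.normal(size=(n_stim, n_vox)))  # simulated fMRI responses

for name, feats in [("related space", related), ("unrelated space", unrelated)]:
    X_tr, X_te, y_tr, y_te = train_test_split(feats, voxels, random_state=0)
    r2 = Ridge(alpha=1.0).fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: held-out R^2 = {r2:.2f}")
```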
- Understanding Synonymous Referring Expressions via Contrastive Features [105.36814858748285]
We develop an end-to-end trainable framework to learn contrastive features on the image and object instance levels.
We conduct extensive experiments to evaluate the proposed algorithm on several benchmark datasets.
arXiv Detail & Related papers (2021-04-20T17:56:24Z)
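A minimal sketch of the contrastive idea in the entry above: pull each expression embedding toward its matching object and push it away from the other objects in the batch, written here as an InfoNCE-style loss over toy vectors. The actual framework learns image- and instance-level features end to end; everything below is a placeholder.

```python
# InfoNCE-style contrastive loss over toy paired features: matched
# expression/object pairs sit on the diagonal of the similarity matrix.
import numpy as np

rng = np.random.default_rng(5)
batch, dim, temp = 8, 64, 0.07
obj = rng.normal(size=(batch, dim))               # object/instance features
expr = obj + 0.3 * rng.normal(size=(batch, dim))  # paired expression features

def normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

logits = normalize(expr) @ normalize(obj).T / temp   # cosine similarity / T
stable = logits - logits.max(axis=1, keepdims=True)  # numerical stability
log_softmax = stable - np.log(np.exp(stable).sum(axis=1, keepdims=True))
loss = -np.diag(log_softmax).mean()                  # matched pairs on diagonal
print(f"InfoNCE loss: {loss:.3f}")
```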
- A Self-supervised Representation Learning of Sentence Structure for Authorship Attribution [3.5991811164452923]
We propose a self-supervised framework for learning structural representations of sentences.
We evaluate the learned structural representations of sentences using different probing tasks, and subsequently utilize them in the authorship attribution task.
arXiv Detail & Related papers (2020-10-14T02:57:10Z)
- Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
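The evaluation logic in the entry above, clustering by structural properties rather than lexical semantics, can be sketched with toy vectors that carry "semantics" in one subspace and "structure" in another: keeping only the structural subspace changes what k-means recovers. This illustrates the criterion, not the paper's actual transformation.

```python
# Toy check: k-means on raw vectors follows the dominant semantic signal,
# while k-means on the structural subspace alone recovers structural classes.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(6)
n = 300
sem = rng.integers(0, 3, size=n)   # lexical-semantic class (dominant signal)
syn = rng.integers(0, 3, size=n)   # structural class (weaker signal)
vecs = np.concatenate([
    sem[:, None] * 4.0 + rng.normal(size=(n, 16)),   # "semantic" subspace
    syn[:, None] * 2.0 + rng.normal(size=(n, 16)),   # "structural" subspace
], axis=1)

for name, X in [("raw vectors", vecs), ("structural subspace", vecs[:, 16:])]:
    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
    ari = adjusted_rand_score(syn, km.labels_)
    print(f"{name}: ARI vs structural classes = {ari:.2f}")
```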
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely used large-scale dataset for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)