Mechanistic Decomposition of Sentence Representations
- URL: http://arxiv.org/abs/2506.04373v2
- Date: Tue, 10 Jun 2025 17:05:41 GMT
- Title: Mechanistic Decomposition of Sentence Representations
- Authors: Matthieu Tehenan, Vikram Natarajan, Jonathan Michala, Milton Lin, Juri Opitz
- Abstract summary: Sentence embeddings are central to modern NLP and AI systems, but little is known about their internal structure. We propose a new method to mechanistically decompose sentence embeddings into interpretable components. We analyze how pooling compresses these features into sentence representations, and assess the latent features that reside in a sentence embedding.
- Score: 3.9146761527401432
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sentence embeddings are central to modern NLP and AI systems, yet little is known about their internal structure. While we can compare these embeddings using measures such as cosine similarity, the contributing features are not human-interpretable, and the content of an embedding seems untraceable, as it is masked by complex neural transformations and a final pooling operation that combines individual token embeddings. To alleviate this issue, we propose a new method to mechanistically decompose sentence embeddings into interpretable components, by using dictionary learning on token-level representations. We analyze how pooling compresses these features into sentence representations, and assess the latent features that reside in a sentence embedding. This bridges token-level mechanistic interpretability with sentence-level analysis, making for more transparent and controllable representations. In our studies, we obtain several interesting insights into the inner workings of sentence embedding spaces, for instance, that many semantic and syntactic aspects are linearly encoded in the embeddings.
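As an illustration of the approach the abstract describes, the following minimal sketch (not the authors' code; the dimensions, sparsity level, synthetic data, and the use of scikit-learn's DictionaryLearning are illustrative assumptions) learns a sparse dictionary over token-level representations and shows that, because mean pooling is linear, the pooled sentence embedding decomposes into averaged token-level feature activations:

```python
# Minimal sketch (not the paper's implementation): dictionary learning on
# token-level representations, then decomposing a mean-pooled embedding.
# Shapes, alpha, and the synthetic tokens are illustrative assumptions.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
n_tokens, dim, n_atoms = 200, 64, 32  # stand-ins for real token embeddings

# Synthetic "token embeddings": sparse combinations of latent directions.
true_atoms = rng.normal(size=(n_atoms, dim))
true_codes = rng.normal(size=(n_tokens, n_atoms)) * (rng.random((n_tokens, n_atoms)) < 0.1)
tokens = true_codes @ true_atoms + 0.01 * rng.normal(size=(n_tokens, dim))

# Dictionary learning on token-level representations.
dl = DictionaryLearning(n_components=n_atoms, transform_algorithm="lasso_lars",
                        transform_alpha=0.05, random_state=0)
codes = dl.fit_transform(tokens)  # sparse feature activations per token
D = dl.components_                # (n_atoms, dim) candidate interpretable directions

# Mean pooling is linear, so the pooled "sentence" embedding inherits a
# decomposition into the averaged token-level feature activations.
sentence = tokens.mean(axis=0)
pooled_codes = codes.mean(axis=0)
recon = pooled_codes @ D
print("pooling reconstruction error:", np.linalg.norm(sentence - recon))
```

In the paper's setting the token representations would come from a real sentence encoder rather than synthetic data, and the learned dictionary atoms would be inspected for the semantic and syntactic features the abstract refers to.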
Related papers
- On Self-improving Token Embeddings [0.0]
The article introduces a novel, fast method for refining pre-trained static word embeddings or, more generally, token embeddings. It continuously updates the representation of each token, including tokens without pre-assigned embeddings, and operates independently of large language models and shallow neural networks.
arXiv Detail & Related papers (2025-04-21T02:17:19Z)
- Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations [102.05351905494277]
Sub-sentence encoder is a contrastively-learned contextual embedding model for fine-grained semantic representation of text.
We show that sub-sentence encoders have the same level of inference cost and space complexity as sentence encoders.
arXiv Detail & Related papers (2023-11-07T20:38:30Z)
- Bridging Continuous and Discrete Spaces: Interpretable Sentence Representation Learning via Compositional Operations [80.45474362071236]
It is unclear whether the compositional semantics of sentences can be directly reflected as compositional operations in the embedding space.
We propose InterSent, an end-to-end framework for learning interpretable sentence embeddings.
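As a toy probe of the question this entry raises (not InterSent itself), one can check how closely simple vector addition of two sentence embeddings approximates the embedding of their conjunction; the model name and sentences below are arbitrary illustrative choices:

```python
# Toy probe (not InterSent): does addition in embedding space roughly
# track sentence conjunction? Model and sentences are arbitrary choices.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
a, b, ab = model.encode(
    ["The cat sleeps.", "The dog barks.", "The cat sleeps and the dog barks."],
    normalize_embeddings=True,
)

combined = (a + b) / np.linalg.norm(a + b)  # renormalize the summed vector
print("cos(a+b, a_and_b):", float(combined @ ab))
print("cos(a,   a_and_b):", float(a @ ab))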
arXiv Detail & Related papers (2023-05-24T00:44:49Z)
- Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm for further discovering the potential of sentence embeddings.
RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
- A Sentence is Worth 128 Pseudo Tokens: A Semantic-Aware Contrastive Learning Framework for Sentence Embeddings [28.046786376565123]
We propose a semantics-aware contrastive learning framework for sentence embeddings, termed Pseudo-Token BERT (PT-BERT).
We exploit the pseudo-token space (i.e., latent semantic space) representation of a sentence while eliminating the impact of superficial features such as sentence length and syntax.
Our model outperforms the state-of-the-art baselines on six standard semantic textual similarity (STS) tasks.
arXiv Detail & Related papers (2022-03-11T12:29:22Z)
- Clustering and Network Analysis for the Embedding Spaces of Sentences and Sub-Sentences [69.3939291118954]
This paper reports research on a set of comprehensive clustering and network analyses targeting sentence and sub-sentence embedding spaces.
Results show that one method generates the most clusterable embeddings.
In general, the embeddings of span sub-sentences have better clustering properties than the original sentences.
arXiv Detail & Related papers (2021-10-02T00:47:35Z)
- Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models [22.43510769150502]
It is not entirely clear what aspects of sentence-level syntax are captured by vector-based language representations.
We show that Transformers build sensitivity to larger parts of the sentence along their layers, and that hierarchical phrase structure plays a role in this process.
arXiv Detail & Related papers (2021-04-15T16:30:31Z)
- Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)