Semantic Structure in Large Language Model Embeddings
- URL: http://arxiv.org/abs/2508.10003v1
- Date: Mon, 04 Aug 2025 20:21:50 GMT
- Title: Semantic Structure in Large Language Model Embeddings
- Authors: Austin C. Kozlowski, Callin Dai, Andrei Boutyline
- Abstract summary: Psychological research consistently finds that human ratings of words can be reduced to a low-dimensional form with relatively little information loss. We show that the projections of words on semantic directions defined by antonym pairs correlate highly with human ratings. We find that shifting tokens along one semantic direction causes off-target effects on geometrically aligned features proportional to their cosine similarity.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Psychological research consistently finds that human ratings of words across diverse semantic scales can be reduced to a low-dimensional form with relatively little information loss. We find that the semantic associations encoded in the embedding matrices of large language models (LLMs) exhibit a similar structure. We show that the projections of words on semantic directions defined by antonym pairs (e.g. kind - cruel) correlate highly with human ratings, and further find that these projections effectively reduce to a 3-dimensional subspace within LLM embeddings, closely resembling the patterns derived from human survey responses. Moreover, we find that shifting tokens along one semantic direction causes off-target effects on geometrically aligned features proportional to their cosine similarity. These findings suggest that semantic features are entangled within LLMs similarly to how they are interconnected in human language, and a great deal of semantic information, despite its apparent complexity, is surprisingly low-dimensional. Furthermore, accounting for this semantic structure may prove essential for avoiding unintended consequences when steering features.
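The measurements described in the abstract reduce to simple linear algebra on the embedding matrix. The following minimal sketch, using toy random embeddings and hypothetical antonym pairs rather than any model's actual weights, illustrates the three steps: projecting words onto antonym-pair directions, checking how much variance a few components of those projections capture, and the cosine-proportional off-target effect of shifting a token along one direction.

```python
# Minimal sketch, not the authors' code: toy random embeddings and made-up antonym
# pairs stand in for a real LLM embedding matrix and vocabulary.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["kind", "cruel", "strong", "weak", "happy", "sad", "friend", "enemy"]
emb = rng.normal(size=(len(vocab), 64))               # placeholder embedding matrix
idx = {w: i for i, w in enumerate(vocab)}

def direction(pos, neg):
    """Unit-norm semantic direction defined by an antonym pair, e.g. kind - cruel."""
    d = emb[idx[pos]] - emb[idx[neg]]
    return d / np.linalg.norm(d)

pairs = [("kind", "cruel"), ("strong", "weak"), ("happy", "sad")]
dirs = np.stack([direction(p, n) for p, n in pairs])  # (n_pairs, dim)

# 1) Projections of every word onto every antonym direction; these are the values
#    compared against human ratings.
proj = emb @ dirs.T                                   # (n_words, n_pairs)

# 2) With many antonym pairs, the singular values of the centered projection matrix
#    show how much variance the first few components (e.g. 3) explain.
centered = proj - proj.mean(axis=0)
s = np.linalg.svd(centered, compute_uv=False)
var_explained = s**2 / (s**2).sum()

# 3) Off-target effect: shifting a token by alpha along direction u changes its
#    projection on another direction v by exactly alpha * cos(u, v).
alpha, u, v = 2.0, dirs[0], dirs[1]
shifted = emb[idx["friend"]] + alpha * u
print(np.isclose((shifted - emb[idx["friend"]]) @ v, alpha * np.dot(u, v)))  # True
```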
Related papers
- Differential syntactic and semantic encoding in LLMs [49.300174325011426]
We study how syntactic and semantic information is encoded in inner layer representations of Large Language Models (LLMs). We find that the cross-layer encoding profiles of syntax and semantics are different, and that the two signals can to some extent be decoupled.
arXiv Detail & Related papers (2026-01-08T09:33:29Z)
Bridging the Semantic Gap for Categorical Data Clustering via Large Language Models [64.58262227709842]
ARISE (Attention-weighted Representation with Integrated Semantic Embeddings) is presented. It builds semantic-aware representations that complement the metric space of categorical data for accurate clustering. Experiments on eight benchmark datasets demonstrate consistent improvements over seven representative counterparts.
arXiv Detail & Related papers (2026-01-03T11:37:46Z)
Temporal Sparse Autoencoders: Leveraging the Sequential Nature of Language for Interpretability [31.30541946703775]
Translating internal representations and computations of models into concepts that humans can understand is a key goal of interpretability. Recent dictionary learning methods such as Sparse Autoencoders provide a promising route to discover human-interpretable features. But they exhibit a bias towards shallow, token-specific, or noisy features, such as "the phrase 'The' at the start of sentences".
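For context, a bare-bones sparse autoencoder of the kind this line of work builds on can be sketched as follows; the architecture here (ReLU encoder, linear decoder, L1 sparsity penalty) is the common baseline setup rather than this paper's temporal variant, and the activation data is a random placeholder rather than real model activations.

```python
# Minimal sparse-autoencoder sketch on placeholder "activations".
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)
        self.decoder = nn.Linear(d_dict, d_model)

    def forward(self, x):
        f = torch.relu(self.encoder(x))    # sparse feature activations
        return self.decoder(f), f

d_model, d_dict = 256, 2048
sae = SparseAutoencoder(d_model, d_dict)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
acts = torch.randn(4096, d_model)          # placeholder for residual-stream activations

for step in range(100):
    x = acts[torch.randint(0, acts.shape[0], (64,))]
    recon, feats = sae(x)
    loss = ((recon - x) ** 2).mean() + 1e-3 * feats.abs().mean()  # reconstruction + L1
    opt.zero_grad()
    loss.backward()
    opt.step()
```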
arXiv Detail & Related papers (2025-10-30T17:59:30Z)
Discovering Semantic Subdimensions through Disentangled Conceptual Representations [38.66662397064128]
This paper proposes a novel framework to investigate the subdimensions underlying coarse-grained semantic dimensions. We introduce a Disentangled Continuous Representation Model (DCSRM) that decomposes word embeddings from large language models into multiple sub-embeddings. Using these sub-embeddings, we identify a set of interpretable semantic subdimensions. Our work offers more fine-grained interpretable semantic subdimensions of conceptual meaning.
arXiv Detail & Related papers (2025-08-29T09:04:34Z)
Large Language Models Encode Semantics in Low-Dimensional Linear Subspaces [31.401762286885656]
Understanding the space geometry of large language models (LLMs) is key to interpreting their behavior and improving alignment. We investigate to what extent LLMs internally organize representations related to semantic understanding.
arXiv Detail & Related papers (2025-07-13T17:03:25Z)
From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning [52.32745233116143]
Humans organize knowledge into compact categories through semantic compression. Large Language Models (LLMs) demonstrate remarkable linguistic abilities. But whether their internal representations strike a human-like trade-off between compression and semantic fidelity is unclear.
arXiv Detail & Related papers (2025-05-21T16:29:00Z)
From the New World of Word Embeddings: A Comparative Study of Small-World Lexico-Semantic Networks in LLMs [47.52062992606549]
Lexico-semantic networks represent words as nodes and their semantic relatedness as edges. We construct lexico-semantic networks from the input embeddings of decoder-only large language models. Our results show that these networks exhibit small-world properties, characterized by high clustering and short path lengths.
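As a rough illustration of the recipe this summary describes, the sketch below builds a k-nearest-neighbour graph over toy embeddings by cosine similarity and computes the two quantities that characterize small-world structure; the embeddings, the choice of k, and the graph construction details are assumptions, not the paper's exact setup.

```python
# Small-world metrics on a kNN graph built from placeholder embeddings.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 32))                       # placeholder word embeddings
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
sim = emb @ emb.T
k = 5

G = nx.Graph()
G.add_nodes_from(range(len(emb)))
for i in range(len(emb)):
    neighbours = np.argsort(-sim[i])[1:k + 1]          # skip self at position 0
    G.add_edges_from((i, int(j)) for j in neighbours)

clustering = nx.average_clustering(G)
# Average shortest path length requires a connected graph; use the largest component.
giant = G.subgraph(max(nx.connected_components(G), key=len))
path_len = nx.average_shortest_path_length(giant)
print(f"clustering={clustering:.3f}, avg path length={path_len:.3f}")
```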
arXiv Detail & Related papers (2025-02-17T02:52:07Z)
A polar coordinate system represents syntax in large language models [12.244752597245645]
Syntactic trees may also be effectively represented in the activations of large language models. We introduce a 'Polar Probe' trained to read syntactic relations from both the distance and the direction between word embeddings. Our approach reveals three main findings. First, our Polar Probe successfully recovers the type and direction of syntactic relations, and substantially outperforms the Structural Probe by nearly two-fold.
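The sketch below is a heavily simplified stand-in for the idea, not the authors' Polar Probe: a linear probe over the offset between two placeholder word embeddings, predicting a hypothetical relation label, so that both the direction and the length of that offset can carry information.

```python
# Toy relation probe on randomly generated placeholder data.
import torch
import torch.nn as nn

n_relations = 10                              # hypothetical relation inventory
probe = nn.Linear(768, 2 * n_relations)       # each relation type in both directions
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)

h_head = torch.randn(512, 768)                # placeholder contextual embeddings
h_dep = torch.randn(512, 768)
labels = torch.randint(0, 2 * n_relations, (512,))

for _ in range(100):
    logits = probe(h_head - h_dep)            # offset carries distance and direction
    loss = nn.functional.cross_entropy(logits, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```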
arXiv Detail & Related papers (2024-12-07T07:37:20Z) - Semantic Loss Functions for Neuro-Symbolic Structured Prediction [74.18322585177832]
We discuss the semantic loss, which injects symbolically defined knowledge about the structure of the output space into training.
It is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby.
It can be combined with both discriminative and generative neural models.
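As a concrete, deliberately tiny illustration of the general idea, the sketch below computes a semantic loss for an "exactly-one" constraint: the negative log of the probability, under the model's independent output probabilities, that the constraint holds. The constraint and all tensors are illustrative assumptions, not the paper's benchmarks.

```python
# Hedged sketch of the semantic-loss idea for one simple symbolic constraint
# ("exactly one of n binary outputs is true").
import torch

def exactly_one_semantic_loss(p: torch.Tensor) -> torch.Tensor:
    """p: (batch, n) independent Bernoulli probabilities.
    P(exactly one) = sum_i p_i * prod_{j != i} (1 - p_j)."""
    q = (1.0 - p).clamp_min(1e-12)
    prod_q = q.prod(dim=-1, keepdim=True)
    prob_sat = (p * prod_q / q).sum(dim=-1)
    return -torch.log(prob_sat.clamp_min(1e-12)).mean()

logits = torch.randn(8, 5, requires_grad=True)
loss = exactly_one_semantic_loss(torch.sigmoid(logits))
loss.backward()   # fully differentiable, so it can be added to an ordinary task loss
```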
arXiv Detail & Related papers (2024-05-12T22:18:25Z)
Agentività e telicità in GilBERTo: implicazioni cognitive (Agentivity and Telicity in GilBERTo: Cognitive Implications) [77.71680953280436]
The goal of this study is to investigate whether a Transformer-based neural language model infers lexical semantics.
The semantic properties considered are telicity (also combined with definiteness) and agentivity.
arXiv Detail & Related papers (2023-07-06T10:52:22Z)
Syntax and Semantics Meet in the "Middle": Probing the Syntax-Semantics Interface of LMs Through Agentivity [68.8204255655161]
We present the semantic notion of agentivity as a case study for probing such interactions.
This suggests that LMs may serve as useful tools for linguistic annotation, theory testing, and discovery.
arXiv Detail & Related papers (2023-05-29T16:24:01Z)
Lexical semantics enhanced neural word embeddings [4.040491121427623]
Hierarchy-fitting is a novel approach to modelling the nuances of semantic similarity inherently stored in IS-A hierarchies.
Results demonstrate the efficacy of hierarchy-fitting in specialising neural embeddings with semantic relations via late fusion.
arXiv Detail & Related papers (2022-10-03T08:10:23Z)
CLINE: Contrastive Learning with Semantic Negative Examples for Natural Language Understanding [35.003401250150034]
We propose Contrastive Learning with semantIc Negative Examples (CLINE) to improve the robustness of pre-trained language models.
CLINE constructs semantic negative examples in an unsupervised manner to improve robustness under semantically adversarial attacks.
Empirical results show that our approach yields substantial improvements on a range of sentiment analysis, reasoning, and reading comprehension tasks.
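The sketch below shows the sentence-level contrastive piece of this kind of setup in simplified form (not CLINE's full training objective, which also involves token-level losses): the anchor is pulled toward a synonym-substituted positive and pushed away from an antonym-substituted negative. Encoder outputs are random placeholders.

```python
# Simplified contrastive loss with one semantic-preserving positive and one
# semantic-flipping negative per anchor.
import torch
import torch.nn.functional as F

def contrastive_loss(anchor, positive, negative, temperature: float = 0.1):
    """Pull anchor toward the positive and push it from the negative via cosine similarity."""
    sim_pos = F.cosine_similarity(anchor, positive, dim=-1) / temperature
    sim_neg = F.cosine_similarity(anchor, negative, dim=-1) / temperature
    logits = torch.stack([sim_pos, sim_neg], dim=-1)       # positive is class 0
    targets = torch.zeros(anchor.shape[0], dtype=torch.long)
    return F.cross_entropy(logits, targets)

anchor = torch.randn(16, 768)      # e.g. "the movie was fascinating"
positive = torch.randn(16, 768)    # synonym substitution: "the movie was interesting"
negative = torch.randn(16, 768)    # antonym substitution: "the movie was boring"
print(contrastive_loss(anchor, positive, negative))
```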
arXiv Detail & Related papers (2021-07-01T13:34:12Z)
Human Correspondence Consensus for 3D Object Semantic Understanding [56.34297279246823]
In this paper, we introduce a new dataset named CorresPondenceNet.
Based on this dataset, we are able to learn dense semantic embeddings with a novel geodesic consistency loss.
We show that CorresPondenceNet not only boosts fine-grained understanding of heterogeneous objects but also benefits cross-object registration and partial object matching.
arXiv Detail & Related papers (2019-12-29T04:24:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.