Curved Inference: Concern-Sensitive Geometry in Large Language Model Residual Streams
- URL: http://arxiv.org/abs/2507.21107v1
- Date: Tue, 08 Jul 2025 23:05:00 GMT
- Title: Curved Inference: Concern-Sensitive Geometry in Large Language Model Residual Streams
- Authors: Rob Manson
- Abstract summary: We propose a geometric interpretability framework that tracks how the residual stream trajectory of a large language model bends in response to shifts in semantic concern. We analyse Gemma3-1b and LLaMA3.2-3b using five native-space metrics, with a primary focus on curvature (κ_i) and salience (S(t)). We find that concern-shifted prompts reliably alter internal activation trajectories in both models.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We propose Curved Inference - a geometric interpretability framework that tracks how the residual stream trajectory of a large language model bends in response to shifts in semantic concern. Across 20 matched prompts spanning emotional, moral, perspective, logical, identity, environmental, and nonsense domains, we analyse Gemma3-1b and LLaMA3.2-3b using five native-space metrics, with a primary focus on curvature (κ_i) and salience (S(t)). These metrics are computed under a pullback semantic metric derived from the unembedding matrix, ensuring that all measurements reflect token-aligned geometry rather than raw coordinate structure. We find that concern-shifted prompts reliably alter internal activation trajectories in both models - with LLaMA exhibiting consistent, statistically significant scaling in both curvature and salience as concern intensity increases. Gemma also responds to concern but shows weaker differentiation between moderate and strong variants. Our results support a two-layer view of LLM geometry - a latent conceptual structure encoded in the embedding space, and a contextual trajectory shaped by prompt-specific inference. Curved Inference reveals how models navigate, reorient, or reinforce semantic meaning over depth, offering a principled method for diagnosing alignment, abstraction, and emergent inference dynamics, and a fresh lens on semantic abstraction and model alignment.
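The abstract does not spell out how κ_i and S(t) are discretized, so the following is a minimal sketch of one plausible reading: salience as the metric length of each layer-wise step of the residual stream, and curvature as the turn angle between successive steps, both measured under the pullback metric G = UᵀU induced by the unembedding matrix U. All function names are illustrative, not the paper's.

```python
import numpy as np

def pullback_gram(U):
    """Gram matrix G = U^T U of the unembedding map U (vocab x d), so that
    <x, y>_G = x^T G y measures directions by their effect on the logits."""
    return U.T @ U

def metric_norm(v, G):
    return float(np.sqrt(v @ G @ v))

def curvature_and_salience(H, G):
    """H: (num_layers + 1, d) residual stream states at one token position.
    Salience S(t) is read as the metric length of each layer-wise step;
    curvature kappa_i as the turn angle between successive steps per unit
    of local step length. One plausible discretization only."""
    steps = np.diff(H, axis=0)                       # layer-wise displacements
    S = np.array([metric_norm(s, G) for s in steps])
    kappa = []
    for i in range(len(steps) - 1):
        a, b = steps[i], steps[i + 1]
        cos = (a @ G @ b) / (S[i] * S[i + 1] + 1e-12)
        theta = np.arccos(np.clip(cos, -1.0, 1.0))   # turn angle under G
        kappa.append(theta / (0.5 * (S[i] + S[i + 1]) + 1e-12))
    return np.array(kappa), S
```

Measuring angles and lengths through G rather than the raw coordinates is what makes the trajectory "token-aligned" in the sense the abstract describes.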
Related papers
- Brep2Shape: Boundary and Shape Representation Alignment via Self-Supervised Transformers [46.87466345672103]
Boundary representation (B-rep) is the industry standard for computer-aided design (CAD). While deep learning shows promise in processing B-rep models, existing methods suffer from a representation gap. We introduce Brep2Shape, a novel self-supervised pre-training method designed to align abstract boundary representations with intuitive shape representations.
arXiv Detail & Related papers (2026-02-07T08:00:47Z) - Simulated Adoption: Decoupling Magnitude and Direction in LLM In-Context Conflict Resolution [3.0242762196828448]
Large Language Models (LLMs) frequently prioritize conflicting in-context information over pre-existing parametric memory. We show that models do not "unlearn" or suppress the magnitude of internal truths but rather employ a mechanism of geometric displacement.
arXiv Detail & Related papers (2026-02-04T06:13:11Z) - Riemannian Flow Matching for Disentangled Graph Domain Adaptation [51.98961391065951]
Graph Domain Adaptation (GDA) typically uses adversarial learning to align graph embeddings in Euclidean space. DisRFM is a geometry-aware GDA framework that unifies embedding and flow-based transport.
arXiv Detail & Related papers (2026-01-31T11:05:35Z) - Gauge-invariant representation holonomy [1.078600700827543]
Deep networks learn internal representations whose geometry - how features bend, rotate, and evolve - affects both generalization and robustness. Existing similarity measures such as CKA or SVCCA capture pointwise overlap between activation sets, but miss how representations change along input paths. We introduce representation holonomy, a gauge-invariant statistic that measures this path dependence.
arXiv Detail & Related papers (2026-01-29T12:51:17Z) - TangramPuzzle: Evaluating Multimodal Large Language Models with Compositional Spatial Reasoning [104.66714520975837]
We introduce a geometry-grounded benchmark designed to evaluate compositional spatial reasoning through the lens of the classic Tangram game. We propose the Tangram Construction Expression (TCE), a symbolic geometric framework that grounds tangram assemblies in exact, machine-verifiable coordinate specifications. We conduct extensive evaluation experiments on advanced open-source and proprietary models, revealing an interesting insight: MLLMs tend to prioritize matching the target silhouette while neglecting geometric constraints.
arXiv Detail & Related papers (2026-01-23T07:35:05Z) - Causal Manifold Fairness: Enforcing Geometric Invariance in Representation Learning [0.0]
We introduce Causal Manifold Fairness (CMF), a novel framework that bridges causal inference and geometric deep learning. By enforcing constraints on the Jacobian and Hessian of the decoder, CMF ensures that the rules of the latent space are preserved across demographic groups. We validate CMF on synthetic Structural Causal Models (SCMs), demonstrating that it effectively disentangles sensitive geometric warping while preserving task utility.
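As a rough illustration of the Jacobian side of this idea (the paper's actual constraints, including the Hessian term, are not reproduced), one could penalize differences in the decoder's local linearization at matched latent codes from two groups; `decoder`, `z_a`, and `z_b` are placeholders:

```python
import torch

def jacobian_match_penalty(decoder, z_a, z_b):
    """Hypothetical fairness term: penalize differences between the
    decoder's Jacobians at matched latent codes from two demographic
    groups, so the local 'rules' of the latent space are shared."""
    Ja = torch.autograd.functional.jacobian(decoder, z_a)
    Jb = torch.autograd.functional.jacobian(decoder, z_b)
    return (Ja - Jb).pow(2).mean()
```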
arXiv Detail & Related papers (2026-01-06T14:05:22Z) - GeoGNN: Quantifying and Mitigating Semantic Drift in Text-Attributed Graphs [59.61242815508687]
Graph neural networks (GNNs) on text-attributed graphs (TAGs) encode node texts using pretrained language models (PLMs) and propagate these embeddings through linear neighborhood aggregation. This work introduces a local PCA-based metric that measures the degree of semantic drift and provides the first quantitative framework to analyze how different aggregation mechanisms affect manifold structure.
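A guess at what such a local PCA-based drift score could look like (the paper's exact metric may differ): fit principal directions to a node's neighborhood embeddings and measure how far the aggregated embedding leaves that local subspace.

```python
import numpy as np

def semantic_drift(x_agg, neighbor_embs, k=8):
    """Hypothetical drift score: fit the top-k local principal directions
    to a node's neighborhood text embeddings, then measure the fraction
    of the aggregated embedding lying outside that local subspace."""
    mu = neighbor_embs.mean(axis=0)
    _, _, Vt = np.linalg.svd(neighbor_embs - mu, full_matrices=False)
    basis = Vt[:k]                           # (k, d) principal directions
    r = x_agg - mu
    in_plane = basis.T @ (basis @ r)         # projection onto the subspace
    return np.linalg.norm(r - in_plane) / (np.linalg.norm(r) + 1e-12)
```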
arXiv Detail & Related papers (2025-11-12T06:48:43Z) - The Curved Spacetime of Transformer Architectures [0.3670422696827525]
We present a geometric framework for understanding Transformer-based language models, drawing an explicit analogy to General Relativity. We show that token embeddings should not traverse straight paths in feature space; instead, their layer-wise steps should bend and reorient as interactions mediated by embedding space curvature.
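A crude way to probe the "no straight paths" claim on any model, without the paper's relativistic machinery: compare the chord between the first and last layer states to the length of the path actually traversed.

```python
import numpy as np

def straightness(H):
    """Chord-to-path ratio of a layer-wise trajectory H (layers x d):
    1.0 means a perfectly straight path; smaller means more bending."""
    steps = np.diff(H, axis=0)
    path = np.linalg.norm(steps, axis=1).sum()
    chord = np.linalg.norm(H[-1] - H[0])
    return chord / (path + 1e-12)
```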
arXiv Detail & Related papers (2025-11-04T22:58:40Z) - Modality Alignment across Trees on Heterogeneous Hyperbolic Manifolds [49.95082206008502]
Alignment across Trees is a method that constructs and aligns tree-like hierarchical features for both image and text modalities. We introduce a semantic-aware visual feature extraction framework that applies a cross-attention mechanism to visual class tokens from intermediate Transformer layers.
arXiv Detail & Related papers (2025-10-31T11:32:15Z) - Large Language Models Encode Semantics in Low-Dimensional Linear Subspaces [31.401762286885656]
Understanding the latent space geometry of large language models (LLMs) is key to interpreting their behavior and improving alignment. We investigate to what extent LLMs internally organize representations related to semantic understanding.
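A minimal sketch of how one might look for such low-dimensional linear structure (not the paper's method): SVD on a matrix of hidden states, keeping components up to a variance threshold.

```python
import numpy as np

def semantic_subspace(X, var_threshold=0.9):
    """Estimate a low-dimensional linear subspace from hidden states
    X (n_samples, d): keep the top singular directions explaining
    var_threshold of the variance. Names and threshold are illustrative."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(explained, var_threshold)) + 1
    return Vt[:k], k                         # subspace basis and dimension
```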
arXiv Detail & Related papers (2025-07-13T17:03:25Z) - Cross-Modal Geometric Hierarchy Fusion: An Implicit-Submap Driven Framework for Resilient 3D Place Recognition [4.196626042312499]
We propose a novel framework that redefines 3D place recognition through density-agnostic geometric reasoning. Specifically, we introduce an implicit 3D representation based on elastic points, which is immune to the interference of original scene point cloud density. With the aid of these two types of information, we obtain descriptors that fuse geometric information from both bird's-eye view and 3D segment perspectives.
arXiv Detail & Related papers (2025-06-17T07:04:07Z) - Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding [58.38294408121273]
We propose Cross-modal and Uncertainty-aware Agglomeration for Open-vocabulary 3D Scene Understanding, dubbed CUA-O3D. Our method addresses two key challenges: (1) incorporating semantic priors from VLMs alongside the geometric knowledge of spatially-aware vision foundation models, and (2) using a novel deterministic uncertainty estimation to capture model-specific uncertainties.
arXiv Detail & Related papers (2025-03-20T20:58:48Z) - Relative Representations: Topological and Geometric Perspectives [53.88896255693922]
Relative representations are an established approach to zero-shot model stitching. We introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotropic rescalings and permutations. We further propose to deploy topological densification when fine-tuning relative representations, a topological regularization loss encouraging clustering within classes.
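For context, the standard relative-representation construction re-expresses each sample by its cosine similarities to a fixed anchor set; the per-feature standardization below is only a stand-in for the normalization procedure the paper actually introduces.

```python
import numpy as np

def relative_representation(X, anchors):
    """Relative representations: describe each sample by its cosine
    similarities to a fixed anchor set. The standardization step is a
    guessed-at normalization for invariance to non-isotropic rescaling."""
    mu, sigma = X.mean(axis=0), X.std(axis=0) + 1e-12
    Xn = (X - mu) / sigma                    # undo per-axis rescaling
    An = (anchors - mu) / sigma
    Xn /= np.linalg.norm(Xn, axis=1, keepdims=True)
    An /= np.linalg.norm(An, axis=1, keepdims=True)
    return Xn @ An.T                         # (n_samples, n_anchors)
```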
arXiv Detail & Related papers (2024-09-17T08:09:22Z) - Landscaping Linear Mode Connectivity [76.39694196535996]
Linear mode connectivity (LMC) has garnered interest on both theoretical and practical fronts.
We take a step towards understanding it by providing a model of how the loss landscape needs to behave topographically for LMC to arise.
arXiv Detail & Related papers (2024-06-24T03:53:30Z) - Understanding Probe Behaviors through Variational Bounds of Mutual Information [53.520525292756005]
We provide guidelines for linear probing by constructing a novel mathematical framework leveraging information theory.
First, we connect probing with the variational bounds of mutual information (MI) to relax the probe design, equating linear probing with fine-tuning.
We show that intermediate representations can have the largest MI estimate because of the tradeoff between better separability and decreasing MI.
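The variational bound in question is typically I(Z; Y) ≥ H(Y) − E[−log q(y|z)], where q is the probe; a sketch with a linear probe (illustrative, in-sample only):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def probe_mi_lower_bound(Z, y):
    """Variational bound I(Z; Y) >= H(Y) - E[-log q(y|z)] with a linear
    probe as q. In-sample and illustrative only; a held-out split would
    be needed for an honest estimate. Values are in nats."""
    probe = LogisticRegression(max_iter=1000).fit(Z, y)
    ce = log_loss(y, probe.predict_proba(Z))         # cross-entropy, nats
    p = np.bincount(y) / len(y)
    h_y = -np.sum(p[p > 0] * np.log(p[p > 0]))       # marginal entropy H(Y)
    return h_y - ce
```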
arXiv Detail & Related papers (2023-12-15T18:38:18Z) - Understanding and Mitigating Hyperbolic Dimensional Collapse in Graph Contrastive Learning [70.0681902472251]
We propose a novel contrastive learning framework to learn high-quality graph embeddings in hyperbolic space. Specifically, we design an alignment metric that effectively captures the hierarchical data-invariant information. We show that in hyperbolic space one has to address the leaf- and height-level uniformity related to properties of trees.
arXiv Detail & Related papers (2023-10-27T15:31:42Z) - Curved Geometric Networks for Visual Anomaly Recognition [39.91252195360767]
Learning a latent embedding to understand the underlying nature of the data distribution is often formulated in Euclidean spaces with zero curvature.
In this work, we investigate the benefits of curved spaces for analyzing anomalies or out-of-distribution objects in data.
arXiv Detail & Related papers (2022-08-02T01:15:39Z) - Self-supervised Geometric Perception [96.89966337518854]
Self-supervised geometric perception (SGP) is a framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels.
We show that SGP achieves state-of-the-art performance that is on par with or superior to the supervised oracles trained using ground-truth labels.
arXiv Detail & Related papers (2021-03-04T15:34:43Z) - GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning [54.291331971813364]
Offline reinforcement learning approaches can be divided into proximal and uncertainty-aware methods.
In this work, we demonstrate the benefit of combining the two in a latent variational model.
Our proposed metrics measure both the quality of out-of-distribution samples and the discrepancy of examples in the data.
arXiv Detail & Related papers (2021-02-22T19:42:40Z) - Identifying the latent space geometry of network models through analysis of curvature [7.644165047073435]
We present a method to consistently estimate the manifold type, dimension, and curvature from an empirically relevant class of latent spaces.
Our core insight comes from representing the graph as a noisy distance matrix based on the ties between cliques.
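One classical ingredient such an approach can build on (the clique-based distance construction itself is not reproduced): double-center the squared distance matrix as in classical MDS and read geometry off the Gram spectrum, since negative eigenvalues rule out an exact Euclidean embedding.

```python
import numpy as np

def gram_spectrum(D):
    """Classical-MDS diagnostic: double-center the squared distance matrix
    D (n x n). Markedly negative eigenvalues of the resulting Gram matrix
    rule out an exact Euclidean embedding (hinting at curved geometry),
    while the decay of the positive ones hints at latent dimension."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D**2) @ J                # doubly-centered Gram matrix
    return np.linalg.eigvalsh(B)[::-1]       # eigenvalues, descending
```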
arXiv Detail & Related papers (2020-12-19T00:35:29Z)