Related papers: The Information Geometry of Softmax: Probing and Steering

URL: http://arxiv.org/abs/2602.15293v1
Date: Tue, 17 Feb 2026 01:33:28 GMT
Title: The Information Geometry of Softmax: Probing and Steering
Authors: Kiho Park, Todd Nief, Yo Joong Choe, Victor Veitch
Abstract summary: We argue that the natural geometry of representation spaces should reflect the way models use representations to produce behavior. Our focus is on the role of information geometry on semantic encoding and the linear representation hypothesis. As an illustrative application, we develop "dual steering", a method for robustly steering representations to exhibit a particular concept.
Score: 18.006877307358348
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper concerns the question of how AI systems encode semantic structure into the geometric structure of their representation spaces. The motivating observation of this paper is that the natural geometry of these representation spaces should reflect the way models use representations to produce behavior. We focus on the important special case of representations that define softmax distributions. In this case, we argue that the natural geometry is information geometry. Our focus is on the role of information geometry on semantic encoding and the linear representation hypothesis. As an illustrative application, we develop "dual steering", a method for robustly steering representations to exhibit a particular concept using linear probes. We prove that dual steering optimally modifies the target concept while minimizing changes to off-target concepts. Empirically, we find that dual steering enhances the controllability and stability of concept manipulation.

Related papers

Brep2Shape: Boundary and Shape Representation Alignment via Self-Supervised Transformers [46.87466345672103]
Boundary representation (B-rep) is the industry standard for computer-aided design (CAD). While deep learning shows promise in processing B-rep models, existing methods suffer from a representation gap. We introduce Brep2Shape, a novel self-supervised pre-training method designed to align abstract boundary representations with intuitive shape representations.
arXiv Detail & Related papers (2026-02-07T08:00:47Z)
Curved Inference: Concern-Sensitive Geometry in Large Language Model Residual Streams [0.0]
We propose a geometric interpretability framework that tracks how the residual stream trajectory of a large language model bends in response to shifts in semantic concern. We analyse Gemma3-1b and LLaMA3.2-3b using five native-space metrics, with a primary focus on curvature (kappa_i) and salience (S(t)). We find that concern-shifted prompts reliably alter internal activation trajectories in both models.
arXiv Detail & Related papers (2025-07-08T23:05:00Z)
Cross-Modal Geometric Hierarchy Fusion: An Implicit-Submap Driven Framework for Resilient 3D Place Recognition [9.411542547451193]
We propose a novel framework that redefines 3D place recognition through density-agnostic geometric reasoning. Specifically, we introduce an implicit 3D representation based on elastic points, which is immune to the interference of original scene point cloud density. With the aid of these two types of information, we obtain descriptors that fuse geometric information from both bird's-eye view and 3D segment perspectives.
arXiv Detail & Related papers (2025-06-17T07:04:07Z)
Geometry matters: insights from Ollivier Ricci Curvature and Ricci Flow into representational alignment through Ollivier-Ricci Curvature and Ricci Flow [0.014893065504013906]
This work introduces a framework using Ollivier Ricci Curvature and Ricci Flow to analyze the fine-grained local structure of representations. We apply it to compare human similarity judgments for 2D and 3D face stimuli with a baseline 2D native network (VGG-Face) and a variant of it aligned to human behavior.
arXiv Detail & Related papers (2025-01-01T18:33:48Z)
A Geometry-Aware Message Passing Neural Network for Modeling Aerodynamics over Airfoils [61.60175086194333]
Aerodynamics is a key problem in aerospace engineering, often involving flows interacting with solid objects such as airfoils. Here, we consider modeling of incompressible flows over solid objects, wherein geometric structures are a key factor in determining aerodynamics. To effectively incorporate geometries, we propose a message passing scheme that efficiently and expressively integrates the airfoil shape with the mesh representation. These design choices lead to a purely data-driven machine learning framework known as GeoMPNN, which won the Best Student Submission award at the NeurIPS 2024 ML4CFD Competition, placing 4th overall.
arXiv Detail & Related papers (2024-12-12T16:05:39Z)
Geometry Distributions [51.4061133324376]
We propose a novel geometric data representation that models geometry as distributions. Our approach uses diffusion models with a novel network architecture to learn surface point distributions. We evaluate our representation qualitatively and quantitatively across various object types, demonstrating its effectiveness in achieving high geometric fidelity.
arXiv Detail & Related papers (2024-11-25T04:06:48Z)
Geometric Methods for Sampling, Optimisation, Inference and Adaptive Agents [102.42623636238399]
We identify fundamental geometric structures that underlie the problems of sampling, optimisation, inference and adaptive decision-making. We derive algorithms that exploit these geometric structures to solve these problems efficiently.
arXiv Detail & Related papers (2022-03-20T16:23:17Z)
DeepMLS: Geometry-Aware Control Point Deformation [76.51312491336343]
We introduce DeepMLS, a space-based deformation technique, guided by a set of displaced control points. We leverage the power of neural networks to inject the underlying shape geometry into the deformation parameters. Our technique facilitates intuitive piecewise smooth deformations, which are well suited for manufactured objects.
arXiv Detail & Related papers (2022-01-05T23:55:34Z)
Self-supervised Geometric Perception [96.89966337518854]
Self-supervised geometric perception is a framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels. We show that SGP achieves state-of-the-art performance that is on-par or superior to the supervised oracles trained using ground-truth labels.
arXiv Detail & Related papers (2021-03-04T15:34:43Z)
The Geometry of Deep Generative Image Models and its Applications [0.0]
Generative adversarial networks (GANs) have emerged as a powerful unsupervised method to model the statistical patterns of real-world data sets. These networks are trained to map random inputs in their latent space to new samples representative of the learned data. The structure of the latent space is hard to intuit due to its high dimensionality and the non-linearity of the generator.
arXiv Detail & Related papers (2021-01-15T07:57:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.