Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry
- URL: http://arxiv.org/abs/2510.08638v1
- Date: Wed, 08 Oct 2025 22:42:20 GMT
- Title: Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry
- Authors: Thomas Fel, Binxu Wang, Michael A. Lepori, Matthew Kowal, Andrew Lee, Randall Balestriero, Sonia Joseph, Ekdeep S. Lubana, Talia Konkle, Demba Ba, Martin Wattenberg,
- Abstract summary: DINOv2 is routinely deployed to recognize objects, scenes, and actions; yet the nature of what it perceives remains unknown. As a working baseline, we adopt the Linear Representation Hypothesis (LRH) and operationalize it using SAEs. We produce a 32,000-unit dictionary that serves as the interpretability backbone of our study.
- Score: 31.26429968473424
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: DINOv2 is routinely deployed to recognize objects, scenes, and actions; yet the nature of what it perceives remains unknown. As a working baseline, we adopt the Linear Representation Hypothesis (LRH) and operationalize it using SAEs, producing a 32,000-unit dictionary that serves as the interpretability backbone of our study, which unfolds in three parts. In the first part, we analyze how different downstream tasks recruit concepts from our learned dictionary, revealing functional specialization: classification exploits "Elsewhere" concepts that fire everywhere except on target objects, implementing learned negations; segmentation relies on boundary detectors forming coherent subspaces; depth estimation draws on three distinct monocular depth cues matching visual neuroscience principles. Following these functional results, we analyze the geometry and statistics of the concepts learned by the SAE. We find that representations are partly dense rather than strictly sparse. The dictionary evolves toward greater coherence and departs from maximally orthogonal ideals (Grassmannian frames). Within an image, tokens occupy a low-dimensional, locally connected set that persists after removing position. These signs suggest representations are organized beyond linear sparsity alone. Synthesizing these observations, we propose a refined view: tokens are formed by combining convex mixtures of archetypes (e.g., a rabbit among animals, brown among colors, fluffy among textures). This structure is grounded in Gärdenfors' conceptual spaces and in the model's mechanism, as multi-head attention produces sums of convex mixtures, defining regions bounded by archetypes. We introduce the Minkowski Representation Hypothesis (MRH) and examine its empirical signatures and implications for interpreting vision-transformer representations.
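The Minkowski Representation Hypothesis sketched in the abstract can be illustrated with a toy computation (a hypothetical sketch, not the paper's code): each attention head contributes a convex mixture of archetype vectors, and the token embedding is the sum (a Minkowski-style combination) of those per-head mixtures. The archetype sets and weights below are invented for illustration.

```python
# Toy sketch of the Minkowski Representation Hypothesis (MRH):
# each head outputs a convex mixture of its archetypes, and the
# token is the sum of these per-head mixtures. Archetype vectors
# and mixture weights below are made up for illustration.

def convex_mixture(archetypes, weights):
    """Weighted average of archetype vectors; weights are non-negative
    and sum to 1, so the result lies inside their convex hull."""
    assert all(w >= 0 for w in weights)
    assert abs(sum(weights) - 1.0) < 1e-9
    dim = len(archetypes[0])
    return [sum(w * a[i] for w, a in zip(weights, archetypes))
            for i in range(dim)]

def minkowski_token(heads):
    """Sum the per-head convex mixtures (a Minkowski-sum combination)."""
    dim = len(heads[0][0][0])
    token = [0.0] * dim
    for archetypes, weights in heads:
        mix = convex_mixture(archetypes, weights)
        token = [t + m for t, m in zip(token, mix)]
    return token

# Hypothetical "animal" and "color" archetype sets in a 2-D latent space.
animal_archetypes = [[1.0, 0.0], [0.0, 1.0]]   # e.g. rabbit, dog
color_archetypes  = [[2.0, 2.0], [4.0, 0.0]]   # e.g. brown, white

token = minkowski_token([
    (animal_archetypes, [0.8, 0.2]),  # mostly "rabbit"
    (color_archetypes,  [0.5, 0.5]),  # halfway between the two colors
])
print(token)
```

Because each per-head mixture is confined to the convex hull of that head's archetypes, the resulting tokens occupy a bounded region whose corners are determined by archetype combinations, matching the "regions bounded by archetypes" picture in the abstract.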
Related papers
- The Origins of Representation Manifolds in Large Language Models [52.68554895844062]
We show that cosine similarity in representation space may encode the intrinsic geometry of a feature through shortest, on-manifold paths. The critical assumptions and predictions of the theory are validated on text embeddings and token activations of large language models.
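A minimal illustration of this kind of claim (my sketch, not the paper's construction): for a one-dimensional feature manifold embedded as the unit circle, the cosine similarity between two embeddings is exactly the cosine of their geodesic (arc-length) distance, so similarity is a monotone function of on-manifold distance.

```python
import math

# Toy check: for a 1-D feature manifold embedded as the unit circle,
# cosine similarity between two embeddings equals cos(arc length),
# i.e. it is a monotone function of on-manifold (geodesic) distance.

def embed(theta):
    """Embed a scalar feature value as a point on the unit circle."""
    return (math.cos(theta), math.sin(theta))

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

a, b = embed(0.3), embed(1.1)
geodesic = abs(1.1 - 0.3)            # arc length on the unit circle
sim = cosine_similarity(a, b)
print(sim, math.cos(geodesic))       # the two quantities agree
```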
arXiv Detail & Related papers (2025-05-23T13:31:22Z) - Concept-Guided Interpretability via Neural Chunking [64.6429903327095]
We show that neural networks exhibit patterns in their raw population activity that mirror regularities in the training data. We propose three methods to extract recurring chunks on a neural population level. Our work points to a new direction for interpretability, one that harnesses both cognitive principles and the structure of naturalistic data.
arXiv Detail & Related papers (2025-05-16T13:49:43Z) - Understanding In-context Learning of Addition via Activation Subspaces [73.8295576941241]
We study a structured family of few-shot learning tasks for which the true prediction rule is to add an integer $k$ to the input. We then perform an in-depth analysis of individual heads, via dimensionality reduction and decomposition. Our results demonstrate how tracking low-dimensional subspaces of localized heads across a forward pass can provide insight into fine-grained computational structures in language models.
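The low-dimensional analysis described above can be approximated with a generic tool (a sketch, not the paper's actual decomposition): estimate the top principal direction of a set of activation vectors via power iteration on their covariance, then project activations onto it. The synthetic activations below stand in for real head outputs.

```python
# Generic sketch of finding a low-dimensional activation subspace:
# estimate the top principal direction of centered activation vectors
# via power iteration. The synthetic data below stands in for real
# attention-head activations.

def top_principal_direction(points, iters=200):
    dim = len(points[0])
    n = len(points)
    # Center the data.
    mean = [sum(p[i] for p in points) / n for i in range(dim)]
    centered = [[p[i] - mean[i] for i in range(dim)] for p in points]
    # Power iteration on the (implicit) covariance matrix C = X^T X.
    v = [1.0] * dim
    for _ in range(iters):
        # Apply C as X^T (X v) without forming the matrix.
        proj = [sum(c[i] * v[i] for i in range(dim)) for c in centered]
        w = [sum(proj[k] * centered[k][i] for k in range(n))
             for i in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Activations that vary almost entirely along the direction (1, 1, 0).
acts = [[t, t + 0.01 * ((-1) ** k), 0.0] for k, t in enumerate(range(10))]
direction = top_principal_direction(acts)
print(direction)   # close to (0.707, 0.707, 0): a 1-D subspace
```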
arXiv Detail & Related papers (2025-05-08T11:32:46Z) - The Geometry of Categorical and Hierarchical Concepts in Large Language Models [15.126806053878855]
We show how to extend the formalization of the linear representation hypothesis to represent features (e.g., is_animal) as vectors. We use the formalization to prove a relationship between the hierarchical structure of concepts and the geometry of their representations. We validate these theoretical results on the Gemma and LLaMA-3 large language models, estimating representations for 900+ hierarchically related concepts using data from WordNet.
arXiv Detail & Related papers (2024-06-03T16:34:01Z) - Explaining Explainability: Recommendations for Effective Use of Concept Activation Vectors [35.37586279472797]
Concept Activation Vectors (CAVs) are learnt using a probe dataset of concept exemplars. We investigate three properties of CAVs: (1) inconsistency across layers, (2) entanglement with other concepts, and (3) spatial dependency. We introduce tools designed to detect the presence of these properties, provide insight into how each property can lead to misleading explanations, and provide recommendations to mitigate their impact.
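A lightweight CAV-style direction can be sketched as follows (a simplified variant under my own assumptions: real CAVs are usually the normal vector of a trained linear probe, and the activations here are synthetic): take the difference of mean activations between concept exemplars and random examples, and score new activations by their projection onto it. Layer or spatial dependency would show up as this direction changing across layers or token positions.

```python
# Minimal CAV-style concept direction: the normalized difference of
# mean activations between concept exemplars and random examples.
# This is a simplification of the usual linear-probe construction;
# all activations below are synthetic.

def mean_vector(points):
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]

def concept_direction(concept_acts, random_acts):
    mu_c = mean_vector(concept_acts)
    mu_r = mean_vector(random_acts)
    diff = [c - r for c, r in zip(mu_c, mu_r)]
    norm = sum(d * d for d in diff) ** 0.5
    return [d / norm for d in diff]

def concept_score(activation, direction):
    """Signed projection of an activation onto the concept direction."""
    return sum(a * d for a, d in zip(activation, direction))

# Synthetic activations: the concept shifts the first coordinate.
concept_acts = [[1.0, 0.2], [1.2, -0.1], [0.9, 0.0]]
random_acts  = [[0.0, 0.1], [-0.1, -0.2], [0.1, 0.0]]

cav = concept_direction(concept_acts, random_acts)
print(cav)   # points mostly along the first axis
print(concept_score([1.1, 0.0], cav) > concept_score([0.0, 0.0], cav))
```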
arXiv Detail & Related papers (2024-04-04T17:46:20Z) - On convex decision regions in deep network representations [1.06378109904813]
We investigate the notion of convexity of concept regions in machine-learned latent spaces.
We show that convexity is robust to basic re-parametrization.
We find that approximate convexity is pervasive in neural representations in multiple application domains.
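The convexity notion above can be probed with a simple test (an illustrative sketch, not the paper's protocol): sample convex combinations of same-class latent points and check whether a classifier still assigns the original class; the fraction of preserved labels is an approximate-convexity score. The data and nearest-centroid rule below are stand-ins.

```python
import random

# Illustrative convexity probe: for pairs of same-class latent points,
# check that points on the segment between them still classify to the
# same class under a nearest-centroid rule. Data below is synthetic.

def nearest_centroid_label(x, centroids):
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(centroids, key=lambda lbl: dist2(x, centroids[lbl]))

def convexity_score(points_by_label, centroids, n_pairs=200, seed=0):
    """Fraction of sampled within-class interpolations whose label
    is preserved; 1.0 means the probed regions look convex."""
    rng = random.Random(seed)
    hits = trials = 0
    for label, pts in points_by_label.items():
        for _ in range(n_pairs):
            p, q = rng.choice(pts), rng.choice(pts)
            t = rng.random()
            mix = [t * a + (1 - t) * b for a, b in zip(p, q)]
            trials += 1
            hits += nearest_centroid_label(mix, centroids) == label
    return hits / trials

points = {
    "cat": [[0.0, 0.0], [1.0, 0.2], [0.5, -0.3]],
    "dog": [[5.0, 5.0], [6.0, 4.8], [5.5, 5.3]],
}
centroids = {lbl: [sum(p[i] for p in pts) / len(pts) for i in range(2)]
             for lbl, pts in points.items()}
score = convexity_score(points, centroids)
print(score)   # 1.0 for these well-separated, convex clusters
```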
arXiv Detail & Related papers (2023-05-26T10:33:03Z) - Zero-shot point cloud segmentation by transferring geometric primitives [68.18710039217336]
We investigate zero-shot point cloud semantic segmentation, where the network is trained on seen objects and able to segment unseen objects.
We propose a novel framework to learn the geometric primitives shared in seen and unseen categories' objects and employ a fine-grained alignment between language and the learned geometric primitives.
arXiv Detail & Related papers (2022-10-18T15:06:54Z) - 3D Concept Grounding on Neural Fields [99.33215488324238]
Existing visual reasoning approaches typically utilize supervised methods to extract 2D segmentation masks on which concepts are grounded.
Humans are capable of grounding concepts on the underlying 3D representation of images.
We propose to leverage the continuous, differentiable nature of neural fields to segment and learn concepts.
arXiv Detail & Related papers (2022-07-13T17:59:33Z) - Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z) - Points2Vec: Unsupervised Object-level Feature Learning from Point Clouds [25.988556827312483]
Similar representation learning techniques have not yet become commonplace in the context of 3D vision.
We learn these vector representations by mining a dataset of scanned 3D spaces using an unsupervised algorithm.
We show that using our method to include context increases the ability of a clustering algorithm to distinguish different semantic classes from each other.
arXiv Detail & Related papers (2021-02-08T11:29:57Z) - SOSD-Net: Joint Semantic Object Segmentation and Depth Estimation from Monocular Images [94.36401543589523]
We introduce the concept of semantic objectness to exploit the geometric relationship of these two tasks.
We then propose a Semantic Object and Depth Estimation Network (SOSD-Net) based on the objectness assumption.
To the best of our knowledge, SOSD-Net is the first network that exploits the geometry constraint for simultaneous monocular depth estimation and semantic segmentation.
arXiv Detail & Related papers (2021-01-19T02:41:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.