Revisiting the Platonic Representation Hypothesis: An Aristotelian View
- URL: http://arxiv.org/abs/2602.14486v1
- Date: Mon, 16 Feb 2026 06:01:23 GMT
- Title: Revisiting the Platonic Representation Hypothesis: An Aristotelian View
- Authors: Fabian Gröger, Shuo Wen, Maria Brbić
- Abstract summary: We show that the existing metrics used to measure representational similarity are confounded by network scale. We introduce a permutation-based null-calibration framework that transforms any representational similarity metric into a calibrated score with statistical guarantees. We propose the Aristotelian Representation Hypothesis: representations in neural networks are converging to shared local neighborhood relationships.
- Score: 3.647057737530591
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Platonic Representation Hypothesis suggests that representations from neural networks are converging to a common statistical model of reality. We show that the existing metrics used to measure representational similarity are confounded by network scale: increasing model depth or width can systematically inflate representational similarity scores. To correct these effects, we introduce a permutation-based null-calibration framework that transforms any representational similarity metric into a calibrated score with statistical guarantees. We revisit the Platonic Representation Hypothesis with our calibration framework, which reveals a nuanced picture: the apparent convergence reported by global spectral measures largely disappears after calibration, while local neighborhood similarity, but not local distances, retains significant agreement across different modalities. Based on these findings, we propose the Aristotelian Representation Hypothesis: representations in neural networks are converging to shared local neighborhood relationships.
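The core idea of the calibration framework can be sketched in a few lines (hypothetical names; this is an illustrative reimplementation of the general recipe, not the paper's code). The sketch uses a k-nearest-neighbor Jaccard overlap as the similarity metric, in the spirit of the local neighborhood measures the abstract highlights, and z-scores the observed value against a null distribution obtained by permuting the sample correspondence between the two representations:

```python
import numpy as np

def knn_overlap(X, Y, k=10):
    """Mean Jaccard overlap of the k-NN neighborhoods of two
    representations X, Y with rows in correspondence (shape: n x d)."""
    def knn(Z):
        D = np.linalg.norm(Z[:, None] - Z[None, :], axis=-1)
        np.fill_diagonal(D, np.inf)          # exclude self from neighbors
        return np.argsort(D, axis=1)[:, :k]
    A, B = knn(X), knn(Y)
    return float(np.mean([len(set(a) & set(b)) / len(set(a) | set(b))
                          for a, b in zip(A, B)]))

def calibrated_score(X, Y, metric=knn_overlap, n_perm=200, seed=0):
    """Permutation null calibration: z-score the observed similarity
    against a null built by shuffling which rows of Y correspond to X."""
    rng = np.random.default_rng(seed)
    obs = metric(X, Y)
    null = np.array([metric(X, Y[rng.permutation(len(Y))])
                     for _ in range(n_perm)])
    return (obs - null.mean()) / (null.std() + 1e-12)
```

Under this scheme, a raw score is only meaningful relative to what the same metric produces on correspondence-shuffled inputs: aligned representations yield a large positive z-score, while unrelated ones should hover near zero, regardless of how the metric's raw scale drifts with model width or depth.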
Related papers
- Barycentric alignment for instance-level comparison of neural representations [2.1920579994942164]
We introduce a barycentric alignment framework that quotients out nuisance symmetries to construct a universal embedding space across many models. We identify systematic input properties that predict representational convergence versus divergence across vision and language model families. We also apply the same barycentric alignment framework to purely unimodal vision and language models and find that post-hoc alignment into a shared space yields image-text similarity scores.
arXiv Detail & Related papers (2026-02-09T21:49:44Z)
- VIKING: Deep variational inference with stochastic projections [48.946143517489496]
Variational mean field approximations tend to struggle with contemporary overparametrized deep neural networks. We propose a simple variational family that considers two independent linear subspaces of the parameter space. This allows us to build a fully-correlated approximate posterior reflecting the overparametrization.
arXiv Detail & Related papers (2025-10-27T15:38:35Z)
- When Does Closeness in Distribution Imply Representational Similarity? An Identifiability Perspective [9.578534178372829]
We prove that a small Kullback--Leibler divergence between the model distributions does not guarantee that the corresponding representations are similar. We then define a distributional distance for which closeness implies representational similarity. In synthetic experiments, we find that wider networks learn distributions which are closer with respect to our distance and have more similar representations.
arXiv Detail & Related papers (2025-06-04T09:44:22Z)
- Continuous Representation Methods, Theories, and Applications: An Overview and Perspectives [55.22101595974193]
Recently, continuous representation methods have emerged as novel paradigms that characterize the intrinsic structures of real-world data. This review focuses on three aspects: (i) continuous representation method designs such as basis function representation, statistical modeling, tensor function decomposition, and implicit neural representation; (ii) theoretical foundations of continuous representations such as approximation error analysis, convergence properties, and implicit regularization; and (iii) real-world applications of continuous representations in computer vision, graphics, bioinformatics, and remote sensing.
arXiv Detail & Related papers (2025-05-21T07:50:19Z)
- Discriminating image representations with principal distortions [13.823252055829661]
We propose a framework for comparing a set of image representations in terms of their local geometries. We show how our framework can be used to probe for informative differences in local sensitivities between complex models.
arXiv Detail & Related papers (2024-10-20T16:04:37Z)
- The Platonic Representation Hypothesis [35.16414255187554]
We argue that representations in AI models, particularly deep networks, are converging.
As vision models and language models get larger, they measure distances between datapoints in increasingly similar ways.
We hypothesize that this convergence is driving toward a shared statistical model of reality, akin to Plato's concept of an ideal reality.
arXiv Detail & Related papers (2024-05-13T17:58:30Z)
- Learning for Transductive Threshold Calibration in Open-World Recognition [83.35320675679122]
We introduce OpenGCN, a Graph Neural Network-based transductive threshold calibration method with enhanced robustness and adaptability.
Experiments across open-world visual recognition benchmarks validate OpenGCN's superiority over existing posthoc calibration methods for open-world threshold calibration.
arXiv Detail & Related papers (2023-05-19T23:52:48Z)
- Counting Like Human: Anthropoid Crowd Counting on Modeling the Similarity of Objects [92.80955339180119]
Mainstream crowd counting methods regress a density map and integrate it to obtain counting results.
Inspired by how humans count by modeling the similarity of objects, we propose a rational and anthropoid crowd counting framework.
arXiv Detail & Related papers (2022-12-02T07:00:53Z)
- Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition [54.23513799338309]
We present an Adaptive Local-Component-aware Graph Convolutional Network for skeleton-based action recognition.
Our method provides a stronger representation than the global embedding and helps our model reach state-of-the-art performance.
arXiv Detail & Related papers (2022-09-21T02:33:07Z)
- Deconfounded Representation Similarity for Comparison of Neural Networks [16.23053104309891]
Similarity metrics are confounded by the population structure of data items in the input space.
We show that deconfounding the similarity metrics increases the resolution of detecting semantically similar neural networks.
arXiv Detail & Related papers (2022-01-31T21:25:02Z)
- Image Synthesis via Semantic Composition [74.68191130898805]
We present a novel approach to synthesize realistic images based on their semantic layouts.
It hypothesizes that objects with similar appearance share similar representations.
Our method establishes dependencies between regions according to their appearance correlation, yielding both spatially variant and associated representations.
arXiv Detail & Related papers (2021-09-15T02:26:07Z)
- Learning Disentangled Representations with Latent Variation Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.