Emergence of Quantised Representations Isolated to Anisotropic Functions
- URL: http://arxiv.org/abs/2507.12070v2
- Date: Wed, 30 Jul 2025 09:07:28 GMT
- Title: Emergence of Quantised Representations Isolated to Anisotropic Functions
- Authors: George Bird
- Abstract summary: This paper builds upon the existing Spotlight Resonance method to determine representational alignment. A new tool is used to gain insight into how discrete representations can emerge and organise in autoencoder models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a novel methodology for determining representational alignment, which builds upon the existing Spotlight Resonance method. In particular, this new tool is used to gain insight into how discrete representations can emerge and organise in autoencoder models, through a controlled ablation study in which only the activation function is altered. This technique is used to test whether function-driven symmetries can act as implicit inductive biases on representations. Representations are found to tend to discretise when the activation functions are defined through a discrete algebraic permutation-equivariant symmetry. In contrast, they remain continuous under a continuous algebraic orthogonal-equivariant definition. This confirms the hypothesis: algebraic symmetries of network primitives can carry unintended inductive biases which produce task-independent artefactual structures in representations. The discrete symmetry of contemporary functional forms is shown to be a strong predictor for the induction of discrete representations transformed from otherwise continuous structures -- a quantisation effect. This motivates further reassessment of functional forms in common usage. Moreover, it supports a general causal model for one mode in which discrete representations may form, and could constitute a prerequisite for downstream interpretability phenomena, including grandmother neurons, discrete coding schemes, general linear features and possibly Superposition. Hence, this tool and the proposed mechanism for the influence of functional form on representations may provide insights into emergent interpretability research. Finally, preliminary results indicate that quantisation of representations appears to correlate with a measurable increase in reconstruction error, reinforcing previous conjectures that this collapse can be detrimental.
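To make the symmetry distinction in the abstract concrete, below is a minimal sketch, not the paper's implementation: the function names and the particular isotropic form are illustrative assumptions. It contrasts an elementwise activation, which commutes only with the discrete group of coordinate permutations, against an isotropic activation acting on the vector norm, which commutes with the full continuous orthogonal group.

```python
import numpy as np

def elementwise_tanh(z: np.ndarray) -> np.ndarray:
    # Applied per coordinate: equivariant only under the discrete group of
    # coordinate permutations (and sign flips), i.e. f(P z) = P f(z).
    return np.tanh(z)

def isotropic_tanh(z: np.ndarray) -> np.ndarray:
    # Illustrative isotropic form (an assumption, not the paper's exact choice):
    # acts on the norm alone and leaves direction untouched, so it is
    # equivariant under any orthogonal Q, i.e. f(Q z) = Q f(z).
    r = np.linalg.norm(z)
    return z if r == 0 else np.tanh(r) / r * z

rng = np.random.default_rng(0)
z = rng.normal(size=4)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))  # a random orthogonal matrix

# The isotropic form commutes with the rotation; the elementwise form does not.
print(np.allclose(isotropic_tanh(Q @ z), Q @ isotropic_tanh(z)))      # True
print(np.allclose(elementwise_tanh(Q @ z), Q @ elementwise_tanh(z)))  # False (generically)
```

Under the abstract's hypothesis, the first form carries a preferred discrete basis as an implicit inductive bias, while the second imposes no preferred directions on the representation.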
Related papers
- A Free Probabilistic Framework for Analyzing the Transformer-based Language Models [19.78896931593813]
We outline an operator-theoretic framework for analyzing transformer-based language models. We reinterpret attention as a non-commutative convolution and view the layer-wise propagation of representations as an evolution governed by free additive convolution.
arXiv Detail & Related papers (2025-06-19T19:13:02Z) - Solving Inverse Problems in Stochastic Self-Organising Systems through Invariant Representations [12.394699094197545]
Self-organising systems demonstrate how simple local rules can generate complex patterns. Many natural systems rely on such dynamics, making self-organisation central to understanding natural complexity. A fundamental challenge in modelling such systems is solving the inverse problem: finding the unknown causal parameters from macroscopic observations.
arXiv Detail & Related papers (2025-06-13T14:01:39Z) - The Origins of Representation Manifolds in Large Language Models [52.68554895844062]
We show that cosine similarity in representation space may encode the intrinsic geometry of a feature through shortest, on-manifold paths. The critical assumptions and predictions of the theory are validated on text embeddings and token activations of large language models.
arXiv Detail & Related papers (2025-05-23T13:31:22Z) - Interpreting Equivariant Representations [4.738231680800414]
In this paper, we demonstrate that the inductive bias imposed on latent representations by an equivariant model must also be taken into account when using them. We show how not accounting for these inductive biases leads to decreased performance on downstream tasks, and vice versa.
arXiv Detail & Related papers (2024-01-23T09:43:30Z) - Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse Actions, Interventions and Sparse Temporal Dependencies [58.179981892921056]
This work introduces a novel principle for disentanglement we call mechanism sparsity regularization.
We propose a representation learning method that induces disentanglement by simultaneously learning the latent factors.
We show that the latent factors can be recovered by regularizing the learned causal graph to be sparse.
arXiv Detail & Related papers (2024-01-10T02:38:21Z) - Unifying Causal Inference and Reinforcement Learning using Higher-Order Category Theory [4.119151469153588]
We present a unified formalism for structure discovery of causal models and predictive state representation models in reinforcement learning.
Specifically, we model structure discovery in both settings using simplicial objects.
arXiv Detail & Related papers (2022-09-13T19:04:18Z) - Identifying Weight-Variant Latent Causal Models [82.14087963690561]
We find that transitivity plays a key role in impeding the identifiability of latent causal representations.
Under some mild assumptions, we can show that the latent causal representations can be identified up to trivial permutation and scaling.
We propose a novel method, termed Structural caUsAl Variational autoEncoder, which directly learns latent causal representations and causal relationships among them.
arXiv Detail & Related papers (2022-08-30T11:12:59Z) - Modeling Implicit Bias with Fuzzy Cognitive Maps [0.0]
This paper presents a Fuzzy Cognitive Map model to quantify implicit bias in structured datasets.
We introduce a new reasoning mechanism equipped with a normalization-like transfer function that prevents neurons from saturating.
arXiv Detail & Related papers (2021-12-23T17:04:12Z) - Topographic VAEs learn Equivariant Capsules [84.33745072274942]
We introduce the Topographic VAE: a novel method for efficiently training deep generative models with topographically organized latent variables.
We show that such a model indeed learns to organize its activations according to salient characteristics such as digit class, width, and style on MNIST.
We demonstrate approximate equivariance to complex transformations, expanding upon the capabilities of existing group equivariant neural networks.
arXiv Detail & Related papers (2021-09-03T09:25:57Z) - Discovering Latent Causal Variables via Mechanism Sparsity: A New Principle for Nonlinear ICA [81.4991350761909]
Independent component analysis (ICA) refers to an ensemble of methods which formalize the goal of recovering latent variables and provide estimation procedures for practical application.
We show that the latent variables can be recovered up to a permutation if one regularizes the latent mechanisms to be sparse.
arXiv Detail & Related papers (2021-07-21T14:22:14Z) - Linear Disentangled Representations and Unsupervised Action Estimation [2.793095554369282]
We show that linear disentangled representations are not generally present in standard VAE models.
We propose a method to induce irreducible representations which forgoes the need for labelled action sequences.
arXiv Detail & Related papers (2020-08-18T13:23:57Z) - Learning Disentangled Representations with Latent Variation Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z) - Invariant Feature Coding using Tensor Product Representation [75.62232699377877]
We prove that the group-invariant feature vector contains sufficient discriminative information when learning a linear classifier.
A novel feature model that explicitly considers group actions is proposed for principal component analysis and k-means clustering.
arXiv Detail & Related papers (2019-06-05T07:15:17Z)