Representation biases: will we achieve complete understanding by analyzing representations?
- URL: http://arxiv.org/abs/2507.22216v1
- Date: Tue, 29 Jul 2025 20:25:09 GMT
- Title: Representation biases: will we achieve complete understanding by analyzing representations?
- Authors: Andrew Kyle Lampinen, Stephanie C. Y. Chan, Yuxuan Li, Katherine Hermann
- Abstract summary: Recent work in machine learning shows that learned feature representations may be biased to over-represent certain features. These biases could pose challenges for achieving full understanding of a system through representational analysis.
- Score: 8.158699242992691
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A common approach in neuroscience is to study neural representations as a means to understand a system -- increasingly, by relating the neural representations to the internal representations learned by computational models. However, recent work in machine learning (Lampinen, 2024) shows that learned feature representations may be biased to over-represent certain features, and to represent others more weakly and less consistently. For example, simple (linear) features may be more strongly and more consistently represented than complex (highly nonlinear) features. These biases could pose challenges for achieving full understanding of a system through representational analysis. In this perspective, we illustrate these challenges -- showing how feature representation biases can lead to strongly biased inferences from common analyses like PCA, regression, and RSA. We also present homomorphic encryption as a simple case study of the potential for strong dissociation between patterns of representation and computation. We discuss the implications of these results for representational comparisons between systems, and for neuroscience more generally.
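To make the PCA/regression/RSA point concrete, the following is a minimal synthetic sketch (not from the paper; the feature gains, dimensionality, and noise level are arbitrary assumptions): a representation that encodes a simple binary feature strongly and an XOR feature weakly yields a top principal component and an RSA profile dominated by the simple feature, even though both features are present.

```python
# Minimal sketch, not from the paper: a synthetic "representation" in which a
# simple binary feature is encoded with high variance and a complex (XOR) feature
# with low variance. PCA and an RSA-style comparison then mostly reflect the
# dominant feature. The gains 3.0 / 0.3 and the dimensions are arbitrary assumptions.
import numpy as np
from numpy.linalg import svd

rng = np.random.default_rng(0)
n, d = 256, 50
a = rng.integers(0, 2, n)            # simple binary feature
b = rng.integers(0, 2, n)
xor = a ^ b                          # complex (nonlinear) feature

W_a, W_x = rng.normal(size=d), rng.normal(size=d)
reps = 3.0 * np.outer(a - 0.5, W_a) + 0.3 * np.outer(xor - 0.5, W_x) \
       + 0.1 * rng.normal(size=(n, d))

# PCA: the explained variance is dominated by the strongly represented feature.
U, S, Vt = svd(reps - reps.mean(0), full_matrices=False)
print("top-PC variance share:", (S[0] ** 2 / (S ** 2).sum()).round(3))

# RSA-style check: correlate the representational dissimilarity structure with
# model RDMs built from each feature alone.
def rdm(x):
    x = np.atleast_2d(x).T if x.ndim == 1 else x
    return np.linalg.norm(x[:, None] - x[None, :], axis=-1)

iu = np.triu_indices(n, 1)
for name, feat in [("a", a.astype(float)), ("xor", xor.astype(float))]:
    r = np.corrcoef(rdm(reps)[iu], rdm(feat)[iu])[0, 1]
    print(f"RSA correlation with feature {name}: {r:.2f}")
```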
Related papers
- Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis [14.275283048655268]
We compare neural networks evolved through an open-ended search process to networks trained via conventional gradient descent. While both networks produce the same output behavior, their internal representations differ dramatically. In large models, fractured entangled representation (FER) may be degrading core model capacities like generalization, creativity, and (continual) learning.
arXiv Detail & Related papers (2025-05-16T16:28:34Z) - Neuro-Symbolic AI: Explainability, Challenges, and Future Trends [26.656105779121308]
This article proposes a classification of explainability that considers both the model design and the behavior of 191 studies published since 2013.
We classify them into five categories by considering whether the form of bridging the representation differences is readable.
We put forward suggestions for future research in three aspects: unified representations, enhancing model explainability, and ethical considerations and social impact.
arXiv Detail & Related papers (2024-11-07T02:54:35Z) - Local vs distributed representations: What is the right basis for interpretability? [19.50614357801837]
We show that features obtained from sparse distributed representations are easier to interpret by human observers.
Our results highlight that distributed representations constitute a superior basis for interpretability.
arXiv Detail & Related papers (2024-11-06T15:34:57Z) - Learned feature representations are biased by complexity, learning order, position, and more [4.529707672004383]
We explore surprising dissociations between representation and computation.
We train various deep learning architectures to compute multiple abstract features about their inputs.
We find that their learned feature representations are systematically biased towards representing some features more strongly than others.
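As a concrete illustration of this kind of finding, here is a minimal sketch (the architecture, features, and variance-share measure are assumptions, not the authors' exact setup): one MLP is trained to output both an easy feature and a hard parity feature, and the hidden layer is then asked how much of its variance each feature accounts for.

```python
# Minimal sketch, not the authors' exact setup: train one MLP to output both an
# easy feature (a single input bit) and a hard feature (3-bit parity), then
# measure the fraction of hidden-layer variance each feature accounts for.
# The easy feature typically claims a much larger share.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randint(0, 2, (4096, 8)).float()
easy = X[:, 0]
hard = (X[:, 1].long() ^ X[:, 2].long() ^ X[:, 3].long()).float()
Y = torch.stack([easy, hard], dim=1)

body = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
head = nn.Linear(64, 2)
opt = torch.optim.Adam([*body.parameters(), *head.parameters()], lr=1e-2)
for _ in range(3000):
    opt.zero_grad()
    loss = nn.functional.binary_cross_entropy_with_logits(head(body(X)), Y)
    loss.backward()
    opt.step()

H = body(X).detach()

def variance_share(H, f):
    # Summed between-group variance over summed total variance: how much of the
    # hidden representation's variance this binary feature accounts for.
    f = f.bool()
    mu, mu0, mu1 = H.mean(0), H[~f].mean(0), H[f].mean(0)
    between = (~f).float().mean() * (mu0 - mu) ** 2 + f.float().mean() * (mu1 - mu) ** 2
    return (between.sum() / H.var(0).sum()).item()

print("easy feature variance share:", variance_share(H, easy))
print("hard (parity) feature variance share:", variance_share(H, hard))
```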
arXiv Detail & Related papers (2024-05-09T15:34:15Z) - Specify Robust Causal Representation from Mixed Observations [35.387451486213344]
Learning representations purely from observations concerns the problem of learning a low-dimensional, compact representation that benefits prediction models.
We develop a learning method to learn such representation from observational data by regularizing the learning procedure with mutual information measures.
We theoretically and empirically show that the models trained with the learned causal representations are more robust under adversarial attacks and distribution shifts.
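The summary does not specify the regularizer, so the following is only an illustrative sketch of the general recipe "prediction loss plus a mutual-information penalty"; the Gaussian approximation of mutual information and the spurious variable s are assumptions for illustration.

```python
# Illustrative sketch only -- not the paper's objective. An encoder is trained for
# prediction while a Gaussian-approximate mutual-information penalty discourages
# the representation from carrying information about a spurious variable s.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(1024, 10)
s = X[:, 0:1] + 0.1 * torch.randn(1024, 1)    # hypothetical spurious variable
y = X[:, 1:2] + 0.5 * s                       # label partly driven by s

enc = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 4))
head = nn.Linear(4, 1)
opt = torch.optim.Adam([*enc.parameters(), *head.parameters()], lr=1e-3)

def gaussian_mi_penalty(z, s):
    # Under a joint-Gaussian approximation, I(z_k; s) = -0.5 * log(1 - rho_k^2).
    zc = (z - z.mean(0)) / (z.std(0) + 1e-6)
    sc = (s - s.mean(0)) / (s.std(0) + 1e-6)
    rho = (zc * sc).mean(0)
    return (-0.5 * torch.log(1 - rho.clamp(-0.99, 0.99) ** 2)).sum()

for _ in range(2000):
    opt.zero_grad()
    z = enc(X)
    loss = nn.functional.mse_loss(head(z), y) + 0.1 * gaussian_mi_penalty(z, s)
    loss.backward()
    opt.step()
```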
arXiv Detail & Related papers (2023-10-21T02:18:35Z) - On the Complexity of Representation Learning in Contextual Linear Bandits [110.84649234726442]
We show that representation learning is fundamentally more complex than linear bandits.
In particular, learning with a given set of representations is never simpler than learning with the worst realizable representation in the set.
arXiv Detail & Related papers (2022-12-19T13:08:58Z) - On Neural Architecture Inductive Biases for Relational Tasks [76.18938462270503]
We introduce a simple architecture based on similarity-distribution scores, which we name Compositional Relational Network (CoRelNet).
We find that simple architectural choices can outperform existing models in out-of-distribution generalization.
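A rough sketch of the similarity-score idea (not the authors' exact implementation; layer sizes and the decoder are placeholders): relational decisions are made from a normalized matrix of pairwise similarities between object embeddings rather than from the embeddings themselves.

```python
# Rough sketch of a similarity-score architecture in the spirit of CoRelNet;
# layer sizes and the decoder are placeholders, not the published model.
import torch
import torch.nn as nn

class SimilarityScoreNet(nn.Module):
    def __init__(self, in_dim, n_objects, n_classes, embed_dim=64):
        super().__init__()
        self.embed = nn.Linear(in_dim, embed_dim)
        self.decoder = nn.Sequential(
            nn.Linear(n_objects * n_objects, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, objects):                    # objects: (batch, n_objects, in_dim)
        z = self.embed(objects)                    # (batch, n_objects, embed_dim)
        sims = torch.einsum("bid,bjd->bij", z, z)  # pairwise dot-product similarities
        sims = sims.softmax(dim=-1)                # similarity distribution per object
        return self.decoder(sims.flatten(1))       # classify from relations only

model = SimilarityScoreNet(in_dim=16, n_objects=5, n_classes=2)
logits = model(torch.randn(8, 5, 16))              # e.g., a same/different decision
```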
arXiv Detail & Related papers (2022-06-09T16:24:01Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end to end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
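A toy sketch of the routing idea (not the authors' architecture; the module bank and router here are placeholders): a learned router weights a bank of small function modules, and each input is processed by that weighted mixture, trained end to end.

```python
# Toy soft-routing sketch, not the published Neural Interpreters architecture:
# a router scores a bank of function modules per input, and the output is the
# router-weighted mixture of the modules' outputs.
import torch
import torch.nn as nn

class RoutedFunctions(nn.Module):
    def __init__(self, dim=32, n_functions=4):
        super().__init__()
        self.functions = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(n_functions)
        )
        self.router = nn.Linear(dim, n_functions)

    def forward(self, x):                                  # x: (batch, dim)
        weights = self.router(x).softmax(dim=-1)           # (batch, n_functions)
        outputs = torch.stack([f(x) for f in self.functions], dim=1)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)

layer = RoutedFunctions()
y = layer(torch.randn(8, 32))    # routing weights are learned with everything else
```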
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Desiderata for Representation Learning: A Causal Perspective [104.3711759578494]
We take a causal perspective on representation learning, formalizing non-spuriousness and efficiency (in supervised representation learning) and disentanglement (in unsupervised representation learning).
This yields computable metrics that can be used to assess the degree to which representations satisfy the desiderata of interest and learn non-spurious and disentangled representations from single observational datasets.
arXiv Detail & Related papers (2021-09-08T17:33:54Z) - Compositional Explanations of Neurons [52.71742655312625]
We describe a procedure for explaining neurons in deep representations by identifying compositional logical concepts.
We use this procedure to answer several questions on interpretability in models for vision and natural language processing.
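A schematic sketch of this kind of procedure (details assumed, not the paper's exact algorithm): binarize a neuron's activations over a dataset, then beam-search for a short logical formula over concept masks that maximizes intersection-over-union (IoU) with the neuron's mask.

```python
# Schematic sketch with assumed details: search for a compositional logical
# formula over concept masks that best matches a binarized neuron, scored by IoU.
import numpy as np
from itertools import product

def iou(a, b):
    return (a & b).sum() / max((a | b).sum(), 1)

def explain_neuron(neuron_mask, concepts, max_terms=3, beam=5):
    # concepts: dict of name -> boolean mask over the same inputs as neuron_mask
    primitives = {}
    for name, m in concepts.items():
        primitives[name] = m
        primitives[f"NOT {name}"] = ~m
    frontier = sorted(primitives.items(), key=lambda kv: -iou(neuron_mask, kv[1]))[:beam]
    best = frontier[0]
    for _ in range(max_terms - 1):                      # grow formulas with AND / OR
        candidates = []
        for (f, fm), (name, m) in product(frontier, primitives.items()):
            candidates.append((f"({f}) AND {name}", fm & m))
            candidates.append((f"({f}) OR {name}", fm | m))
        frontier = sorted(candidates, key=lambda kv: -iou(neuron_mask, kv[1]))[:beam]
        if iou(neuron_mask, frontier[0][1]) > iou(neuron_mask, best[1]):
            best = frontier[0]
    return best[0], iou(neuron_mask, best[1])

# Toy usage with made-up concept masks; the "neuron" fires for water AND NOT boat.
rng = np.random.default_rng(0)
concepts = {c: rng.random(1000) < 0.3 for c in ["water", "sky", "boat"]}
neuron = concepts["water"] & ~concepts["boat"]
print(explain_neuron(neuron, concepts))
```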
arXiv Detail & Related papers (2020-06-24T20:37:05Z) - Weakly-Supervised Disentanglement Without Compromises [53.55580957483103]
Intelligent agents should be able to learn useful representations by observing changes in their environment.
We model such observations as pairs of non-i.i.d. images sharing at least one of the underlying factors of variation.
We show that only knowing how many factors have changed, but not which ones, is sufficient to learn disentangled representations.
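A sketch of one step this family of methods relies on (details assumed, not the paper's exact algorithm): given the encoder's posteriors for the two images of a pair and only the number k of changed factors, treat the k latent dimensions whose posteriors diverge most as changed and average the remaining dimensions across the pair before decoding.

```python
# Sketch with assumed details, not the paper's exact algorithm: infer the changed
# latent dimensions of a pair as the k with the largest symmetric KL between the
# two Gaussian posteriors, and force the remaining dimensions to be shared.
import torch

def aggregate_shared_latents(mu1, logvar1, mu2, logvar2, k):
    var1, var2 = logvar1.exp(), logvar2.exp()
    # Per-dimension symmetric KL between N(mu1, var1) and N(mu2, var2).
    kl = 0.5 * (var1 / var2 + var2 / var1 - 2 + (mu1 - mu2) ** 2 * (1 / var1 + 1 / var2))
    changed = kl.topk(k, dim=-1).indices                  # assumed-changed dimensions
    shared_mu = 0.5 * (mu1 + mu2)
    shared_logvar = torch.log(0.5 * (var1 + var2))
    new_mu1, new_mu2 = shared_mu.clone(), shared_mu.clone()
    new_lv1, new_lv2 = shared_logvar.clone(), shared_logvar.clone()
    # Keep each image's own posterior only on the dimensions judged to have changed.
    new_mu1.scatter_(-1, changed, mu1.gather(-1, changed))
    new_mu2.scatter_(-1, changed, mu2.gather(-1, changed))
    new_lv1.scatter_(-1, changed, logvar1.gather(-1, changed))
    new_lv2.scatter_(-1, changed, logvar2.gather(-1, changed))
    return (new_mu1, new_lv1), (new_mu2, new_lv2)

# Toy usage: a batch of 4 pairs, 10 latent dimensions, 2 factors changed per pair.
mu1, mu2 = torch.randn(4, 10), torch.randn(4, 10)
lv1, lv2 = torch.zeros(4, 10), torch.zeros(4, 10)
post1, post2 = aggregate_shared_latents(mu1, lv1, mu2, lv2, k=2)
```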
arXiv Detail & Related papers (2020-02-07T16:39:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.