Polysemanticity and Capacity in Neural Networks
- URL: http://arxiv.org/abs/2210.01892v3
- Date: Wed, 12 Jul 2023 01:02:19 GMT
- Title: Polysemanticity and Capacity in Neural Networks
- Authors: Adam Scherlis, Kshitij Sachan, Adam S. Jermyn, Joe Benton, Buck Shlegeris
- Abstract summary: Individual neurons in neural networks often represent a mixture of unrelated features.
This phenomenon, called polysemanticity, can make interpreting neural networks more difficult.
- Score: 1.4174475093445233
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Individual neurons in neural networks often represent a mixture of unrelated
features. This phenomenon, called polysemanticity, can make interpreting neural
networks more difficult and so we aim to understand its causes. We propose
doing so through the lens of feature capacity, which is the fractional
dimension each feature consumes in the embedding space. We show that in a toy
model the optimal capacity allocation tends to monosemantically represent the
most important features, polysemantically represent less important features (in
proportion to their impact on the loss), and entirely ignore the least
important features. Polysemanticity is more prevalent when the inputs have
higher kurtosis or sparsity and more prevalent in some architectures than
others. Given an optimal allocation of capacity, we go on to study the geometry
of the embedding space. We find a block-semi-orthogonal structure, with
differing block sizes in different models, highlighting the impact of model
architecture on the interpretability of its neurons.
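To make the capacity notion concrete, below is a minimal sketch of computing per-feature capacity from a feature-embedding matrix. The formula used (the fourth power of each embedding's norm divided by the sum of its squared dot products with all feature embeddings) is an assumption based on the "fractional dimension" description above and on closely related toy-model work; the paper's exact definition may differ.

```python
import numpy as np

def feature_capacities(W: np.ndarray) -> np.ndarray:
    """Per-feature capacity for a feature-embedding matrix W of shape (n_features, d_embed).

    Capacity of feature i is taken here to be
        C_i = (w_i . w_i)^2 / sum_j (w_i . w_j)^2,
    which equals 1 when w_i is orthogonal to every other embedding (monosemantic)
    and shrinks toward 0 as w_i shares its direction with more features (polysemantic).
    This formula is an assumed formalization, not necessarily the paper's exact definition.
    """
    G = W @ W.T                    # Gram matrix of feature embeddings
    num = np.diag(G) ** 2          # (w_i . w_i)^2
    den = (G ** 2).sum(axis=1)     # sum_j (w_i . w_j)^2
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

# Toy example: 3 features embedded in a 2-dimensional space.
W = np.array([[1.0, 0.0],   # feature 0 gets its own direction
              [0.0, 1.0],   # features 1 and 2 share the second direction
              [0.0, 0.5]])
C = feature_capacities(W)
print(C)        # [1.0, 0.8, 0.2]
print(C.sum())  # 2.0 -- in this example the two embedding dimensions are fully used
```

Under this reading, capacity 1 corresponds to a monosemantic feature with a dedicated direction, fractional capacity to features that share directions (polysemanticity), and capacity 0 to an ignored feature, matching the three regimes described in the abstract.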
Related papers
- Semantic Loss Functions for Neuro-Symbolic Structured Prediction [74.18322585177832]
We discuss the semantic loss, which injects symbolically defined knowledge about the structure of the output space into training.
It is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby.
It can be combined with both discriminative and generative neural models.
arXiv Detail & Related papers (2024-05-12T22:18:25Z) - Towards Explaining Hypercomplex Neural Networks [6.543091030789653]
Hypercomplex neural networks are gaining increasing interest in the deep learning community.
In this paper, we propose inherently interpretable PHNNs and quaternion-like networks.
We draw insights into how this unique branch of neural models operates.
arXiv Detail & Related papers (2024-03-26T17:58:07Z) - Asymptotics of Learning with Deep Structured (Random) Features [9.366617422860543]
For a large class of feature maps we provide a tight characterisation of the test error associated with learning the readout layer.
In some cases our results can capture feature maps learned by deep, finite-width neural networks trained under gradient descent.
arXiv Detail & Related papers (2024-02-21T18:35:27Z) - What Causes Polysemanticity? An Alternative Origin Story of Mixed
Selectivity from Incidental Causes [14.623741848860037]
Polysemantic neurons -- neurons that activate for a set of unrelated features -- have been seen as a significant obstacle towards interpretability of task-optimized deep networks.
We show that polysemanticity can arise incidentally, even when there are ample neurons to represent all features in the data.
arXiv Detail & Related papers (2023-12-05T19:29:54Z) - Heterogeneous Feature Representation for Digital Twin-Oriented Complex
Networked Systems [13.28255056212425]
Building models of Complex Networked Systems that can accurately represent reality forms an important research area.
This study aims to improve the expressive power of node features in Digital Twin-Oriented Complex Networked Systems.
arXiv Detail & Related papers (2023-09-23T01:40:56Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - The Causal Neural Connection: Expressiveness, Learnability, and
Inference [125.57815987218756]
An object called a structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
arXiv Detail & Related papers (2021-07-02T01:55:18Z) - It's FLAN time! Summing feature-wise latent representations for
interpretability [0.0]
We propose a novel class of structurally constrained neural networks, which we call FLANs (Feature-wise Latent Additive Networks).
FLANs process each input feature separately, computing for each of them a representation in a common latent space.
These feature-wise latent representations are then simply summed, and the aggregated representation is used for prediction (see the sketch after this list).
arXiv Detail & Related papers (2021-06-18T12:19:33Z) - Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and owns adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z) - Hyperbolic Neural Networks++ [66.16106727715061]
We generalize the fundamental components of neural networks in a single hyperbolic geometry model, namely, the Poincaré ball model.
Experiments show that our methods are more parameter-efficient than conventional hyperbolic components, and are more stable than and outperform their Euclidean counterparts.
arXiv Detail & Related papers (2020-06-15T08:23:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.