Seeing is Believing: Brain-Inspired Modular Training for Mechanistic
Interpretability
- URL: http://arxiv.org/abs/2305.08746v3
- Date: Tue, 6 Jun 2023 16:11:42 GMT
- Authors: Ziming Liu, Eric Gan, Max Tegmark
- Abstract summary: Brain-Inspired Modular Training is a method for making neural networks more modular and interpretable.
BIMT embeds neurons in a geometric space and augments the loss function with a cost proportional to the length of each neuron connection.
- Score: 5.15188009671301
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Brain-Inspired Modular Training (BIMT), a method for making
neural networks more modular and interpretable. Inspired by brains, BIMT embeds
neurons in a geometric space and augments the loss function with a cost
proportional to the length of each neuron connection. We demonstrate that BIMT
discovers useful modular neural networks for many simple tasks, revealing
compositional structures in symbolic formulas, interpretable decision
boundaries and features for classification, and mathematical structure in
algorithmic datasets. The ability to directly see modules with the naked eye
can complement current mechanistic interpretability strategies such as probes,
interventions or staring at all weights.
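The connection-length cost described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract only states that the added cost is proportional to the length of each connection. Weighting each length by the absolute weight, using Euclidean distance, the evenly spaced 2D layout, and the names `neuron_positions`, `connection_cost`, `regularized_loss` and `lam` are all assumptions made for illustration.

```python
import numpy as np


def neuron_positions(n, layer_index):
    """Hypothetical layout: n neurons evenly spaced on [0, 1], at height layer_index."""
    x = np.linspace(0.0, 1.0, n)
    y = np.full(n, float(layer_index))
    return np.stack([x, y], axis=1)  # shape (n, 2)


def connection_cost(weight, pos_in, pos_out):
    """Sum over connections of |weight| times Euclidean connection length.

    weight:  (n_out, n_in) weight matrix between two layers
    pos_in:  (n_in, 2) coordinates of the input-side neurons
    pos_out: (n_out, 2) coordinates of the output-side neurons
    """
    diff = pos_out[:, None, :] - pos_in[None, :, :]  # (n_out, n_in, 2)
    length = np.linalg.norm(diff, axis=-1)           # (n_out, n_in)
    return float(np.sum(np.abs(weight) * length))


def regularized_loss(task_loss, weights, positions, lam=1e-3):
    """Task loss plus lam times the total connection cost over all layers.

    weights[i] connects the neurons at positions[i] to those at positions[i + 1].
    """
    penalty = sum(
        connection_cost(w, positions[i], positions[i + 1])
        for i, w in enumerate(weights)
    )
    return task_loss + lam * penalty


# Two layers of two neurons each, stacked one unit apart.
pos = [neuron_positions(2, 0), neuron_positions(2, 1)]
local = np.eye(2)                # each neuron connects straight up
crossed = np.eye(2)[::-1]        # each neuron connects diagonally
cost_local = connection_cost(local, pos[0], pos[1])
cost_crossed = connection_cost(crossed, pos[0], pos[1])
```

With equal weight magnitudes, the crossed wiring has a higher cost than the local wiring, so under these assumptions a penalty of this kind pushes training toward short, local connections.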
Related papers
- Discovering Chunks in Neural Embeddings for Interpretability [53.80157905839065]
We propose leveraging the principle of chunking to interpret artificial neural population activities.
We first demonstrate this concept in recurrent neural networks (RNNs) trained on artificial sequences with imposed regularities.
We identify similar recurring embedding states corresponding to concepts in the input, with perturbations to these states activating or inhibiting the associated concepts.
arXiv Detail & Related papers (2025-02-03T20:30:46Z)
- BrainMAP: Learning Multiple Activation Pathways in Brain Networks [77.15180533984947]
We introduce a novel framework BrainMAP to learn Multiple Activation Pathways in Brain networks.
Our framework enables explanatory analyses of crucial brain regions involved in tasks.
arXiv Detail & Related papers (2024-12-23T09:13:35Z)
- Unsupervised representation learning with Hebbian synaptic and structural plasticity in brain-like feedforward neural networks [0.0]
We introduce and evaluate a brain-like neural network model capable of unsupervised representation learning.
The model was tested on a diverse set of popular machine learning benchmarks.
arXiv Detail & Related papers (2024-06-07T08:32:30Z)
- Randomly Weighted Neuromodulation in Neural Networks Facilitates Learning of Manifolds Common Across Tasks [1.9580473532948401]
Geometric Sensitive Hashing functions are neural network models that learn class-specific manifold geometry in supervised learning.
We show that a randomly weighted neural network with a neuromodulation system can realize this function.
arXiv Detail & Related papers (2023-11-17T15:22:59Z)
- OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning [49.12350554270196]
We show how modularity can be leveraged to derive a compositional data augmentation framework inspired by imagination.
Our method, denoted Object-centric Compositional Neural Module Network (OC-NMN), decomposes visual generative reasoning tasks into a series of primitives applied to objects without using a domain-specific language.
arXiv Detail & Related papers (2023-10-28T20:12:58Z)
- Growing Brains: Co-emergence of Anatomical and Functional Modularity in Recurrent Neural Networks [18.375521792153112]
Recurrent neural networks (RNNs) trained on compositional tasks can exhibit functional modularity.
We apply a recent machine learning method, brain-inspired modular training, to a network being trained to solve a set of compositional cognitive tasks.
We find that functional and anatomical clustering emerge together, such that functionally similar neurons also become spatially localized and interconnected.
arXiv Detail & Related papers (2023-10-11T17:58:25Z)
- Emergent Modularity in Pre-trained Transformers [127.08792763817496]
We consider two main characteristics of modularity: functional specialization of neurons and function-based neuron grouping.
We study how modularity emerges during pre-training, and find that the modular structure is stabilized at the early stage.
It suggests that Transformers first construct the modular structure and then learn fine-grained neuron functions.
arXiv Detail & Related papers (2023-05-28T11:02:32Z)
- Transformer-Based Hierarchical Clustering for Brain Network Analysis [13.239896897835191]
We propose a novel interpretable transformer-based model for joint hierarchical cluster identification and brain network classification.
With the help of hierarchical clustering, the model achieves increased accuracy and reduced runtime complexity while providing plausible insight into the functional organization of brain regions.
arXiv Detail & Related papers (2023-05-06T22:14:13Z)
- Functional2Structural: Cross-Modality Brain Networks Representation Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases.
We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder.
We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferrable to a new task in a sample efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.