Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability
- URL: http://arxiv.org/abs/2305.08746v3
- Date: Tue, 6 Jun 2023 16:11:42 GMT
- Title: Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability
- Authors: Ziming Liu, Eric Gan, Max Tegmark
- Abstract summary: Brain-Inspired Modular Training (BIMT) is a method for making neural networks more modular and interpretable.
BIMT embeds neurons in a geometric space and augments the loss function with a cost proportional to the length of each neuron connection.
- Score: 5.15188009671301
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Brain-Inspired Modular Training (BIMT), a method for making
neural networks more modular and interpretable. Inspired by brains, BIMT embeds
neurons in a geometric space and augments the loss function with a cost
proportional to the length of each neuron connection. We demonstrate that BIMT
discovers useful modular neural networks for many simple tasks, revealing
compositional structures in symbolic formulas, interpretable decision
boundaries and features for classification, and mathematical structure in
algorithmic datasets. The ability to directly see modules with the naked eye
can complement current mechanistic interpretability strategies such as probes,
interventions or staring at all weights.
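To make the wiring cost concrete, here is a minimal PyTorch sketch of the idea described in the abstract, assuming fixed 2-D neuron coordinates and an L1-style penalty scaled by connection length. The class name BIMTLinear, the grid layout, and the trade-off weight lam are illustrative choices, not the authors' released implementation, which this sketch does not reproduce in full.

```python
import torch
import torch.nn as nn

class BIMTLinear(nn.Linear):
    """Linear layer with a distance-weighted sparsity ("wiring") cost."""
    def __init__(self, in_features, out_features, in_pos, out_pos):
        super().__init__(in_features, out_features)
        # in_pos: (in_features, 2) and out_pos: (out_features, 2) are fixed
        # 2-D coordinates assigned to the input and output neurons.
        self.register_buffer("dist", torch.cdist(out_pos, in_pos))

    def wiring_cost(self):
        # Cost proportional to |w_ij| times the length of the connection.
        return (self.weight.abs() * self.dist).sum()

# Hypothetical layout: neurons evenly spaced along two horizontal rows.
in_pos = torch.stack([torch.linspace(0.0, 1.0, 4), torch.zeros(4)], dim=1)
out_pos = torch.stack([torch.linspace(0.0, 1.0, 8), torch.ones(8)], dim=1)
layer = BIMTLinear(4, 8, in_pos, out_pos)

x, target = torch.randn(32, 4), torch.randn(32, 8)
lam = 1e-3  # trade-off between task loss and wiring cost (hypothetical value)
loss = nn.functional.mse_loss(layer(x), target) + lam * layer.wiring_cost()
loss.backward()
```

Minimizing this combined loss drives long connections toward zero weight, so the surviving connectivity tends to be spatially local, which is what makes the resulting modules visible to the naked eye.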
Related papers
- Brain-like Functional Organization within Large Language Models [58.93629121400745]
The human brain has long inspired the pursuit of artificial intelligence (AI).
Recent neuroimaging studies provide compelling evidence of alignment between the computational representations of artificial neural networks (ANNs) and the neural responses of the human brain to stimuli.
In this study, we bridge this gap by directly coupling sub-groups of artificial neurons with functional brain networks (FBNs).
This framework links the artificial-neuron sub-groups to FBNs, enabling the delineation of brain-like functional organization within large language models (LLMs).
arXiv Detail & Related papers (2024-10-25T13:15:17Z)
- Unsupervised representation learning with Hebbian synaptic and structural plasticity in brain-like feedforward neural networks [0.0]
We introduce and evaluate a brain-like neural network model capable of unsupervised representation learning.
The model was tested on a diverse set of popular machine learning benchmarks.
arXiv Detail & Related papers (2024-06-07T08:32:30Z)
- Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model.
We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE).
This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z)
- Randomly Weighted Neuromodulation in Neural Networks Facilitates Learning of Manifolds Common Across Tasks [1.9580473532948401]
Geometric Sensitive Hashing functions are neural network models that learn class-specific manifold geometry in supervised learning.
We show that a randomly weighted neural network with a neuromodulation system can realize this function.
arXiv Detail & Related papers (2023-11-17T15:22:59Z)
- OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning [49.12350554270196]
We show how modularity can be leveraged to derive a compositional data augmentation framework inspired by imagination.
Our method, denoted Object-centric Compositional Neural Module Network (OC-NMN), decomposes visual generative reasoning tasks into a series of primitives applied to objects without using a domain-specific language.
arXiv Detail & Related papers (2023-10-28T20:12:58Z)
- Growing Brains: Co-emergence of Anatomical and Functional Modularity in Recurrent Neural Networks [18.375521792153112]
Recurrent neural networks (RNNs) trained on compositional tasks can exhibit functional modularity.
We apply a recent machine learning method, brain-inspired modular training, to a network being trained to solve a set of compositional cognitive tasks.
We find that functional and anatomical clustering emerge together, such that functionally similar neurons also become spatially localized and interconnected.
arXiv Detail & Related papers (2023-10-11T17:58:25Z)
- Emergent Modularity in Pre-trained Transformers [127.08792763817496]
We consider two main characteristics of modularity: functional specialization of neurons and function-based neuron grouping.
We study how modularity emerges during pre-training and find that the modular structure stabilizes at an early stage.
This suggests that Transformers first construct the modular structure and then learn fine-grained neuron functions.
arXiv Detail & Related papers (2023-05-28T11:02:32Z)
- Transformer-Based Hierarchical Clustering for Brain Network Analysis [13.239896897835191]
We propose a novel interpretable transformer-based model for joint hierarchical cluster identification and brain network classification.
With the help of hierarchical clustering, the model achieves increased accuracy and reduced runtime complexity while providing plausible insight into the functional organization of brain regions.
arXiv Detail & Related papers (2023-05-06T22:14:13Z)
- Functional2Structural: Cross-Modality Brain Networks Representation Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases.
We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder.
We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end (see the sketch after this list).
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and transfer to new tasks in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
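The learned routing mentioned in the Neural Interpreters entry above can be illustrated with a hedged sketch, assuming a soft, differentiable mixture over a small set of function modules; SoftModuleRouter and all of its details are hypothetical illustrations, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SoftModuleRouter(nn.Module):
    """Route each input through a learned soft mixture of function modules."""
    def __init__(self, dim: int, num_modules: int):
        super().__init__()
        self.router = nn.Linear(dim, num_modules)  # scores each module
        self.modules_list = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            for _ in range(num_modules)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = self.router(x).softmax(dim=-1)               # (batch, num_modules)
        outs = torch.stack([m(x) for m in self.modules_list], dim=1)
        return (weights.unsqueeze(-1) * outs).sum(dim=1)

# Because the routing weights are differentiable, which modules handle which
# inputs is learned jointly with the task loss.
block = SoftModuleRouter(dim=16, num_modules=4)
y = block(torch.randn(8, 16))  # output shape: (8, 16)
```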
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.