Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability
- URL: http://arxiv.org/abs/2305.08746v3
- Date: Tue, 6 Jun 2023 16:11:42 GMT
- Title: Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability
- Authors: Ziming Liu, Eric Gan, Max Tegmark
- Abstract summary: Brain-Inspired Modular Training (BIMT) is a method for making neural networks more modular and interpretable.
BIMT embeds neurons in a geometric space and augments the loss function with a cost proportional to the length of each neuron connection.
- Score: 5.15188009671301
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Brain-Inspired Modular Training (BIMT), a method for making
neural networks more modular and interpretable. Inspired by brains, BIMT embeds
neurons in a geometric space and augments the loss function with a cost
proportional to the length of each neuron connection. We demonstrate that BIMT
discovers useful modular neural networks for many simple tasks, revealing
compositional structures in symbolic formulas, interpretable decision
boundaries and features for classification, and mathematical structure in
algorithmic datasets. The ability to directly see modules with the naked eye
can complement current mechanistic interpretability strategies such as probes,
interventions or staring at all weights.
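To make the wiring cost concrete, here is a minimal PyTorch sketch of the idea described in the abstract, assuming fixed 2-D neuron coordinates and an L1-style penalty scaled by connection length. The class name BIMTLinear, the grid layout, and the trade-off weight lam are illustrative choices, not the authors' released implementation, which this sketch does not reproduce in full.

```python
import torch
import torch.nn as nn

class BIMTLinear(nn.Linear):
    """Linear layer with a distance-weighted sparsity ("wiring") cost."""
    def __init__(self, in_features, out_features, in_pos, out_pos):
        super().__init__(in_features, out_features)
        # in_pos: (in_features, 2) and out_pos: (out_features, 2) are fixed
        # 2-D coordinates assigned to the input and output neurons.
        self.register_buffer("dist", torch.cdist(out_pos, in_pos))

    def wiring_cost(self):
        # Cost proportional to |w_ij| times the length of the connection.
        return (self.weight.abs() * self.dist).sum()

# Hypothetical layout: neurons evenly spaced along two horizontal rows.
in_pos = torch.stack([torch.linspace(0.0, 1.0, 4), torch.zeros(4)], dim=1)
out_pos = torch.stack([torch.linspace(0.0, 1.0, 8), torch.ones(8)], dim=1)
layer = BIMTLinear(4, 8, in_pos, out_pos)

x, target = torch.randn(32, 4), torch.randn(32, 8)
lam = 1e-3  # trade-off between task loss and wiring cost (hypothetical value)
loss = nn.functional.mse_loss(layer(x), target) + lam * layer.wiring_cost()
loss.backward()
```

Minimizing this combined loss drives long connections toward zero weight, so the surviving connectivity tends to be spatially local, which is what makes the resulting modules visible to the naked eye.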
Related papers
- Brain-like Functional Organization within Large Language Models [58.93629121400745]
The human brain has long inspired the pursuit of artificial intelligence (AI).
Recent neuroimaging studies provide compelling evidence of alignment between the computational representations of artificial neural networks (ANNs) and the neural responses of the human brain to stimuli.
In this study, we bridge this gap by directly coupling sub-groups of artificial neurons with functional brain networks (FBNs).
This framework links the artificial-neuron sub-groups to FBNs, enabling the delineation of brain-like functional organization within large language models (LLMs).
arXiv Detail & Related papers (2024-10-25T13:15:17Z)
- Unsupervised representation learning with Hebbian synaptic and structural plasticity in brain-like feedforward neural networks [0.0]
We introduce and evaluate a brain-like neural network model capable of unsupervised representation learning.
The model was tested on a diverse set of popular machine learning benchmarks.
arXiv Detail & Related papers (2024-06-07T08:32:30Z)
- Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model.
We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE).
This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z)
- Randomly Weighted Neuromodulation in Neural Networks Facilitates Learning of Manifolds Common Across Tasks [1.9580473532948401]
Geometric Sensitive Hashing functions are neural network models that learn class-specific manifold geometry in supervised learning.
We show that a randomly weighted neural network with a neuromodulation system can realize this function.
arXiv Detail & Related papers (2023-11-17T15:22:59Z)
- OC-NMN: Object-centric Compositional Neural Module Network for Generative Visual Analogical Reasoning [49.12350554270196]
We show how modularity can be leveraged to derive a compositional data augmentation framework inspired by imagination.
Our method, denoted Object-centric Compositional Neural Module Network (OC-NMN), decomposes visual generative reasoning tasks into a series of primitives applied to objects without using a domain-specific language.
arXiv Detail & Related papers (2023-10-28T20:12:58Z)
- Growing Brains: Co-emergence of Anatomical and Functional Modularity in Recurrent Neural Networks [18.375521792153112]
Recurrent neural networks (RNNs) trained on compositional tasks can exhibit functional modularity.
We apply a recent machine learning method, brain-inspired modular training, to a network being trained to solve a set of compositional cognitive tasks.
We find that functional and anatomical clustering emerge together, such that functionally similar neurons also become spatially localized and interconnected.
arXiv Detail & Related papers (2023-10-11T17:58:25Z)
- Emergent Modularity in Pre-trained Transformers [127.08792763817496]
We consider two main characteristics of modularity: functional specialization of neurons and function-based neuron grouping.
We study how modularity emerges during pre-training and find that the modular structure stabilizes at an early stage.
This suggests that Transformers first construct the modular structure and then learn fine-grained neuron functions.
arXiv Detail & Related papers (2023-05-28T11:02:32Z)
- Transformer-Based Hierarchical Clustering for Brain Network Analysis [13.239896897835191]
We propose a novel interpretable transformer-based model for joint hierarchical cluster identification and brain network classification.
With the help of hierarchical clustering, the model achieves increased accuracy and reduced runtime complexity while providing plausible insight into the functional organization of brain regions.
arXiv Detail & Related papers (2023-05-06T22:14:13Z)
- Functional2Structural: Cross-Modality Brain Networks Representation Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases.
We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder.
We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end (see the sketch after this list).
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and transfer to new tasks in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
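The learned routing mentioned in the Neural Interpreters entry above can be illustrated with a hedged sketch, assuming a soft, differentiable mixture over a small set of function modules; SoftModuleRouter and all of its details are hypothetical illustrations, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SoftModuleRouter(nn.Module):
    """Route each input through a learned soft mixture of function modules."""
    def __init__(self, dim: int, num_modules: int):
        super().__init__()
        self.router = nn.Linear(dim, num_modules)  # scores each module
        self.modules_list = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            for _ in range(num_modules)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = self.router(x).softmax(dim=-1)               # (batch, num_modules)
        outs = torch.stack([m(x) for m in self.modules_list], dim=1)
        return (weights.unsqueeze(-1) * outs).sum(dim=1)

# Because the routing weights are differentiable, which modules handle which
# inputs is learned jointly with the task loss.
block = SoftModuleRouter(dim=16, num_modules=4)
y = block(torch.randn(8, 16))  # output shape: (8, 16)
```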
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.