Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization
- URL: http://arxiv.org/abs/2506.13331v1
- Date: Mon, 16 Jun 2025 10:21:54 GMT
- Title: Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization
- Authors: Badr AlKhamissi, C. Nicolò De Sabbata, Zeming Chen, Martin Schrimpf, Antoine Bosselut
- Abstract summary: We introduce the Mixture of Cognitive Reasoners (MiCRo) architecture and training paradigm. We partition a pretrained transformer model into four expert modules, each corresponding to a well-studied cognitive brain network. Our findings suggest that biologically inspired inductive biases involved in human cognition lead to significant modeling gains in interpretability, performance, and controllability.
- Score: 20.89486683564097
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Human intelligence emerges from the interaction of specialized brain networks, each dedicated to distinct cognitive functions such as language processing, logical reasoning, social understanding, and memory retrieval. Inspired by this biological observation, we introduce the Mixture of Cognitive Reasoners (MiCRo) architecture and training paradigm: a modular transformer-based language model with a training curriculum that encourages the emergence of functional specialization among different modules. Inspired by studies in neuroscience, we partition the layers of a pretrained transformer model into four expert modules, each corresponding to a well-studied cognitive brain network. Our Brain-Like model has three key benefits over the state of the art: First, the specialized experts are highly interpretable and functionally critical, where removing a module significantly impairs performance on domain-relevant benchmarks. Second, our model outperforms comparable baselines that lack specialization on seven reasoning benchmarks. And third, the model's behavior can be steered at inference time by selectively emphasizing certain expert modules (e.g., favoring social over logical reasoning), enabling fine-grained control over the style of its response. Our findings suggest that biologically inspired inductive biases involved in human cognition lead to significant modeling gains in interpretability, performance, and controllability.
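The abstract sketches the key mechanics: a pretrained transformer's layers are partitioned into four expert modules, a router mediates between them, and the router can be biased at inference time to steer behavior. Below is a minimal PyTorch sketch of one such layer; the expert names, the feed-forward expert design, and the additive router-bias steering mechanism are illustrative assumptions based only on the abstract, not the paper's actual implementation.

```python
# Minimal sketch of a MiCRo-style layer, based only on the abstract:
# four expert modules (one per cognitive brain network) mixed by a
# learned router, with optional inference-time steering. The expert
# names, feed-forward design, and additive router bias are assumptions.
import torch
import torch.nn as nn

EXPERTS = ["language", "logic", "social", "memory"]

class CognitiveExpertLayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048):
        super().__init__()
        # One feed-forward expert per hypothesized cognitive network.
        self.experts = nn.ModuleDict({
            name: nn.Sequential(
                nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
            ) for name in EXPERTS
        })
        # The router assigns each token a distribution over experts.
        self.router = nn.Linear(d_model, len(EXPERTS))

    def forward(self, x, steer=None):
        logits = self.router(x)                      # (batch, seq, 4)
        if steer is not None:
            # Steering: bias router logits toward chosen experts, e.g.
            # favoring "social" over "logic" (assumed mechanism).
            bias = torch.tensor([steer.get(n, 0.0) for n in EXPERTS],
                                device=x.device)
            logits = logits + bias
        weights = logits.softmax(dim=-1)             # (batch, seq, 4)
        outs = torch.stack([self.experts[n](x) for n in EXPERTS], dim=-1)
        return (outs * weights.unsqueeze(-2)).sum(dim=-1)

layer = CognitiveExpertLayer()
tokens = torch.randn(1, 8, 512)
neutral = layer(tokens)
steered = layer(tokens, steer={"social": 2.0, "logic": -2.0})
```

In the paper's framing, ablating one expert should selectively impair the domain-relevant benchmarks, and a steering dictionary like the one above shifts response style without any retraining.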
Related papers
- Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact [27.722167796617114]
This paper offers a cross-disciplinary synthesis of artificial intelligence, cognitive neuroscience, psychology, generative models, and agent-based systems. We analyze the architectural and cognitive foundations of general intelligence, highlighting the role of modular reasoning, persistent memory, and multi-agent coordination. We identify key scientific, technical, and ethical challenges on the path to Artificial General Intelligence.
arXiv Detail & Related papers (2025-07-01T16:52:25Z)
- Neural Dynamics Model of Visual Decision-Making: Learning from Human Experts [28.340344705437758]
We implement a comprehensive visual decision-making model that spans from visual input to behavioral output.
Our model aligns closely with human behavior and reflects neural activities in primates.
A neuroimaging-informed fine-tuning approach was introduced and applied to the model, leading to performance improvements.
arXiv Detail & Related papers (2024-09-04T02:38:52Z)
- Brain-Like Language Processing via a Shallow Untrained Multihead Attention Network [16.317199232071232]
Large Language Models (LLMs) have been shown to be effective models of the human language system.
In this work, we investigate the key architectural components driving the surprising alignment of untrained models.
arXiv Detail & Related papers (2024-06-21T12:54:03Z)
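A toy illustration of the idea in the entry above: even a single, never-trained multihead attention block over random embeddings yields sentence representations one can analyze. The dimensions, random embeddings, and mean-pooling below are assumptions for illustration, not the paper's setup.

```python
# Sketch of the core object studied above: representations from one
# randomly initialized (untrained) multihead attention block. Sizes
# and pooling are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
d_model, n_heads = 256, 8

attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
embed = nn.Embedding(10_000, d_model)  # random, untrained token embeddings

@torch.no_grad()
def represent(token_ids):
    """Mean-pooled output of one untrained self-attention layer."""
    x = embed(token_ids)                 # (batch, seq, d_model)
    out, _ = attn(x, x, x)               # self-attention, no training
    return out.mean(dim=1)               # (batch, d_model)

features = represent(torch.randint(0, 10_000, (4, 12)))
print(features.shape)  # torch.Size([4, 256])
```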
- Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model.
We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE).
This framework incorporates specifically designed directed node embedding layers, aimed at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z)
- MindBridge: A Cross-Subject Brain Decoding Framework [60.58552697067837]
Brain decoding aims to reconstruct stimuli from acquired brain signals.
Currently, brain decoding is confined to a per-subject-per-model paradigm.
We present MindBridge, which achieves cross-subject brain decoding by employing only one model.
arXiv Detail & Related papers (2024-04-11T15:46:42Z)
- Emergent Modularity in Pre-trained Transformers [127.08792763817496]
We consider two main characteristics of modularity: functional specialization of neurons and function-based neuron grouping.
We study how modularity emerges during pre-training, and find that the modular structure is stabilized at the early stage.
This suggests that Transformers first construct the modular structure and then learn fine-grained neuron functions.
arXiv Detail & Related papers (2023-05-28T11:02:32Z)
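To make the notion of function-based neuron grouping in the entry above concrete, here is a toy sketch that clusters neurons by their activation profiles across tasks. The synthetic profiles and the choice of k-means are illustrative assumptions, not the paper's method.

```python
# Toy version of "function-based neuron grouping": cluster neurons by
# how strongly they activate on each task. Data here are synthetic
# stand-ins for activations recorded from a pretrained transformer.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_neurons, n_tasks = 512, 6

# Row i = mean activation of neuron i on each of the tasks.
profiles = rng.normal(size=(n_neurons, n_tasks))

groups = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(profiles)
for g in range(4):
    print(f"group {g}: {np.sum(groups == g)} neurons")
```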
- Disentangling Reasoning Capabilities from Language Models with Compositional Reasoning Transformers [72.04044221898059]
ReasonFormer is a unified reasoning framework for mirroring the modular and compositional reasoning process of humans.
The representation module (automatic thinking) and reasoning modules (controlled thinking) are disentangled to capture different levels of cognition.
The unified reasoning framework solves multiple tasks with a single model, and both training and inference are end-to-end.
arXiv Detail & Related papers (2022-10-20T13:39:55Z)
- Model-based analysis of brain activity reveals the hierarchy of language in 305 subjects [82.81964713263483]
A popular approach to decomposing the neural bases of language is to correlate, across individuals, the brain responses to different stimuli.
Here, we show that a model-based approach can reach equivalent results within subjects exposed to natural stimuli.
arXiv Detail & Related papers (2021-10-12T15:30:21Z)
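The model-based approach in the entry above is commonly implemented as an encoding model: regress features of a language model onto per-voxel brain responses and score predictions on held-out data. The sketch below uses synthetic stand-in data and ridge regression as a plausible, assumed instantiation rather than the paper's exact pipeline.

```python
# Toy encoding-model analysis: ridge-regress model features onto
# per-voxel responses and evaluate held-out predictions. All data
# here are synthetic stand-ins for model activations and fMRI.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_features, n_voxels = 400, 64, 100

X = rng.normal(size=(n_samples, n_features))          # model activations
W = rng.normal(size=(n_features, n_voxels))
y = X @ W + rng.normal(scale=2.0, size=(n_samples, n_voxels))  # "fMRI"

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
enc = Ridge(alpha=10.0).fit(X_tr, y_tr)
pred = enc.predict(X_te)

# Per-voxel correlation between predicted and observed responses.
r = [np.corrcoef(pred[:, v], y_te[:, v])[0, 1] for v in range(n_voxels)]
print(f"median voxel r = {np.median(r):.2f}")
```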
- Meta-brain Models: biologically-inspired cognitive agents [0.0]
We propose a computational approach we call meta-brain models.
We will propose combinations of layers composed using specialized types of models.
We will conclude by proposing next steps in the development of this flexible and open-source approach.
arXiv Detail & Related papers (2021-08-31T05:20:53Z)
- Compositional Generalization by Learning Analytical Expressions [87.15737632096378]
A memory-augmented neural model is connected with analytical expressions to achieve compositional generalization.
Experiments on the well-known SCAN benchmark demonstrate that our model achieves a strong capacity for compositional generalization.
arXiv Detail & Related papers (2020-06-18T15:50:57Z)
- Brain-inspired self-organization with cellular neuromorphic computing for multimodal unsupervised learning [0.0]
We propose a brain-inspired neural system based on the reentry theory using Self-Organizing Maps and Hebbian-like learning.
We show the gains from the so-called hardware plasticity induced by the ReSOM, where the system's topology is not fixed by the user but is instead learned through the system's experience via self-organization.
arXiv Detail & Related papers (2020-04-11T21:02:45Z)
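For context on the last entry, the classic Self-Organizing Map update it builds on fits in a few lines of NumPy. This is the textbook SOM rule only; the paper's ReSOM specifics (reentry-based multimodal coupling, hardware plasticity) are not reproduced, and the grid size, data, and schedules below are illustrative assumptions.

```python
# Classic Self-Organizing Map update, the building block named in the
# last entry above. Textbook SOM rule, not the paper's ReSOM.
import numpy as np

rng = np.random.default_rng(0)
grid_h, grid_w, d = 8, 8, 3
weights = rng.random((grid_h, grid_w, d))          # one prototype per unit
coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w),
                              indexing="ij"), axis=-1)

def som_step(x, lr=0.5, sigma=2.0):
    """Move the best-matching unit and its grid neighbors toward x."""
    global weights
    dists = np.linalg.norm(weights - x, axis=-1)
    bmu = np.unravel_index(dists.argmin(), dists.shape)
    grid_d2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)
    h = np.exp(-grid_d2 / (2 * sigma**2))[..., None]  # neighborhood kernel
    weights += lr * h * (x - weights)

for x in rng.random((1000, d)):                     # e.g. RGB color inputs
    som_step(x)
```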