A Neuroscience-Inspired Dual-Process Model of Compositional Generalization
- URL: http://arxiv.org/abs/2507.18868v1
- Date: Fri, 25 Jul 2025 01:02:07 GMT
- Title: A Neuroscience-Inspired Dual-Process Model of Compositional Generalization
- Authors: Alex Noviello, Claas Beger, Jacob Groner, Kevin Ellis, Weinan Sun
- Abstract summary: We present MIRAGE, a framework that achieves systematic generalization on compositional tasks. MIRAGE has two interacting modules mirroring the brain's deliberative HPC-PFC loop and intuitive neocortical pattern recognition. This approach demonstrates systematic compositional generalization on the SCAN benchmark, achieving > 99% accuracy on all task splits with only 1.19M parameters in the transformer module.
- Score: 4.575444193827658
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Systematic compositional generalization - constructing and understanding novel combinations of known building blocks - remains a core challenge for AI systems. Human cognition achieves this flexibility via the interplay of the hippocampus (HPC) and prefrontal cortex (PFC): the hippocampus rapidly encodes episodes, and the prefrontal cortex consolidates them into reusable schemas for reasoning. Drawing on these insights, we present MIRAGE (Meta-Inference with Rules and Abstractions from Generalized Experience), a framework that achieves systematic generalization on compositional tasks. MIRAGE has two interacting modules mirroring the brain's deliberative HPC-PFC loop and intuitive neocortical pattern recognition. (1) The meta-trained Transformer Neural Decomposer, paralleling neocortical "System 1" computation, is trained on a task-agnostic stream of randomly sampled compositional grammars and applies one decomposition step per pass, with successive passes iteratively refining the sequence representation. (2) The Schema Engine, analogous to the HPC-PFC "System 2" loop, dynamically extracts, ranks, and applies reusable schemas, storing variable bindings in episodic memory and expanding them when needed. By explicitly equipping the Transformer component of MIRAGE with actively managed schematic structures, our model performs systematic compositional operations through explicit schema application and transformation, relying solely on frozen weights when solving entirely novel tasks. This approach demonstrates systematic compositional generalization on the SCAN benchmark, achieving > 99% accuracy on all task splits with only 1.19M parameters in the transformer module. Ablation studies confirm that MIRAGE's systematicity critically depends on the quality of extracted schemas and the model's iterative refinement process.
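To make the two-module loop concrete, below is a minimal, runnable Python sketch of the deliberative cycle the abstract describes: schema matching with episodic variable binding (System 2), falling back to one decomposition step per pass (System 1). It is reconstructed from the abstract alone; the `Schema`/`SchemaEngine` classes and the `decompose_step` hook are hypothetical stand-ins, not the authors' published API.

```python
# Hypothetical sketch of MIRAGE's System-1 / System-2 loop (names invented;
# reconstructed from the abstract, not the authors' code).
from dataclasses import dataclass, field


@dataclass
class Schema:
    """A reusable rewrite rule: uppercase tokens in `pattern` are variables."""
    pattern: tuple
    expansion: tuple

    def match(self, seq):
        """Return variable bindings if `pattern` matches the head of `seq`."""
        if len(seq) < len(self.pattern):
            return None
        bindings = {}
        for p, s in zip(self.pattern, seq):
            if p.isupper():
                bindings[p] = s        # bind a variable slot
            elif p != s:
                return None            # literal mismatch
        return bindings

    def apply(self, seq, bindings):
        """Rewrite the matched head of `seq` using `expansion`."""
        head = tuple(bindings.get(t, t) for t in self.expansion)
        return head + seq[len(self.pattern):]


@dataclass
class SchemaEngine:
    """System 2: applies ranked schemas, storing bindings in episodic memory."""
    schemas: list                                    # assumed already ranked
    episodic_memory: list = field(default_factory=list)

    def step(self, seq):
        for schema in self.schemas:
            bindings = schema.match(seq)
            if bindings is not None:
                self.episodic_memory.append(bindings)
                return schema.apply(seq, bindings)
        return None                                  # defer to System 1


def mirage_solve(seq, engine, decompose_step, max_passes=32):
    """Alternate schema application (System 2) with one transformer
    decomposition step per pass (System 1) until `seq` is fully grounded."""
    for _ in range(max_passes):
        rewritten = engine.step(seq)
        seq = rewritten if rewritten is not None else decompose_step(seq)
        if all(t.islower() for t in seq):            # no variables left
            break
    return seq


# Toy usage on a SCAN-like command; the identity lambda stands in for the
# frozen Transformer Neural Decomposer.
engine = SchemaEngine([Schema(("X", "twice"), ("X", "X"))])
print(mirage_solve(("jump", "twice"), engine, lambda s: s))  # ('jump', 'jump')
```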
Related papers
- Activation-Guided Consensus Merging for Large Language Models [25.68958388022476]
We present Activation-Guided Consensus Merging (ACM), a plug-and-play merging framework that determines layer-specific merging coefficients. Experiments on Long-to-Short (L2S) and general merging tasks demonstrate that ACM consistently outperforms all baseline methods.
arXiv Detail & Related papers (2025-05-20T07:04:01Z)
- On a Matrix Ensemble for Arbitrary Complex Quantum Systems [0.0]
We present a variation of the eigenvector ensemble initially proposed by Deutsch for the foundations of the Eigenstate Thermalization Hypothesis (ETH). This ensemble incorporates additional system-dependent information, enabling the study of complex quantum systems beyond the universal predictions of Random Matrix Theory (RMT). We show that for small energy windows, the correlation functions defined by this ensemble reduce to the predictions made by the ETH.
arXiv Detail & Related papers (2024-07-29T23:17:45Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Formal Controller Synthesis for Markov Jump Linear Systems with Uncertain Dynamics [64.72260320446158]
We propose a method for synthesising controllers for Markov jump linear systems.
Our method is based on a finite-state abstraction that captures both the discrete (mode-jumping) and continuous (stochastic linear) behaviour of the MJLS.
We apply our method to multiple realistic benchmark problems, in particular, a temperature control and an aerial vehicle delivery problem.
arXiv Detail & Related papers (2022-12-01T17:36:30Z)
- Physics Informed Machine Learning for Chemistry Tabulation [5.368509527675853]
We build on the base formulation and implementation of ChemTab to include the dynamically generated Thermochemical State Variables.
We discuss the challenges in implementing this deep neural network architecture and experimentally demonstrate its superior performance.
arXiv Detail & Related papers (2022-11-06T04:24:38Z)
- Neural-Symbolic Recursive Machine for Systematic Generalization [113.22455566135757]
We introduce the Neural-Symbolic Recursive Machine (NSR), whose core is a Grounded Symbol System (GSS).
NSR integrates neural perception, syntactic parsing, and semantic reasoning.
We evaluate NSR's efficacy across four challenging benchmarks designed to probe systematic generalization capabilities.
arXiv Detail & Related papers (2022-10-04T13:27:38Z)
- Compositional Generalization and Decomposition in Neural Program Synthesis [59.356261137313275]
In this paper, we focus on measuring the ability of learned program synthesizers to compositionally generalize.
We first characterize several axes along which program synthesis methods should be expected to generalize.
We introduce a benchmark suite of tasks to assess these abilities based on two popular existing datasets.
arXiv Detail & Related papers (2022-04-07T22:16:05Z)
- ChemTab: A Physics Guided Chemistry Modeling Framework [5.368509527675853]
We show that joint learning of the progress variables and the look-up model can yield more accurate results.
We propose a deep neural network architecture, called ChemTab, customized for the joint learning task.
arXiv Detail & Related papers (2022-02-20T16:21:13Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the detailed spatial information captured by CNNs with the global context provided by transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks [86.10875837475783]
Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions.
Existing neural models have been shown to lack this basic ability in learning symbolic structures.
We propose two auxiliary sequence prediction tasks that track the progress of function and argument semantics.
arXiv Detail & Related papers (2021-09-30T16:41:19Z)
- X-volution: On the unification of convolution and self-attention [52.80459687846842]
We propose a multi-branch elementary module composed of both convolution and self-attention operation.
The proposed X-volution achieves highly competitive visual understanding improvements.
arXiv Detail & Related papers (2021-06-04T04:32:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.