Related papers: A Neuroscience-Inspired Dual-Process Model of Compositional Generalization

URL: http://arxiv.org/abs/2507.18868v3
Date: Tue, 28 Oct 2025 02:48:15 GMT
Title: A Neuroscience-Inspired Dual-Process Model of Compositional Generalization
Authors: Alex Noviello, Claas Beger, Jacob Groner, Kevin Ellis, Weinan Sun
Abstract summary: We propose Mirage, a neuro-inspired dual-process model. It combines a fast, intuitive "System 1" (a meta-trained Transformer) with a deliberate, rule-based "System 2" (a Schema Engine). Mirage achieves >99% accuracy on all splits of the SCAN benchmark in a task-agnostic setting.
Score: 12.494200165412186
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep learning models struggle with systematic compositional generalization, a hallmark of human cognition. We propose Mirage, a neuro-inspired dual-process model that offers a processing account for this ability. It combines a fast, intuitive "System 1" (a meta-trained Transformer) with a deliberate, rule-based "System 2" (a Schema Engine), mirroring the brain's neocortical and hippocampal-prefrontal circuits. Trained to perform general, single-step decomposition on a stream of random grammars, Mirage achieves >99% accuracy on all splits of the SCAN benchmark in a task-agnostic setting. Ablations confirm that the model's systematic behavior emerges from the architectural interplay of its two systems, particularly its use of explicit, prioritized schemas and iterative refinement. In line with recent progress on recursive/recurrent Transformer approaches, Mirage preserves an iterative neural update while externalizing declarative control into an interpretable schema module. Our work provides a concrete computational model for interpreting how compositional reasoning can arise from a modular cognitive architecture.
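
Illustrative sketch (not the authors' implementation): the abstract describes single-step decomposition driven by explicit, prioritized schemas and applied iteratively. The toy Python below shows that general idea on a small SCAN-like command subset. Everything in it is an assumption made for illustration: the names `PRIMITIVES`, `SCHEMAS`, `split_on`, `repeat`, `primitive`, and `interpret` are hypothetical, the rules are hand-written rather than learned, and no Transformer is involved.

```python
from typing import Callable, List, Optional

# Hypothetical lexicon for a toy SCAN-like grammar (an assumption, not from the paper).
PRIMITIVES = {"walk": "WALK", "run": "RUN", "jump": "JUMP", "look": "LOOK"}

# A schema inspects the tokens of one unresolved command and either returns
# its one-step decomposition (a list of items) or None if it does not apply.
Schema = Callable[[List[str]], Optional[List[str]]]


def split_on(word: str, swap: bool) -> Schema:
    """Split a command at the first occurrence of a conjunction."""
    def schema(tokens: List[str]) -> Optional[List[str]]:
        if word not in tokens:
            return None
        i = tokens.index(word)
        left, right = " ".join(tokens[:i]), " ".join(tokens[i + 1:])
        # "x after y" means: do y first, then x.
        return [right, left] if swap else [left, right]
    return schema


def repeat(word: str, n: int) -> Schema:
    """Expand a trailing repetition word into n copies of the sub-command."""
    def schema(tokens: List[str]) -> Optional[List[str]]:
        if len(tokens) >= 2 and tokens[-1] == word:
            return [" ".join(tokens[:-1])] * n
        return None
    return schema


def primitive(tokens: List[str]) -> Optional[List[str]]:
    """Rewrite a single known primitive into its output action."""
    if len(tokens) == 1 and tokens[0] in PRIMITIVES:
        return [PRIMITIVES[tokens[0]]]
    return None


# Schemas in priority order: earlier entries are tried first.
SCHEMAS: List[Schema] = [
    split_on("and", swap=False),
    split_on("after", swap=True),
    repeat("twice", 2),
    repeat("thrice", 3),
    primitive,
]


def interpret(command: str, max_steps: int = 100) -> List[str]:
    """Apply one decomposition step at a time until only actions remain."""
    state = [command]
    for _ in range(max_steps):
        # Output actions are upper-case; anything else is still unresolved.
        idx = next((i for i, item in enumerate(state) if not item.isupper()), None)
        if idx is None:
            return state
        tokens = state[idx].split()
        for schema in SCHEMAS:
            result = schema(tokens)
            if result is not None:
                state[idx:idx + 1] = result
                break
        else:
            raise ValueError(f"no schema applies to {state[idx]!r}")
    raise RuntimeError("did not converge within max_steps")
```

For example, `interpret("jump twice after walk")` first reorders the two clauses, then expands the repetition, then rewrites each primitive, ending with `["WALK", "JUMP", "JUMP"]`. The priority order matters: conjunctions are split before repetitions so that "twice" binds only to its own clause.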

Related papers

A Unified Cortical Circuit Model with Divisive Normalization and Self-Excitation for Robust Representation and Memory Maintenance [2.705743343600661]
We introduce a recurrent neural circuit that combines divisive normalization with self-excitation to achieve robust encoding. We demonstrate the model's versatility in two canonical tasks. This work establishes a unified mathematical framework that bridges noise suppression, working memory, and approximate Bayesian inference.
arXiv Detail & Related papers (2025-08-18T08:00:24Z)
Learning neuro-symbolic convergent term rewriting systems [47.129504708849446]
We introduce a general framework for learning convergent term rewriting systems using a neuro-symbolic architecture inspired by the rewriting algorithm itself. We present two modular implementations of such architecture: the Neural Rewriting System (NRS) and the Fast Neural Rewriting System (FastNRS).
arXiv Detail & Related papers (2025-07-25T15:24:56Z)
Activation-Guided Consensus Merging for Large Language Models [25.68958388022476]
We present Activation-Guided Consensus Merging (ACM), a plug-and-play merging framework that determines layer-specific merging coefficients. Experiments on Long-to-Short (L2S) and general merging tasks demonstrate that ACM consistently outperforms all baseline methods.
arXiv Detail & Related papers (2025-05-20T07:04:01Z)
On a Matrix Ensemble for Arbitrary Complex Quantum Systems [0.0]
We present a variation of the eigenvector ensemble initially proposed by Deutsch for the foundations of the Eigenstate Thermalization Hypothesis (ETH). This ensemble incorporates additional system-dependent information, enabling the study of complex quantum systems beyond the universal predictions of Random Matrix Theory (RMT). We show that for small energy windows, the correlation functions defined by this ensemble reduce to the predictions made by the ETH.
arXiv Detail & Related papers (2024-07-29T23:17:45Z)
From system models to class models: An in-context learning paradigm [0.0]
We introduce a novel paradigm for system identification, addressing two primary tasks: one-step-ahead prediction and multi-step simulation. We learn a meta model that represents a class of dynamical systems. For one-step prediction, a GPT-like decoder-only architecture is utilized, whereas the simulation problem employs an encoder-decoder structure.
arXiv Detail & Related papers (2023-08-25T13:50:17Z)
End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures. We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
Formal Controller Synthesis for Markov Jump Linear Systems with Uncertain Dynamics [64.72260320446158]
We propose a method for synthesising controllers for Markov jump linear systems. Our method is based on a finite-state abstraction that captures both the discrete (mode-jumping) and continuous (stochastic linear) behaviour of the MJLS. We apply our method to multiple realistic benchmark problems, in particular, a temperature control and an aerial vehicle delivery problem.
arXiv Detail & Related papers (2022-12-01T17:36:30Z)
Physics Informed Machine Learning for Chemistry Tabulation [5.368509527675853]
We build on the base formulation and implementation ChemTab to include the dynamically generated Thermochemical State Variables. We discuss the challenges in the implementation of this deep neural network architecture and experimentally demonstrate its superior performance.
arXiv Detail & Related papers (2022-11-06T04:24:38Z)
Neural-Symbolic Recursive Machine for Systematic Generalization [113.22455566135757]
We introduce the Neural-Symbolic Recursive Machine (NSR), whose core is a Grounded Symbol System (GSS). NSR integrates neural perception, syntactic parsing, and semantic reasoning. We evaluate NSR's efficacy across four challenging benchmarks designed to probe systematic generalization capabilities.
arXiv Detail & Related papers (2022-10-04T13:27:38Z)
Compositional Generalization and Decomposition in Neural Program Synthesis [59.356261137313275]
In this paper, we focus on measuring the ability of learned program synthesizers to compositionally generalize. We first characterize several different axes along which program synthesis methods would be desired to generalize. We introduce a benchmark suite of tasks to assess these abilities based on two popular existing datasets.
arXiv Detail & Related papers (2022-04-07T22:16:05Z)
ChemTab: A Physics Guided Chemistry Modeling Framework [5.368509527675853]
We show that joint learning of the progress variables and the look-up model can yield more accurate results. We propose a deep neural network architecture, called ChemTab, customized for the joint learning task.
arXiv Detail & Related papers (2022-02-20T16:21:13Z)
CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the advantages of leveraging detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning. The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery. The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules. Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned. We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks [86.10875837475783]
Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions. Existing neural models have been shown to lack this basic ability in learning symbolic structures. We propose two auxiliary sequence prediction tasks that track the progress of function and argument semantics.
arXiv Detail & Related papers (2021-09-30T16:41:19Z)
Improving Coherence and Consistency in Neural Sequence Models with Dual-System, Neuro-Symbolic Reasoning [49.6928533575956]
We use neural inference to mediate between the neural System 1 and the logical System 2. Results in robust story generation and grounded instruction-following show that this approach can increase the coherence and accuracy of neurally-based generations.
arXiv Detail & Related papers (2021-07-06T17:59:49Z)
X-volution: On the unification of convolution and self-attention [52.80459687846842]
We propose a multi-branch elementary module composed of both convolution and self-attention operation. The proposed X-volution achieves highly competitive visual understanding improvements.
arXiv Detail & Related papers (2021-06-04T04:32:02Z)
Compositional Generalization by Learning Analytical Expressions [87.15737632096378]
A memory-augmented neural model is connected with analytical expressions to achieve compositional generalization. Experiments on the well-known benchmark SCAN demonstrate that our model seizes a great ability of compositional generalization.
arXiv Detail & Related papers (2020-06-18T15:50:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.