Related papers: Discovering alternative solutions beyond the simplicity bias in recurrent neural networks

URL: http://arxiv.org/abs/2509.21504v1
Date: Thu, 25 Sep 2025 19:59:04 GMT
Title: Discovering alternative solutions beyond the simplicity bias in recurrent neural networks
Authors: William Qian, Cengiz Pehlevan
Abstract summary: Training recurrent neural networks (RNNs) to perform neuroscience-style tasks has become a popular way to generate hypotheses for how neural circuits might perform computations. Recent work has demonstrated that task-trained RNNs possess a strong simplicity bias. We propose Iterative Neural Similarity Deflation to break this inductive bias.
Score: 36.12962884836429
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Training recurrent neural networks (RNNs) to perform neuroscience-style tasks has become a popular way to generate hypotheses for how neural circuits in the brain might perform computations. Recent work has demonstrated that task-trained RNNs possess a strong simplicity bias. In particular, this inductive bias often causes RNNs trained on the same task to collapse on effectively the same solution, typically comprised of fixed-point attractors or other low-dimensional dynamical motifs. While such solutions are readily interpretable, this collapse proves counterproductive for the sake of generating a set of genuinely unique hypotheses for how neural computations might be performed. Here we propose Iterative Neural Similarity Deflation (INSD), a simple method to break this inductive bias. By penalizing linear predictivity of neural activity produced by standard task-trained RNNs, we find an alternative class of solutions to classic neuroscience-style RNN tasks. These solutions appear distinct across a battery of analysis techniques, including representational similarity metrics, dynamical systems analysis, and the linear decodability of task-relevant variables. Moreover, these alternative solutions can sometimes achieve superior performance in difficult or out-of-distribution task regimes. Our findings underscore the importance of moving beyond the simplicity bias to uncover richer and more varied models of neural computation.
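The abstract's core idea, penalizing how well neural activity from standard task-trained RNNs can be linearly predicted, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the use of ridge regression with an R^2 score, the direction of the regression (candidate activity predicting reference activity), and the growing list of reference solutions are all assumptions made for illustration. A real training loop would also need a differentiable version of this penalty inside an autodiff framework.

```python
import numpy as np

def linear_predictivity(source, target, ridge=1e-6):
    """R^2 of a ridge-regularized linear map from `source` to `target`.

    Both arrays have shape (samples, units); columns are mean-centered first.
    (Hypothetical helper, not taken from the paper.)
    """
    X = source - source.mean(axis=0)
    Y = target - target.mean(axis=0)
    # Closed-form ridge solution: W = (X^T X + ridge * I)^{-1} X^T Y
    W = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)
    residual = Y - X @ W
    return 1.0 - (residual ** 2).sum() / (Y ** 2).sum()

def penalized_loss(task_loss, candidate, references, weight=1.0):
    """Task loss plus a penalty for linearly predicting each reference solution.

    `references` is assumed to grow as new solutions are found, which is one
    reading of the "iterative deflation" in the method's name.
    """
    penalty = sum(linear_predictivity(candidate, ref) for ref in references)
    return task_loss + weight * penalty

# Synthetic stand-ins for recorded hidden-state activity (samples x units).
rng = np.random.default_rng(0)
reference = rng.normal(size=(500, 8))
similar = reference @ rng.normal(size=(8, 8))   # linear re-mixing of the reference
unrelated = rng.normal(size=(500, 8))           # independent activity

r2_similar = linear_predictivity(similar, reference)
r2_unrelated = linear_predictivity(unrelated, reference)
print(f"similar: {r2_similar:.3f}, unrelated: {r2_unrelated:.3f}")
```

Under these assumptions, a candidate network whose activity is merely a linear re-mixing of a known solution receives a large penalty, while a candidate with linearly unrelated activity receives a small one, which is the pressure toward alternative solutions that the abstract describes.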

Related papers

Setting up for failure: automatic discovery of the neural mechanisms of cognitive errors [7.041349097212527]
We use a non-parametric generative model of behavioural responses to produce surrogate data for training RNNs. To capture all relevant statistical aspects of the data, we developed a novel diffusion model-based approach for training RNNs.
arXiv Detail & Related papers (2025-12-04T14:00:32Z)
Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks [3.049887057143419]
Task-trained recurrent neural networks (RNNs) are widely used in neuroscience and machine learning to model computations. Different RNNs trained on the same task and achieving similar performance can exhibit strikingly different internal solutions, a phenomenon known as solution degeneracy. Here, we develop a unified framework to quantify and control solution degeneracy across three levels: behavior, neural dynamics, and weight space.
arXiv Detail & Related papers (2024-10-04T23:23:55Z)
Inferring stochastic low-rank recurrent neural networks from neural data [5.179844449042386]
A central aim in computational neuroscience is to relate the activity of large populations of neurons to an underlying dynamical system. Low-rank recurrent neural networks (RNNs) exhibit such interpretability by having tractable dynamics. Here, we propose to fit low-rank RNNs with variational sequential Monte Carlo methods.
arXiv Detail & Related papers (2024-06-24T15:57:49Z)
Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters. Our approach enables a single model to encode neural computational graphs with diverse architectures. We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
Toward stochastic neural computing [11.955322183964201]
We propose a theory of neural computing in which streams of noisy inputs are transformed and processed through populations of spiking neurons. We demonstrate the application of our method to Intel's Loihi neuromorphic hardware.
arXiv Detail & Related papers (2023-05-23T12:05:35Z)
Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks. We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order. In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption. They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware. A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware. Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks. We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks. We explore the diversity of the neurons within the hidden layer during the learning process. We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
Path classification by stochastic linear recurrent neural networks [2.5499055723658097]
We show that RNNs retain a partial signature of the paths they are fed as the unique information exploited for training and classification tasks. We argue that these RNNs are easy to train and robust and back these observations with numerical experiments on both synthetic and real data.
arXiv Detail & Related papers (2021-08-06T12:59:12Z)
Recurrent Neural Network Learning of Performance and Intrinsic Population Dynamics from Sparse Neural Data [77.92736596690297]
We introduce a novel training strategy that allows learning not only the input-output behavior of an RNN but also its internal network dynamics. We test the proposed method by training an RNN to simultaneously reproduce internal dynamics and output signals of a physiologically-inspired neural model. Remarkably, we show that the reproduction of the internal dynamics is successful even when the training algorithm relies on the activities of a small subset of neurons.
arXiv Detail & Related papers (2020-05-05T14:16:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.