Mechanistic Interpretability of RNNs emulating Hidden Markov Models
- URL: http://arxiv.org/abs/2510.25674v1
- Date: Wed, 29 Oct 2025 16:42:07 GMT
- Title: Mechanistic Interpretability of RNNs emulating Hidden Markov Models
- Authors: Elia Torre, Michele Viscione, Lucas Pompe, Benjamin F Grewe, Valerio Mante
- Abstract summary: Recurrent neural networks (RNNs) provide a powerful approach in neuroscience to infer latent dynamics in neural populations. We show that RNNs can replicate Hidden Markov Model (HMM) emission statistics and then reverse-engineer the trained networks to uncover the mechanisms they implement.
- Score: 2.786617687297761
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent neural networks (RNNs) provide a powerful approach in neuroscience to infer latent dynamics in neural populations and to generate hypotheses about the neural computations underlying behavior. However, past work has focused on relatively simple, input-driven, and largely deterministic behaviors - little is known about the mechanisms that would allow RNNs to generate the richer, spontaneous, and potentially stochastic behaviors observed in natural settings. Modeling with Hidden Markov Models (HMMs) has revealed a segmentation of natural behaviors into discrete latent states with stochastic transitions between them, a type of dynamics that may appear at odds with the continuous state spaces implemented by RNNs. Here we first show that RNNs can replicate HMM emission statistics and then reverse-engineer the trained networks to uncover the mechanisms they implement. In the absence of inputs, the activity of trained RNNs collapses towards a single fixed point. When driven by stochastic input, trajectories instead exhibit noise-sustained dynamics along closed orbits. Rotation along these orbits modulates the emission probabilities and is governed by transitions between regions of slow, noise-driven dynamics connected by fast, deterministic transitions. The trained RNNs develop highly structured connectivity, with a small set of "kick neurons" initiating transitions between these regions. This mechanism emerges during training as the network shifts into a regime of stochastic resonance, enabling it to perform probabilistic computations. Analyses across multiple HMM architectures - fully connected, cyclic, and linear-chain - reveal that this solution generalizes through the modular reuse of the same dynamical motif, suggesting a compositional principle by which RNNs can emulate complex discrete latent dynamics.
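The abstract's starting point is that an RNN is trained to reproduce the emission statistics of an HMM. As a minimal, hypothetical sketch of that target object (not the authors' code), the snippet below samples latent-state paths and emissions from a small discrete HMM; the number of states and all probabilities are arbitrary illustrative choices:

```python
import numpy as np

# Hypothetical 3-state HMM with discrete emissions; all parameters are
# illustrative, not taken from the paper.
rng = np.random.default_rng(0)
A = np.array([[0.90, 0.05, 0.05],   # state-transition matrix (rows sum to 1)
              [0.05, 0.90, 0.05],
              [0.05, 0.05, 0.90]])
B = np.array([[0.8, 0.1, 0.1],      # emission probabilities per state
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])

def sample_hmm(T, A, B, rng):
    """Sample a latent state path and an emission sequence of length T."""
    states = np.empty(T, dtype=int)
    emissions = np.empty(T, dtype=int)
    z = rng.integers(A.shape[0])              # uniform initial state
    for t in range(T):
        states[t] = z
        emissions[t] = rng.choice(B.shape[1], p=B[z])
        z = rng.choice(A.shape[0], p=A[z])    # stochastic state transition
    return states, emissions

states, emissions = sample_hmm(1000, A, B, rng)
```

Matching the statistics of such emission sequences, rather than any single sequence, is what requires the trained RNN to implement a genuinely stochastic computation.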
Related papers
- Drift-Diffusion Matching: Embedding dynamics in latent manifolds of asymmetric neural networks [0.8793721044482612]
We introduce a general framework for training continuous-time RNNs to represent arbitrary dynamical systems within a low-dimensional latent subspace. We show that RNNs can faithfully embed the drift and diffusion of a given differential equation, including nonlinear and nonequilibrium dynamics such as chaotic attractors. Our results extend attractor neural network theory beyond equilibrium, showing that asymmetric neural populations can implement a broad class of dynamical computations within low-dimensional latent manifolds, unifying ideas from associative memory, nonequilibrium statistical mechanics, and neural computation.
arXiv Detail & Related papers (2026-02-16T16:15:59Z) - Neuronal Group Communication for Efficient Neural representation [85.36421257648294]
This paper addresses the question of how to build large neural systems that learn efficient, modular, and interpretable representations. We propose Neuronal Group Communication (NGC), a theory-driven framework that reimagines a neural network as a dynamical system of interacting neuronal groups. NGC treats weights as transient interactions between embedding-like neuronal states, with neural computation unfolding through iterative communication among groups of neurons.
arXiv Detail & Related papers (2025-10-19T14:23:35Z) - Langevin Flows for Modeling Neural Latent Dynamics [81.81271685018284]
We introduce LangevinFlow, a sequential Variational Auto-Encoder where the time evolution of latent variables is governed by the underdamped Langevin equation. Our approach incorporates physical priors -- such as inertia, damping, a learned potential function, and forces -- to represent both autonomous and non-autonomous processes in neural systems. Our method outperforms state-of-the-art baselines on synthetic neural populations generated by a Lorenz attractor.
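The underdamped Langevin equation referenced above couples a position and a velocity variable. As a hedged sketch (a plain Euler-Maruyama discretization with an assumed quadratic potential, not the paper's learned potential), one step of the dynamics dx = v dt, dv = (-gamma*v - grad_U(x)) dt + sqrt(2*gamma*temp) dW looks like:

```python
import numpy as np

# Euler-Maruyama discretization of the underdamped Langevin equation:
#   dx = v dt
#   dv = (-gamma * v - grad_U(x)) dt + sqrt(2 * gamma * temp) dW
# gamma, temp, dt, and the quadratic potential are illustrative assumptions.
rng = np.random.default_rng(1)
gamma, temp, dt = 0.5, 0.1, 0.01

def grad_U(x):
    return x  # gradient of the assumed potential U(x) = 0.5 * x**2

x, v = 1.0, 0.0
traj = []
for _ in range(5000):
    x += v * dt
    v += (-gamma * v - grad_U(x)) * dt + np.sqrt(2 * gamma * temp * dt) * rng.normal()
    traj.append(x)
```

The damping term dissipates energy while the noise injects it, so trajectories fluctuate around the potential minimum rather than settling or diverging.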
arXiv Detail & Related papers (2025-07-15T17:57:48Z) - Generative System Dynamics in Recurrent Neural Networks [56.958984970518564]
We investigate the continuous-time dynamics of Recurrent Neural Networks (RNNs). We show that skew-symmetric weight matrices are fundamental to enable stable limit cycles in both linear and nonlinear configurations. Numerical simulations showcase how nonlinear activation functions not only maintain limit cycles, but also enhance the numerical stability of the system integration process.
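The role of skew-symmetry can be seen in a toy linear case. For a 2-D linear RNN dx/dt = W x with skew-symmetric W = [[0, w], [-w, 0]], the exact flow is a rotation, so trajectories are closed orbits that conserve the norm of x. A minimal sketch, with an arbitrary frequency and step size:

```python
import numpy as np

# Skew-symmetric W = [[0, w], [-w, 0]] generates pure rotation: the exact
# one-step propagator of dx/dt = W x over time dt is a rotation matrix.
# The frequency w and step dt are arbitrary illustrative choices.
w, dt = 2.0, 0.01
theta = w * dt
step = np.array([[np.cos(theta), np.sin(theta)],
                 [-np.sin(theta), np.cos(theta)]])

x = np.array([1.0, 0.0])
norm0 = np.linalg.norm(x)
for _ in range(1000):
    x = step @ x  # rotate; the norm of x is preserved along the closed orbit
```

Using the exact propagator rather than an explicit Euler step avoids the slow outward spiral that naive integration introduces, which is related to the numerical-stability point made in the summary.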
arXiv Detail & Related papers (2025-04-16T10:39:43Z) - Inferring stochastic low-rank recurrent neural networks from neural data [5.179844449042386]
A central aim in computational neuroscience is to relate the activity of large populations of neurons to an underlying dynamical system. Low-rank recurrent neural networks (RNNs) exhibit such interpretability by having tractable dynamics. Here, we propose to fit low-rank RNNs with variational sequential Monte Carlo methods.
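The low-rank structure mentioned above constrains the connectivity to an outer product of a few vectors, which confines the dynamics to a low-dimensional subspace. As a hedged sketch (a rank-1 stochastic RNN with arbitrary vectors and noise scale, not the paper's fitted model):

```python
import numpy as np

# Rank-1 stochastic RNN: x_{t+1} = x_t + dt * (-x_t + J tanh(x_t)) + noise,
# with connectivity J = m n^T / N. The vectors m, n, the noise scale sigma,
# and all sizes are illustrative assumptions.
rng = np.random.default_rng(3)
N, dt, sigma = 50, 0.1, 0.05
m = rng.normal(size=N)
n = rng.normal(size=N)

x = np.zeros(N)
for _ in range(200):
    recurrent = m * (n @ np.tanh(x)) / N  # J @ tanh(x) without forming J
    x = x + dt * (-x + recurrent) + sigma * np.sqrt(dt) * rng.normal(size=N)
```

Because the recurrent input is always proportional to m, the deterministic part of the dynamics lives on a one-dimensional manifold, which is what makes such models tractable to interpret and to fit.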
arXiv Detail & Related papers (2024-06-24T15:57:49Z) - Episodic Memory Theory for the Mechanistic Interpretation of Recurrent Neural Networks [3.683202928838613]
We propose the Episodic Memory Theory (EMT), illustrating that RNNs can be conceptualized as discrete-time analogs of the recently proposed General Sequential Episodic Memory Model.
We introduce a novel set of algorithmic tasks tailored to probe the variable binding behavior in RNNs.
Our empirical investigations reveal that trained RNNs consistently converge to the variable binding circuit, thus indicating universality in the dynamics of RNNs.
arXiv Detail & Related papers (2023-10-03T20:52:37Z) - Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
arXiv Detail & Related papers (2023-04-08T07:55:36Z) - Expressive architectures enhance interpretability of dynamics-based neural population models [2.294014185517203]
We evaluate the performance of sequential autoencoders (SAEs) in recovering latent chaotic attractors from simulated neural datasets.
We found that SAEs with widely-used recurrent neural network (RNN)-based dynamics were unable to infer accurate firing rates at the true latent state dimensionality.
arXiv Detail & Related papers (2022-12-07T16:44:26Z) - Theory of gating in recurrent neural networks [5.672132510411465]
Recurrent neural networks (RNNs) are powerful dynamical models, widely used in machine learning (ML) and neuroscience.
Here, we show that gating offers flexible control of two salient features of the collective dynamics.
The gate controlling timescales leads to a novel, marginally stable state, where the network functions as a flexible integrator.
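The timescale gate described above can be illustrated with a single gated unit: the gate multiplies the leak term, so as the gate approaches zero the leak vanishes and the unit approaches a pure integrator of its input. A minimal sketch, with hypothetical gate values and input (not the paper's model):

```python
import numpy as np

def gated_step(x, u, z, dt=0.1):
    """One Euler step of dx/dt = z * (-x) + u, where z in (0, 1] gates the leak.
    z = 1 gives a fast leaky unit; z -> 0 gives a marginally stable integrator."""
    return x + dt * (z * (-x) + u)

x_fast, x_slow = 0.0, 0.0
for _ in range(100):
    x_fast = gated_step(x_fast, u=1.0, z=1.0)    # leaky: saturates near u/z = 1
    x_slow = gated_step(x_slow, u=1.0, z=0.01)   # near-integrator: keeps accumulating
```

With the gate nearly closed, the effective time constant 1/z diverges, which is the "flexible integrator" regime the summary refers to.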
arXiv Detail & Related papers (2020-07-29T13:20:58Z) - Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Recurrent Neural Network Learning of Performance and Intrinsic Population Dynamics from Sparse Neural Data [77.92736596690297]
We introduce a novel training strategy that allows learning not only the input-output behavior of an RNN but also its internal network dynamics.
We test the proposed method by training an RNN to simultaneously reproduce internal dynamics and output signals of a physiologically-inspired neural model.
Remarkably, we show that the reproduction of the internal dynamics is successful even when the training algorithm relies on the activities of a small subset of neurons.
arXiv Detail & Related papers (2020-05-05T14:16:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.