Weight-Space Linear Recurrent Neural Networks
- URL: http://arxiv.org/abs/2506.01153v1
- Date: Sun, 01 Jun 2025 20:13:28 GMT
- Title: Weight-Space Linear Recurrent Neural Networks
- Authors: Roussel Desmond Nzoyem, Nawid Keshtmand, Idriss Tsayem, David A. W. Barton, Tom Deakin
- Abstract summary: WARP (Weight-space Adaptive Recurrent Prediction) is a powerful framework that unifies weight-space learning with linear recurrence. We show that WARP matches or surpasses state-of-the-art baselines on diverse classification tasks.
- Score: 0.5937476291232799
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce WARP (Weight-space Adaptive Recurrent Prediction), a simple yet powerful framework that unifies weight-space learning with linear recurrence to redefine sequence modeling. Unlike conventional recurrent neural networks (RNNs) which collapse temporal dynamics into fixed-dimensional hidden states, WARP explicitly parametrizes the hidden state as the weights of a distinct root neural network. This formulation promotes higher-resolution memory, gradient-free adaptation at test-time, and seamless integration of domain-specific physical priors. Empirical validation shows that WARP matches or surpasses state-of-the-art baselines on diverse classification tasks, spanning synthetic benchmarks to real-world datasets. Furthermore, extensive experiments across sequential image completion, dynamical system reconstruction, and multivariate time series forecasting demonstrate its expressiveness and generalization capabilities. Critically, WARP's weight trajectories offer valuable insights into the model's inner workings. Ablation studies confirm the architectural necessity of key components, solidifying weight-space linear RNNs as a transformative paradigm for adaptive machine intelligence.
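For intuition, the following is a minimal NumPy sketch of the idea described in the abstract: the recurrent state is the flattened weight vector of a small "root" MLP, updated by a linear recurrence and evaluated at every step. The diagonal recurrence, the input projection `B`, and the root architecture here are illustrative assumptions, not the parameterization used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Root network: a small MLP whose weights ARE the recurrent hidden state.
# Shapes are illustrative; the paper's actual root architecture may differ.
IN, HID, OUT = 4, 16, 2
shapes = [(IN, HID), (HID,), (HID, OUT), (OUT,)]
sizes = [int(np.prod(s)) for s in shapes]
D = sum(sizes)                          # dimension of the weight-space state

def unflatten(w):
    """Split the flat state vector w back into the root network's weights."""
    parts, i = [], 0
    for s, n in zip(shapes, sizes):
        parts.append(w[i:i + n].reshape(s))
        i += n
    return parts

def root_forward(w, x):
    """Evaluate the root MLP with weights w on input x."""
    W1, b1, W2, b2 = unflatten(w)
    return np.tanh(x @ W1 + b1) @ W2 + b2

# Linear recurrence in weight space (assumed diagonal for this sketch):
#   w_t = lam * w_{t-1} + B @ x_t
lam = rng.uniform(0.9, 0.999, size=D)   # per-coordinate decay
B = rng.normal(0, 0.05, size=(D, IN))   # input-to-weight projection
w = rng.normal(0, 0.1, size=D)          # initial weight-space state

xs = rng.normal(size=(20, IN))          # a toy input sequence
for x in xs:
    w = lam * w + B @ x                 # update the root network's weights
    y = root_forward(w, x)              # predict with the *current* weights

print("final prediction:", y)
```

Because the state is itself a usable set of network weights, it can be inspected or modified at test time without gradient updates, which is presumably what the abstract means by gradient-free adaptation.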
Related papers
- T-SHRED: Symbolic Regression for Regularization and Model Discovery with Transformer Shallow Recurrent Decoders [2.8820361301109365]
SHallow REcurrent Decoders (SHRED) are effective for system identification and forecasting from sparse sensor measurements.
We improve SHRED by leveraging transformers (T-SHRED) for the temporal encoding which improves performance on next-step state prediction.
Symbolic regression improves model interpretability by learning and regularizing the dynamics of the latent space during training.
arXiv Detail & Related papers (2025-06-18T21:14:38Z)
- Generalized Factor Neural Network Model for High-dimensional Regression [50.554377879576066]
We tackle the challenges of modeling high-dimensional data sets with latent low-dimensional structures hidden within complex, non-linear, and noisy relationships.
Our approach enables a seamless integration of concepts from non-parametric regression, factor models, and neural networks for high-dimensional regression.
arXiv Detail & Related papers (2025-02-16T23:13:55Z)
- Conservation-informed Graph Learning for Spatiotemporal Dynamics Prediction [84.26340606752763]
In this paper, we introduce the conservation-informed GNN (CiGNN), an end-to-end explainable learning framework.
The network is designed to conform to the general conservation law via symmetry, where conservative and non-conservative information passes over a multiscale space by a latent temporal marching strategy.
Results demonstrate that CiGNN exhibits remarkable baseline accuracy and generalizability, and is readily applicable to learning for prediction of various spatiotemporal dynamics.
arXiv Detail & Related papers (2024-12-30T13:55:59Z)
- Recurrent Stochastic Configuration Networks with Hybrid Regularization for Nonlinear Dynamics Modelling [3.8719670789415925]
Recurrent stochastic configuration networks (RSCNs) have shown great potential in modelling nonlinear dynamic systems with uncertainties.
This paper presents an RSCN with hybrid regularization to enhance both the learning capacity and generalization performance of the network.
arXiv Detail & Related papers (2024-11-26T03:06:39Z)
- Deep Recurrent Stochastic Configuration Networks for Modelling Nonlinear Dynamic Systems [3.8719670789415925]
This paper proposes a novel deep reservoir computing framework, termed deep recurrent stochastic configuration network (DeepRSCN).
DeepRSCNs are incrementally constructed, with all reservoir nodes directly linked to the final output.
Given a set of training samples, DeepRSCNs can quickly generate learning representations, which consist of random basis functions with cascaded input readout weights.
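As a rough illustration of the recipe summarized above (random recurrent basis functions whose states all feed a linear readout), here is a minimal echo-state-style sketch in NumPy. It fits the readout in closed form with ridge regression and omits the incremental, supervision-guided node construction that characterizes stochastic configuration networks, so it should not be read as the DeepRSCN algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(1)
IN, RES = 1, 100                        # input and reservoir sizes (illustrative)

# Random, fixed reservoir parameters (random basis functions).
W_in = rng.uniform(-1, 1, size=(RES, IN))
W_res = rng.uniform(-1, 1, size=(RES, RES))
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))  # keep the dynamics stable

def run_reservoir(xs):
    """Collect the reservoir state at every step (all nodes feed the readout)."""
    h = np.zeros(RES)
    states = []
    for x in xs:
        h = np.tanh(W_in @ x + W_res @ h)
        states.append(h.copy())
    return np.array(states)

# Toy one-step-ahead prediction task on a sine wave.
t = np.linspace(0, 20 * np.pi, 2000)
series = np.sin(t)
X, Y = series[:-1, None], series[1:, None]

H = run_reservoir(X)
# Closed-form ridge regression for the output (readout) weights.
lam = 1e-6
W_out = np.linalg.solve(H.T @ H + lam * np.eye(RES), H.T @ Y)

pred = H @ W_out
print("train MSE:", float(np.mean((pred - Y) ** 2)))
```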
arXiv Detail & Related papers (2024-10-28T10:33:15Z)
- eXponential FAmily Dynamical Systems (XFADS): Large-scale nonlinear Gaussian state-space modeling [9.52474299688276]
We introduce a low-rank structured variational autoencoder framework for nonlinear state-space graphical models.
We show that our approach consistently learns a more predictive generative model.
arXiv Detail & Related papers (2024-03-03T02:19:49Z)
- ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of DNN-based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z)
- Tractable Dendritic RNNs for Reconstructing Nonlinear Dynamical Systems [7.045072177165241]
We augment a piecewise-linear recurrent neural network (RNN) by a linear spline basis expansion.
We show that this approach retains all the theoretically appealing properties of the simple PLRNN, yet boosts its capacity for approximating arbitrary nonlinear dynamical systems in comparatively low dimensions.
arXiv Detail & Related papers (2022-07-06T09:43:03Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
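For a concrete sense of this construction, the sketch below implements the liquid time-constant state equation dx/dt = -(1/tau + f(x, I)) x + f(x, I) A with a sigmoid gate f and a plain explicit Euler step; the gate parameterization, sizes, and step size are illustrative simplifications rather than the solver and architecture used in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
IN, H = 2, 8                          # input and hidden sizes (illustrative)

# Parameters of one liquid time-constant (LTC) cell.
tau = np.ones(H)                      # base time constants
A = rng.normal(size=H)                # equilibrium/bias term
W = rng.normal(0, 0.5, size=(H, H))   # recurrent weights inside the gate
U = rng.normal(0, 0.5, size=(H, IN))  # input weights inside the gate
b = np.zeros(H)

def gate(x, u):
    """Bounded nonlinearity f(x, I) that modulates the effective time constant."""
    return 1.0 / (1.0 + np.exp(-(W @ x + U @ u + b)))

def ltc_step(x, u, dt=0.1):
    """One explicit Euler step of dx/dt = -(1/tau + f) * x + f * A."""
    f = gate(x, u)
    dx = -(1.0 / tau + f) * x + f * A
    return x + dt * dx

x = np.zeros(H)
for u in rng.normal(size=(50, IN)):   # a toy input sequence
    x = ltc_step(x, u)

print("final hidden state:", x)
```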
arXiv Detail & Related papers (2020-06-08T09:53:35Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.