Fast weight programming and linear transformers: from machine learning to neurobiology
- URL: http://arxiv.org/abs/2508.08435v2
- Date: Wed, 05 Nov 2025 16:40:49 GMT
- Title: Fast weight programming and linear transformers: from machine learning to neurobiology
- Authors: Kazuki Irie, Samuel J. Gershman
- Abstract summary: Recent advances in artificial neural networks for machine learning have established a family of recurrent neural network (RNN) architectures. Fast Weight Programmers (FWPs) can be interpreted as neural networks whose synaptic weights dynamically change over time as a function of input observations. We discuss connections between FWPs and models of synaptic plasticity in the brain, suggesting a convergence of natural and artificial intelligence.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in artificial neural networks for machine learning, and language modeling in particular, have established a family of recurrent neural network (RNN) architectures that, unlike conventional RNNs with vector-form hidden states, use two-dimensional (2D) matrix-form hidden states. Such 2D-state RNNs, known as Fast Weight Programmers (FWPs), can be interpreted as neural networks whose synaptic weights (called fast weights) dynamically change over time as a function of input observations and serve as short-term memory storage; the corresponding synaptic weight modifications are controlled or programmed by another network (the programmer) whose parameters are trained (e.g., by gradient descent). In this Primer, we review the technical foundations of FWPs, their computational characteristics, and their connections to transformers and state space models. We also discuss connections between FWPs and models of synaptic plasticity in the brain, suggesting a convergence of natural and artificial intelligence.
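To make the 2D-state recurrence concrete, here is a minimal NumPy sketch of the classic purely additive outer-product FWP; it is illustrative only, and names such as fwp_step and the specific update rule are assumptions rather than code from the paper. The slow, trained projections W_k, W_v, W_q play the role of the programmer, while the matrix W_fast is the fast weight short-term memory whose entries change with every input.

```python
import numpy as np

def fwp_step(W_fast, x, W_k, W_v, W_q):
    """One step of a purely additive Fast Weight Programmer.

    W_fast:        (d_v, d_k) fast weight matrix, the 2D matrix-form hidden state
    x:             (d_in,) current input observation
    W_k, W_v, W_q: slow weights of the programmer, trained e.g. by gradient descent
    """
    k = W_k @ x                       # key: where to write in the fast weights
    v = W_v @ x                       # value: what to store
    q = W_q @ x                       # query: where to read
    W_fast = W_fast + np.outer(v, k)  # Hebbian-style outer-product update
    y = W_fast @ q                    # read-out from the fast weight memory
    return W_fast, y

# Toy usage: run the fast weight memory over a random input sequence
rng = np.random.default_rng(0)
d_in, d_k, d_v = 8, 4, 4
W_k, W_v, W_q = (rng.normal(size=(d, d_in)) for d in (d_k, d_v, d_k))
W_fast = np.zeros((d_v, d_k))
for x in rng.normal(size=(10, d_in)):
    W_fast, y = fwp_step(W_fast, x, W_k, W_v, W_q)
```

Unrolling this recurrence gives y_t = (sum over i <= t of v_i k_i^T) q_t, i.e., unnormalized linear attention, which is one way to see the connection between FWPs and transformers that the abstract refers to. Practical FWP variants add key feature maps, normalization, or delta-rule corrections to manage memory capacity and interference; those refinements are omitted here for brevity.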
Related papers
- Integrating programmable plasticity in experiment descriptions for analog neuromorphic hardware
The BrainScaleS-2 neuromorphic architecture has been designed to support "hybrid" plasticity. Observables that are expensive in numerical simulation, such as per-synapse correlation measurements, are implemented directly in the synapse circuits. We introduce an integrated framework for describing spiking neural network experiments and plasticity rules in a unified high-level experiment description language.
arXiv Detail & Related papers (2024-12-04T08:46:06Z)
- Scalable Mechanistic Neural Networks for Differential Equations and Machine Learning
We propose an enhanced neural network framework designed for scientific machine learning applications involving long temporal sequences. We reduce the computational time and space complexities from cubic and quadratic with respect to the sequence length, respectively, to linear. Extensive experiments demonstrate that the scalable variant (S-MNN) matches the original mechanistic neural network (MNN) in precision while substantially reducing computational resources.
arXiv Detail & Related papers (2024-10-08T14:27:28Z)
- Contrastive Learning in Memristor-based Neuromorphic Systems
Spiking neural networks have become an important family of neuron-based models that sidestep many of the key limitations facing modern-day backpropagation-trained deep networks.
In this work, we design and investigate a proof-of-concept instantiation of contrastive-signal-dependent plasticity (CSDP), a neuromorphic form of forward-forward-based, backpropagation-free learning.
arXiv Detail & Related papers (2024-09-17T04:48:45Z)
- Unsupervised representation learning with Hebbian synaptic and structural plasticity in brain-like feedforward neural networks
We introduce and evaluate a brain-like neural network model capable of unsupervised representation learning. The model was tested on a diverse set of popular machine learning benchmarks.
arXiv Detail & Related papers (2024-06-07T08:32:30Z)
- Interpolating neural network: A novel unification of machine learning and interpolation theory
We introduce an interpolating neural network (INN) to realize Engineering Software 2.0.
INNs require fewer trainable/solvable parameters than traditional multi-layer perceptrons (MLPs) or physics-informed neural networks (PINNs) for comparable model accuracy.
arXiv Detail & Related papers (2024-04-16T05:40:30Z)
- A survey on learning models of spiking neural membrane systems and spiking neural networks
Spiking neural networks (SNNs) are biologically inspired models of neural networks with certain brain-like properties.
In SNNs, communication between neurons takes place through spikes and spike trains (a minimal simulation sketch appears after this list).
Spiking neural P systems (SNPS) can be considered a branch of SNNs based more on the principles of formal automata.
arXiv Detail & Related papers (2024-03-27T14:26:41Z)
- ANTN: Bridging Autoregressive Neural Networks and Tensor Networks for Quantum Many-Body Simulation
We develop a novel architecture, Autoregressive Neural TensorNet (ANTN), which bridges tensor networks and autoregressive neural networks.
We show that ANTN parameterizes normalized wavefunctions, generalizes the expressivity of tensor networks and autoregressive neural networks, and inherits a variety of symmetries from autoregressive neural networks.
Our work opens up new opportunities for quantum many-body physics simulation, quantum technology design, and generative modeling in artificial intelligence.
arXiv Detail & Related papers (2023-04-04T17:54:14Z)
- Learning to Control Rapidly Changing Synaptic Connections: An Alternative Type of Memory in Sequence Processing Artificial Neural Networks
Generalising feedforward NNs to such RNNs, which store short-term memory in their hidden unit activations, is mathematically straightforward and natural, and even historical.
A lesser-known alternative approach, storing short-term memory in "synaptic connections", yields another "natural" type of short-term memory in sequence processing NNs.
Fast Weight Programmers (FWPs) have seen a recent revival as generic sequence processors, achieving competitive performance across various tasks.
arXiv Detail & Related papers (2022-11-17T10:03:54Z)
- Neuromorphic Artificial Intelligence Systems
Modern AI systems, based on von Neumann architecture and classical neural networks, have a number of fundamental limitations in comparison with the brain.
This article discusses such limitations and the ways they can be mitigated.
It presents an overview of currently available neuromorphic AI projects in which these limitations are overcome.
arXiv Detail & Related papers (2022-05-25T20:16:05Z)
- Data-driven emergence of convolutional structure in neural networks
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- The BrainScaleS-2 accelerated neuromorphic system with hybrid plasticity
We describe the second generation of the BrainScaleS neuromorphic architecture, emphasizing applications enabled by this architecture.
It combines a custom accelerator core supporting the accelerated physical emulation of bio-inspired spiking neural network primitives with a tightly coupled digital processor and a digital event-routing network.
arXiv Detail & Related papers (2022-01-26T17:13:46Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) in latency and computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Flexible Transmitter Network
Current neural networks are mostly built upon the McCulloch-Pitts (MP) model, which usually formulates the neuron as executing an activation function on the real-valued weighted aggregation of signals received from other neurons.
We propose the Flexible Transmitter (FT) model, a novel bio-plausible neuron model with flexible synaptic plasticity.
We present the Flexible Transmitter Network (FTNet), which is built on the most common fully-connected feed-forward architecture.
arXiv Detail & Related papers (2020-04-08T06:55:12Z)
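The spiking-network entries above describe communication through spikes and spike trains. The sketch below, a minimal discrete-time leaky integrate-and-fire model with illustrative parameter values not taken from any listed paper, shows how such all-or-none communication can be simulated.

```python
import numpy as np

def lif_simulate(input_current, tau=20.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """Discrete-time leaky integrate-and-fire (LIF) neuron.

    The membrane potential leaks toward rest, integrates the input
    current, and emits a binary spike whenever it crosses threshold.
    """
    v = v_reset
    spikes = []
    for current in input_current:
        v += dt * (-(v - v_reset) / tau + current)  # leaky integration
        if v >= v_thresh:
            spikes.append(1)   # all-or-none spike
            v = v_reset        # reset after spiking
        else:
            spikes.append(0)
    return np.array(spikes)

# Toy usage: a constant drive yields a regular spike train
spike_train = lif_simulate(np.full(100, 0.08))
print(spike_train.sum(), "spikes in 100 steps")
```

A downstream neuron receives only the binary spike train, in contrast to the real-valued activations passed between units in conventional ANNs built on the MP model.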