Learning to Dissipate Energy in Oscillatory State-Space Models
- URL: http://arxiv.org/abs/2505.12171v1
- Date: Sat, 17 May 2025 23:15:17 GMT
- Title: Learning to Dissipate Energy in Oscillatory State-Space Models
- Authors: Jared Boyer, T. Konstantin Rusch, Daniela Rus
- Abstract summary: State-space models (SSMs) are a class of networks for sequence learning. We show that D-LinOSS consistently outperforms previous LinOSS methods on long-range learning tasks.
- Score: 55.09730499143998
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: State-space models (SSMs) are a class of networks for sequence learning that benefit from fixed state size and linear complexity with respect to sequence length, contrasting the quadratic scaling of typical attention mechanisms. Inspired by observations in neuroscience, Linear Oscillatory State-Space models (LinOSS) are a recently proposed class of SSMs constructed from layers of discretized forced harmonic oscillators. Although these models perform competitively, leveraging fast parallel scans over diagonal recurrent matrices and achieving state-of-the-art performance on tasks with sequence length up to 50k, LinOSS models rely on rigid energy dissipation ("forgetting") mechanisms that are inherently coupled to the timescale of state evolution. As forgetting is a crucial mechanism for long-range reasoning, we demonstrate the representational limitations of these models and introduce Damped Linear Oscillatory State-Space models (D-LinOSS), a more general class of oscillatory SSMs that learn to dissipate latent state energy on multiple timescales. We analyze the spectral distribution of the model's recurrent matrices and prove that the SSM layers exhibit stable dynamics under simple, flexible parameterizations. D-LinOSS consistently outperforms previous LinOSS methods on long-range learning tasks, without introducing additional complexity, and simultaneously reduces the hyperparameter search space by 50%.
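To make the central mechanism concrete, below is a minimal numpy sketch of a damped oscillatory recurrence in the spirit of D-LinOSS: each latent channel is a forced harmonic oscillator with its own learnable damping coefficient, so different channels dissipate energy ("forget") on different timescales. The discretization, function name, and parameterization here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def dlinoss_layer_sketch(u, log_omega, log_gamma, dt=0.1):
    """Run one damped oscillatory SSM layer over an input sequence.

    Each latent channel k is a forced harmonic oscillator
        x'' = -omega_k^2 * x - 2 * gamma_k * x' + u(t),
    stepped with a semi-implicit Euler scheme (damping treated
    implicitly). For gamma_k > 0 and modest dt * omega_k, the update is
    stable and dissipative, with each channel forgetting at its own
    learned rate, decoupled from the step size dt.

    u: (T, H) input sequence; log_omega, log_gamma: (H,) parameters.
    Returns the (T, H) sequence of oscillator positions.
    """
    omega = np.exp(log_omega)   # per-channel frequency > 0
    gamma = np.exp(log_gamma)   # per-channel damping > 0 (learned forgetting)
    T, H = u.shape
    x = np.zeros(H)             # position
    v = np.zeros(H)             # velocity
    ys = np.empty((T, H))
    for t in range(T):
        v = (v + dt * (u[t] - omega ** 2 * x)) / (1.0 + 2.0 * dt * gamma)
        x = x + dt * v
        ys[t] = x
    return ys
```

Because gamma is a free per-channel parameter rather than being tied to dt, slow and fast forgetting can coexist in one layer; the abstract's point is that LinOSS couples this dissipation to the timescale of state evolution, whereas D-LinOSS learns it.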
Related papers
- Structured State Space Model Dynamics and Parametrization for Spiking Neural Networks [0.8321953606016751]
Multi-state spiking neurons offer compelling alternatives to conventional deep learning models. State space models (SSMs) excel in long sequence processing using linear state-intrinsic recurrence resembling spiking neurons' subthreshold regime. Here, we establish a mathematical bridge between SSMs and second-order spiking neuron models (a minimal two-state sketch follows this entry).
arXiv Detail & Related papers (2025-06-04T13:54:02Z)
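The "mathematical bridge" can be illustrated with a generic second-order neuron model: below threshold, an adaptive leaky integrate-and-fire neuron has linear two-state dynamics, which is exactly the recurrence an SSM layer computes. The sketch below uses textbook adaptive-LIF equations as an assumed stand-in; the paper's specific neuron model and mapping are not reproduced here.

```python
import numpy as np

# Generic second-order (adaptive) LIF neuron: below threshold, membrane
# potential v and adaptation current w follow a *linear* two-state ODE,
#     tau_v * dv/dt = -v - w + i_in
#     tau_w * dw/dt = a * v - w,
# so one Euler step is exactly a 2-state linear SSM update. Spiking only
# enters through the (nonlinear) threshold/reset, which is omitted here.
dt, tau_v, tau_w, a = 1e-3, 0.02, 0.1, 0.5
A = np.array([[1 - dt / tau_v, -dt / tau_v],
              [a * dt / tau_w, 1 - dt / tau_w]])
B = np.array([dt / tau_v, 0.0])

def subthreshold_step(z, i_in):
    """z = (v, w); returns A @ z + B * i_in, the SSM-style recurrence."""
    return A @ z + B * i_in

z = np.zeros(2)
for i_in in [1.0, 1.0, 0.0, 0.0]:   # a short input-current trace
    z = subthreshold_step(z, i_in)
```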
- Deep Learning-based Approaches for State Space Models: A Selective Review [15.295157876811066]
State-space models (SSMs) offer a powerful framework for dynamical system analysis. This paper provides a selective review of recent advancements in deep neural network-based approaches for SSMs.
arXiv Detail & Related papers (2024-12-15T15:04:35Z)
- Oscillatory State-Space Models [61.923849241099184]
We propose Linear Oscillatory State-Space models (LinOSS) for efficiently learning on long sequences. A stable discretization, integrated over time using fast associative parallel scans, yields the proposed state-space model; the scan trick is sketched after this entry. We show that LinOSS is universal, i.e., it can approximate any continuous and causal operator mapping between time-varying functions.
arXiv Detail & Related papers (2024-10-04T22:00:13Z)
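The "fast associative parallel scans" mentioned above rely on the fact that a diagonal linear recurrence x_t = a_t * x_{t-1} + b_t composes associatively, so all prefix states can be computed in O(log T) parallel depth. A minimal sketch of the combine rule, with a sequential reference fold; variable names are illustrative.

```python
import numpy as np

def combine(e1, e2):
    """Associative composition of two linear-recurrence elements.

    (a, b) represents the affine map x -> a * x + b. Applying e1 then
    e2 gives (a2 * a1, a2 * b1 + b2). Associativity is what lets a
    parallel scan evaluate all prefixes in O(log T) depth.
    """
    a1, b1 = e1
    a2, b2 = e2
    return a2 * a1, a2 * b1 + b2

def linear_recurrence(a, b):
    """Sequential reference for x_t = a_t * x_{t-1} + b_t, x_{-1} = 0.

    A parallel runtime (e.g. jax.lax.associative_scan) would fold the
    same `combine` over the (a_t, b_t) pairs tree-wise instead of left
    to right, producing identical prefix states.
    """
    acc = (np.ones_like(a[0]), np.zeros_like(b[0]))
    xs = []
    for elem in zip(a, b):
        acc = combine(acc, elem)
        xs.append(acc[1])
    return np.stack(xs)

T, H = 8, 4
rng = np.random.default_rng(0)
xs = linear_recurrence(0.9 * rng.uniform(size=(T, H)), rng.normal(size=(T, H)))
```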
- Latent Space Energy-based Neural ODEs [73.01344439786524]
This paper introduces novel deep dynamical models designed to represent continuous-time sequences. We train the model using maximum likelihood estimation with Markov chain Monte Carlo (a generic Langevin sketch follows this entry). Experimental results on oscillating systems, videos and real-world state sequences (MuJoCo) demonstrate that our model with the learnable energy-based prior outperforms existing counterparts.
arXiv Detail & Related papers (2024-09-05T18:14:22Z)
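Maximum-likelihood training of a latent energy-based prior typically draws latent samples with short-run Langevin dynamics as the MCMC sampler. A generic sketch under that assumption; the paper's actual energy function and sampler settings are not reproduced here.

```python
import numpy as np

def langevin_sample(energy_grad, z0, n_steps=50, step=0.01, rng=None):
    """Short-run Langevin dynamics targeting p(z) proportional to exp(-E(z)).

    Iterates z <- z - (step/2) * dE/dz + sqrt(step) * noise, the usual
    MCMC workhorse for drawing latents from an energy-based prior during
    maximum-likelihood training. `energy_grad` is any callable returning
    dE/dz.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    z = z0.copy()
    for _ in range(n_steps):
        z = z - 0.5 * step * energy_grad(z) \
              + np.sqrt(step) * rng.standard_normal(z.shape)
    return z

# Example with a standard Gaussian prior: E(z) = ||z||^2 / 2, dE/dz = z.
z = langevin_sample(lambda z: z, z0=np.zeros(8))
```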
- Longhorn: State Space Models are Amortized Online Learners [51.10124201221601]
State-space models (SSMs) offer linear decoding efficiency while maintaining parallelism during training.
In this work, we explore SSM design through the lens of online learning, conceptualizing SSMs as meta-modules for specific online learning problems.
We introduce a novel deep SSM architecture, Longhorn, whose update resembles the closed-form solution of the online associative recall problem (a sketch of such a closed form follows this entry).
arXiv Detail & Related papers (2024-07-19T11:12:08Z)
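One standard closed form for an online associative-recall step is the regularized least-squares write below, which reduces via Sherman-Morrison to a delta-rule memory update. It conveys the flavor of update the abstract describes; Longhorn's exact formula may differ.

```python
import numpy as np

def online_recall_update(S, k, v, beta=1.0):
    """Closed-form solution of one online associative-recall step.

    Solves   min_S' ||S' k - v||^2 + (1/beta) ||S' - S||_F^2
    exactly; by Sherman-Morrison the solution is a delta-rule write
        S' = S + delta * (v - S k) k^T,  delta = beta / (1 + beta * k.k),
    i.e. the memory moves toward storing the pair (k, v) with an
    input-dependent step size.
    """
    delta = beta / (1.0 + beta * (k @ k))
    return S + delta * np.outer(v - S @ k, k)

# Writing the same pair repeatedly drives the read-out S @ k toward v.
S = np.zeros((4, 3))
k, v = np.array([1.0, 0.0, 0.0]), np.array([1.0, 2.0, 3.0, 4.0])
for _ in range(5):
    S = online_recall_update(S, k, v)
print(S @ k)   # approaches v geometrically
```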
- SMR: State Memory Replay for Long Sequence Modeling [19.755738298836526]
This paper proposes a novel non-recursive, non-uniform sample processing strategy to overcome compatibility limitations in parallel convolutional computation.
We introduce State Memory Replay (SMR), which utilizes learnable memories to adjust the current state with multi-step information for generalization at sampling points different from those in the training data.
Experiments on long-range modeling tasks in autoregressive language modeling and Long Range Arena demonstrate the general effectiveness of the SMR mechanism for a series of SSM models.
arXiv Detail & Related papers (2024-05-27T17:53:32Z)
- HOPE for a Robust Parameterization of Long-memory State Space Models [51.66430224089725]
State-space models (SSMs) that utilize linear, time-invariant (LTI) systems are known for their effectiveness in learning long sequences.
We develop a new parameterization scheme for LTI systems, called HOPE, which utilizes the Markov parameters within Hankel operators (the correspondence is sketched after this entry).
Our new parameterization endows the SSM with non-decaying memory within a fixed time window, which is empirically corroborated by a sequential CIFAR-10 task with padded noise.
arXiv Detail & Related papers (2024-05-22T20:20:14Z)
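The Markov parameters of an LTI system are its impulse-response coefficients h_k = C A^k B, and they fill the system's Hankel operator, which is why one can parameterize the SSM through them instead of through (A, B, C) directly. A sketch of that correspondence for a SISO system; dimensions and values are illustrative.

```python
import numpy as np

def markov_parameters(A, B, C, n):
    """Impulse-response coefficients h_k = C @ A^k @ B, k = 0..n-1, of
    the LTI system x_{t+1} = A x_t + B u_t, y_t = C x_t."""
    hs, Ak = [], np.eye(A.shape[0])
    for _ in range(n):
        hs.append(C @ Ak @ B)
        Ak = Ak @ A
    return hs

def hankel_from_markov(hs):
    """Hankel operator H[i, j] = h_{i+j}: fully determined by the
    Markov parameters, with no explicit (A, B, C) needed."""
    m = (len(hs) + 1) // 2
    return np.array([[hs[i + j] for j in range(m)] for i in range(m)])

# SISO example: 2-state system with a diagonal transition matrix.
A = np.diag([0.9, -0.5])
B, C = np.ones(2), np.ones(2)
hs = markov_parameters(A, B, C, 7)
print(hankel_from_markov(hs))   # 4 x 4 Hankel matrix
```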
- Machine-Learned Closure of URANS for Stably Stratified Turbulence: Connecting Physical Timescales & Data Hyperparameters of Deep Time-Series Models [0.0]
We develop time-series machine learning (ML) methods for closure modeling of the Unsteady Reynolds Averaged Navier Stokes equations.
We consider decaying SST, which is homogeneous and stably stratified by a uniform density gradient.
We find that the ratio of the timescales of the minimum information required by the ML models to accurately capture the SST dynamics corresponds to the Reynolds number of the flow.
arXiv Detail & Related papers (2024-04-24T18:58:00Z)
- Convolutional State Space Models for Long-Range Spatiotemporal Modeling [65.0993000439043]
ConvS5 is an efficient variant for long-range spatiotemporal modeling.
It significantly outperforms Transformers and ConvLSTM on a long-horizon Moving-MNIST experiment while training 3X faster than ConvLSTM and generating samples 400X faster than Transformers (a single-channel sketch of the convolutional recurrence follows this entry).
arXiv Detail & Related papers (2023-10-30T16:11:06Z)
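The core idea of a convolutional state-space recurrence is to replace the dense transition matrices with convolutions over spatial feature maps. Below is a single-channel numpy/scipy sketch of that idea, not ConvS5's actual layer.

```python
import numpy as np
from scipy.signal import convolve2d

def conv_ssm_step(x, u, A_kernel, B_kernel):
    """One step of a convolutional state-space recurrence.

    Instead of x_t = A @ x_{t-1} + B @ u_t with dense matrices, state
    and input are spatial feature maps and the transition acts by
    convolution: x_t = A * x_{t-1} + B * u_t (with * = conv). Keeping
    A_kernel small in norm keeps the recurrence stable.
    """
    return (convolve2d(x, A_kernel, mode="same")
            + convolve2d(u, B_kernel, mode="same"))

rng = np.random.default_rng(0)
x = np.zeros((8, 8))                       # 8x8 single-channel state map
A_k = 0.1 * rng.standard_normal((3, 3))    # contractive transition kernel
B_k = rng.standard_normal((3, 3))
for _ in range(16):
    x = conv_ssm_step(x, rng.standard_normal((8, 8)), A_k, B_k)
```

The convolutional transition shares parameters across spatial positions, so the state size stays fixed while information propagates spatially over time.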