Regularity and Stability Properties of Selective SSMs with Discontinuous Gating
- URL: http://arxiv.org/abs/2505.11602v1
- Date: Fri, 16 May 2025 18:08:40 GMT
- Title: Regularity and Stability Properties of Selective SSMs with Discontinuous Gating
- Authors: Nikola Zubić, Davide Scaramuzza
- Abstract summary: In this paper, we investigate the stability and regularity properties of continuous-time selective SSMs. We establish that intrinsic energy dissipation guarantees exponential forgetting of past states. Our findings offer a rigorous framework for understanding and designing stable and reliable deep selective SSMs.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Selective State-Space Models (SSMs), characterized by input-dependent, time-varying parameters, offer significant expressive power but pose challenges for stability analysis, especially with discontinuous gating signals. In this paper, we investigate the stability and regularity properties of continuous-time selective SSMs through the lens of passivity and Input-to-State Stability (ISS). We establish that intrinsic energy dissipation guarantees exponential forgetting of past states. Crucially, we prove that the unforced system dynamics possess an underlying minimal quadratic energy function whose defining matrix exhibits robust $\text{AUC}_{\text{loc}}$ regularity, accommodating discontinuous gating. Furthermore, assuming a universal quadratic storage function ensures passivity across all inputs, we derive parametric LMI conditions and kernel constraints that limit gating mechanisms, formalizing "irreversible forgetting" of recurrent models. Finally, we provide sufficient conditions for global ISS, linking uniform local dissipativity to overall system robustness. Our findings offer a rigorous framework for understanding and designing stable and reliable deep selective SSMs.
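The abstract's central claim, that intrinsic energy dissipation guarantees exponential forgetting even under discontinuous gating, can be illustrated with a minimal sketch. The model below is a hypothetical scalar toy, not the paper's construction: the gate shape, the rates `a_min`/`a_max`, and the forward-Euler discretization are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy model (not the paper's construction): a scalar
# continuous-time selective SSM  dx/dt = -a(u(t)) * x + b * u(t), whose
# input-dependent gate a(.) is a discontinuous step function but is
# uniformly bounded below by a_min > 0. Uniform dissipation of this kind
# is what the abstract links to exponential forgetting of past states.

def simulate(x0, u, dt=1e-3, a_min=0.5, a_max=2.0, b=1.0):
    """Forward-Euler rollout of the gated scalar SSM; returns the trajectory."""
    x = float(x0)
    traj = [x]
    for uk in u:
        a = a_max if uk > 0 else a_min  # discontinuous gating signal
        x = x + dt * (-a * x + b * uk)
        traj.append(x)
    return np.array(traj)

# Unforced case (u = 0): the state contracts inside the exp(-a_min * t)
# envelope, i.e. the initial condition is forgotten exponentially fast.
T, dt = 4.0, 1e-3
n = int(T / dt)
traj = simulate(x0=1.0, u=np.zeros(n), dt=dt)
envelope = np.exp(-0.5 * dt * np.arange(n + 1))  # a_min = 0.5
print(np.all(np.abs(traj) <= envelope + 1e-12))  # trajectory stays inside the bound
```

Because a(u(t)) >= a_min > 0 for every input, the unforced quadratic energy V(x) = x^2 decays at a uniform rate, so the trajectory stays inside the exp(-a_min * t) envelope even though the gate itself jumps discontinuously.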
Related papers
- The Vanishing Gradient Problem for Stiff Neural Differential Equations [3.941173292703699]
In stiff systems, it has been observed that sensitivities to parameters controlling fast-decaying modes become vanishingly small during training. We show that this vanishing gradient phenomenon is not an artifact of any particular method, but a universal feature of all A-stable and L-stable stiff numerical integration schemes.
arXiv Detail & Related papers (2025-08-02T23:44:14Z)
- Canonical Bayesian Linear System Identification [2.60567273797562]
We introduce canonical forms of LTI systems within the Bayesian framework. We rigorously establish that inference in these minimal parameterizations fully captures all invariant system dynamics. This approach unlocks the use of meaningful structure-aware priors.
arXiv Detail & Related papers (2025-07-15T17:58:55Z)
- Transformers Learn Faster with Semantic Focus [57.97235825738412]
We study sparse transformers in terms of learnability and generalization. We find that input-dependent sparse attention models appear to converge faster and generalize better than standard attention models.
arXiv Detail & Related papers (2025-06-17T01:19:28Z)
- Learning to Dissipate Energy in Oscillatory State-Space Models [55.09730499143998]
State-space models (SSMs) are a class of networks for sequence learning. We show that D-LinOSS consistently outperforms previous LinOSS methods on long-range learning tasks.
arXiv Detail & Related papers (2025-05-17T23:15:17Z)
- Regulating Model Reliance on Non-Robust Features by Smoothing Input Marginal Density [93.32594873253534]
Trustworthy machine learning requires meticulous regulation of model reliance on non-robust features.
We propose a framework to delineate and regulate such features by attributing model predictions to the input.
arXiv Detail & Related papers (2024-07-05T09:16:56Z)
- Defining stable phases of open quantum systems [0.0]
We show that uniformity is satisfied in a canonical classical cellular automaton.
We conjecture some sufficient conditions for a channel to exhibit uniformity and therefore stability.
arXiv Detail & Related papers (2023-08-28T17:55:31Z)
- Spectral stabilizability [0.0]
We develop conditions for stabilizability based on the target state's eigendecomposition.
We use the spectral approach to derive upper bounds on stabilizability for a number of exemplary open system scenarios.
arXiv Detail & Related papers (2022-12-23T10:38:31Z)
- Formal Controller Synthesis for Markov Jump Linear Systems with Uncertain Dynamics [64.72260320446158]
We propose a method for synthesising controllers for Markov jump linear systems.
Our method is based on a finite-state abstraction that captures both the discrete (mode-jumping) and continuous (stochastic linear) behaviour of the MJLS.
We apply our method to multiple realistic benchmark problems, in particular, a temperature control and an aerial vehicle delivery problem.
arXiv Detail & Related papers (2022-12-01T17:36:30Z)
- On the Sample Complexity and Metastability of Heavy-tailed Policy Search in Continuous Control [47.71156648737803]
Reinforcement learning is a framework for interactive decision-making in which incentives are revealed sequentially over time, without a model of the system dynamics. We characterize a defined chain, identifying that policies associated with Lévy processes of a given tail index yield wider peaks.
arXiv Detail & Related papers (2021-06-15T20:12:44Z)
- Stability and Identification of Random Asynchronous Linear Time-Invariant Systems [81.02274958043883]
We show the additional benefits of randomization and asynchrony on the stability of linear dynamical systems.
For unknown randomized LTI systems, we propose a systematic identification method to recover the underlying dynamics.
arXiv Detail & Related papers (2020-12-08T02:00:04Z)
- Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting.
To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions.
arXiv Detail & Related papers (2020-06-15T06:30:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.