Long-Context Linear System Identification
- URL: http://arxiv.org/abs/2410.05690v1
- Date: Tue, 8 Oct 2024 05:15:21 GMT
- Title: Long-Context Linear System Identification
- Authors: Oğuz Kaan Yüksel, Mathieu Even, Nicolas Flammarion,
- Abstract summary: This paper addresses the problem of long-context linear system identification, where the state $x_t$ of a dynamical system at time $t$ depends linearly on previous states $x_s$ over a fixed context window of length $p$.
We establish a sample complexity bound that matches the i.i.d. parametric rate up to logarithmic factors for a broad class of systems, extending previous works that considered only first-order dependencies.
- Score: 20.835344826113307
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the problem of long-context linear system identification, where the state $x_t$ of a dynamical system at time $t$ depends linearly on previous states $x_s$ over a fixed context window of length $p$. We establish a sample complexity bound that matches the i.i.d. parametric rate up to logarithmic factors for a broad class of systems, extending previous works that considered only first-order dependencies. Our findings reveal a learning-without-mixing phenomenon, indicating that learning long-context linear autoregressive models is not hindered by slow mixing properties potentially associated with extended context windows. Additionally, we extend these results to (i) shared low-rank representations, where rank-regularized estimators improve rates with respect to dimensionality, and (ii) misspecified context lengths in strictly stable systems, where shorter contexts offer statistical advantages.
Related papers
- Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update [60.414548453838506]
We study the generalized linear bandit (GLB) problem, a contextual multi-armed bandit framework that extends the classical linear model by incorporating a non-linear link function.<n>GLBs are widely applicable to real-world scenarios, but their non-linear nature introduces significant challenges in achieving both computational and statistical efficiency.<n>We propose a jointly efficient algorithm that attains a nearly optimal regret bound with $mathcalO(1)$ time and space complexities per round.
arXiv Detail & Related papers (2025-07-16T02:24:21Z) - A theoretical framework for self-supervised contrastive learning for continuous dependent data [86.50780641055258]
Self-supervised learning (SSL) has emerged as a powerful approach to learning representations, particularly in the field of computer vision.<n>We propose a novel theoretical framework for contrastive SSL tailored to emphsemantic independence between samples.<n>Specifically, we outperform TS2Vec on the standard UEA and UCR benchmarks, with accuracy improvements of $4.17$% and $2.08$%, respectively.
arXiv Detail & Related papers (2025-06-11T14:23:47Z) - The Sample Complexity of Online Reinforcement Learning: A Multi-model Perspective [55.15192437680943]
We study the sample complexity of online reinforcement learning for nonlinear dynamical systems with continuous state and action spaces.
Our algorithms are likely to be useful in practice, due to their simplicity, the ability to incorporate prior knowledge, and their benign transient behavior.
arXiv Detail & Related papers (2025-01-27T10:01:28Z) - An L-BFGS-B approach for linear and nonlinear system identification under $\ell_1$- and group-Lasso regularization [0.0]
We propose a very efficient numerical method for identifying linear and nonlinear discrete-time state-space models.
A Python implementation of the proposed identification method is available in the package jax-sysid.
arXiv Detail & Related papers (2024-03-06T16:17:34Z) - Stochastic Contextual Bandits with Long Horizon Rewards [31.981315673799806]
This work takes a step in this direction by investigating contextual linear bandits where the current reward depends on most $s$ prior actions.
We propose new algorithms that leverage sparsity to discover the dependence pattern and arm parameters jointly.
Our results necessitate new analysis to address long-range temporal dependencies across data and avoid dependence on the reward horizon $h$.
arXiv Detail & Related papers (2023-02-02T01:25:30Z) - Identifiability and Asymptotics in Learning Homogeneous Linear ODE Systems from Discrete Observations [114.17826109037048]
Ordinary Differential Equations (ODEs) have recently gained a lot of attention in machine learning.
theoretical aspects, e.g., identifiability and properties of statistical estimation are still obscure.
This paper derives a sufficient condition for the identifiability of homogeneous linear ODE systems from a sequence of equally-spaced error-free observations sampled from a single trajectory.
arXiv Detail & Related papers (2022-10-12T06:46:38Z) - Learning with little mixing [9.574025613149871]
We study square loss in a realizable time-series framework with martingale difference noise.
Our main result is a fast rate excess risk bound which shows that whenever a trajectory hypercontractivity condition holds, the risk of the least-squares estimator on dependent data matches the iid rate order-wise after a burn-in time.
arXiv Detail & Related papers (2022-06-16T16:06:47Z) - Online Control of Unknown Time-Varying Dynamical Systems [48.75672260851758]
We study online control of time-varying linear systems with unknown dynamics in the nonstochastic control model.
We study regret bounds with respect to common classes of policies: Disturbance Action (SLS), Disturbance Response (Youla), and linear feedback policies.
arXiv Detail & Related papers (2022-02-16T06:57:14Z) - A Priori Denoising Strategies for Sparse Identification of Nonlinear
Dynamical Systems: A Comparative Study [68.8204255655161]
We investigate and compare the performance of several local and global smoothing techniques to a priori denoise the state measurements.
We show that, in general, global methods, which use the entire measurement data set, outperform local methods, which employ a neighboring data subset around a local point.
arXiv Detail & Related papers (2022-01-29T23:31:25Z) - Consistency and Rate of Convergence of Switched Least Squares System
Identification for Autonomous Switched Linear Systems [1.1470070927586016]
We propose switched least squares method for the identification for switched linear systems.
Our data-dependent rate of convergence shows that, almost surely, the system identification error is $mathcalObig(sqrtlog(T)/T big)$ where $T$ is the time horizon.
arXiv Detail & Related papers (2021-12-20T18:56:29Z) - Consistency of mechanistic causal discovery in continuous-time using
Neural ODEs [85.7910042199734]
We consider causal discovery in continuous-time for the study of dynamical systems.
We propose a causal discovery algorithm based on penalized Neural ODEs.
arXiv Detail & Related papers (2021-05-06T08:48:02Z) - Improved rates for prediction and identification of partially observed
linear dynamical systems [4.68299658663016]
Identification of a linear time-in dynamical system from partial observations is a fundamental problem in control theory.
We propose an algorithm that learns such systems with non-asymptotic statistical rates depending on the inherentity $d$ of the system.
Our algorithm is based on multi-scale low-rank approximation SVD applied to Hankel matrices of increasing sizes.
arXiv Detail & Related papers (2020-11-19T18:04:18Z) - High-Dimensional Sparse Linear Bandits [67.9378546011416]
We derive a novel $Omega(n2/3)$ dimension-free minimax regret lower bound for sparse linear bandits in the data-poor regime.
We also prove a dimension-free $O(sqrtn)$ regret upper bound under an additional assumption on the magnitude of the signal for relevant features.
arXiv Detail & Related papers (2020-11-08T16:48:11Z) - Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior, yield superior expressivity within the family of neural ordinary differential equations.
arXiv Detail & Related papers (2020-06-08T09:53:35Z) - No-Regret Prediction in Marginally Stable Systems [37.178095559618654]
We consider the problem of online prediction in a marginally stable linear dynamical system.
By applying our techniques to learning an autoregressive filter, we also achieve logarithmic regret in the partially observed setting.
arXiv Detail & Related papers (2020-02-06T01:53:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.