Improved rates for prediction and identification of partially observed
linear dynamical systems
- URL: http://arxiv.org/abs/2011.10006v3
- Date: Mon, 7 Mar 2022 18:15:21 GMT
- Title: Improved rates for prediction and identification of partially observed
linear dynamical systems
- Authors: Holden Lee
- Abstract summary: Identification of a linear time-invariant dynamical system from partial observations is a fundamental problem in control theory.
We propose an algorithm that learns such systems with non-asymptotic statistical rates depending on the inherent dimensionality (order) $d$ of the system.
Our algorithm is based on multi-scale low-rank approximation: SVD applied to Hankel matrices of geometrically increasing sizes.
- Score: 4.68299658663016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Identification of a linear time-invariant dynamical system from partial
observations is a fundamental problem in control theory. Particularly
challenging are systems exhibiting long-term memory. A natural question is how
to learn such systems with non-asymptotic statistical rates depending on the
inherent dimensionality (order) $d$ of the system, rather than on the possibly
much larger memory length. We propose an algorithm that given a single
trajectory of length $T$ with Gaussian observation noise, learns the system
with a near-optimal rate of $\widetilde O\left(\sqrt{\frac{d}{T}}\right)$ in
$\mathcal{H}_2$ error, with only logarithmic, rather than polynomial dependence
on memory length. We also give bounds under process noise and improved bounds
for learning a realization of the system. Our algorithm is based on multi-scale
low-rank approximation: SVD applied to Hankel matrices of geometrically
increasing sizes. Our analysis relies on a careful application of concentration
bounds on the Fourier domain; we give sharper concentration bounds for the sample
covariance of correlated inputs and for $\mathcal{H}_\infty$ norm estimation,
which may be of independent interest.
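The Hankel-SVD primitive at the heart of this algorithm is short to write down. Below is a minimal NumPy sketch under illustrative assumptions (a toy order-2 system, noisy Markov-parameter estimates, and none of the paper's multi-scale combination or error analysis): at each scale, the Hankel matrix built from the estimated impulse response is replaced by its best rank-$d$ approximation via truncated SVD.

```python
import numpy as np

def hankel(g, n):
    """n x n Hankel matrix with entries H[i, j] = g[i + j + 1]."""
    return np.array([[g[i + j + 1] for j in range(n)] for i in range(n)])

def rank_d_truncation(H, d):
    """Best rank-d approximation of H (Eckart-Young) via truncated SVD."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    return U[:, :d] @ np.diag(s[:d]) @ Vt[:d, :]

def multiscale_hankel_svd(g_hat, d):
    """Rank-d truncations of Hankel matrices of geometrically growing size."""
    out = {}
    n = 2
    while 2 * n <= len(g_hat):
        out[n] = rank_d_truncation(hankel(g_hat, n), d)
        n *= 2
    return out

# Toy order-d SISO system; g[k] = C A^k B are its Markov parameters.
rng = np.random.default_rng(0)
d = 2
A = 0.9 * np.linalg.qr(rng.standard_normal((d, d)))[0]   # spectral radius 0.9
B, C = rng.standard_normal((d, 1)), rng.standard_normal((1, d))
g = np.array([(C @ np.linalg.matrix_power(A, k) @ B).item() for k in range(128)])
g_hat = g + 0.01 * rng.standard_normal(g.shape)          # noisy estimates

approx = multiscale_hankel_svd(g_hat, d)
print(sorted(approx))   # scales 2, 4, 8, 16, 32, 64
```

By Eckart-Young, the rank-$d$ SVD truncation is the optimal low-rank approximation at every scale, which is what makes it a natural primitive for a system of order $d$.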
Related papers
- From Spectral Theorem to Statistical Independence with Application to
System Identification [11.98319841778396]
We provide the first quantitative handle on the decay rate of finite powers of the state transition matrix, $\|A^k\|$.
It is shown that when a stable dynamical system has only one distinct eigenvalue and discrepancy of $n-1$, $\|A\|$ has a dependence on $n$ and the resulting dynamics are inseparable.
We show that the element-wise error is essentially a variant of the well-known Littlewood-Offord problem.
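As an illustrative NumPy check of that $n$-dependence (a toy example, not the paper's quantitative bound): for a single Jordan block the spectral radius alone does not control the powers, and the transient growth of $\|A^k\|$ depends on the block size $n$.

```python
import numpy as np

def jordan_block(lam, n):
    """n x n Jordan block: one distinct eigenvalue lam, discrepancy n - 1."""
    return lam * np.eye(n) + np.eye(n, k=1)

# Although rho(A) = 0.9 < 1 in every case, ||A^k|| grows transiently before
# decaying, and both the peak size and its location depend on n.
for n in [2, 5, 10]:
    A = jordan_block(0.9, n)
    norms = [np.linalg.norm(np.linalg.matrix_power(A, k), 2) for k in range(200)]
    print(f"n={n}: max ||A^k|| = {max(norms):.3g} at k = {int(np.argmax(norms))}")
```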
arXiv Detail & Related papers (2023-10-16T15:40:43Z) - Sparse random matrices and Gaussian ensembles with varying randomness [0.0]
We study a system of $N$ qubits with a random Hamiltonian obtained by drawing coupling constants from Gaussian distributions in various ways.
We find some evidence that the behaviour changes in an abrupt manner when the number of non-zero independent terms in the Hamiltonian is exponentially large in $N$.
arXiv Detail & Related papers (2023-05-12T14:19:27Z) - High-Dimensional Smoothed Entropy Estimation via Dimensionality
Reduction [14.53979700025531]
We consider the estimation of the differential entropy $h(X+Z)$ via $n$ independently and identically distributed samples of $X$.
Under the absolute-error loss, the above problem has a parametric estimation rate of $\frac{c^D}{\sqrt{n}}$.
We overcome this exponential sample complexity by projecting $X$ to a low-dimensional space via principal component analysis (PCA) before the entropy estimation.
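A minimal sketch of that pipeline, with illustrative dimensions and a simple Gaussian plug-in estimate standing in for the paper's smoothed-entropy estimator:

```python
import numpy as np

def pca_project(Y, d):
    """Project samples Y (n x D) onto their top-d principal components."""
    Yc = Y - Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    return Yc @ Vt[:d].T

def gaussian_entropy(Y):
    """Plug-in differential entropy 0.5 * logdet(2*pi*e*Cov), in nats."""
    cov = np.cov(Y, rowvar=False)
    return 0.5 * np.linalg.slogdet(2 * np.pi * np.e * cov)[1]

rng = np.random.default_rng(1)
n, D, d, s = 2000, 50, 3, 0.5
X = rng.standard_normal((n, d)) @ rng.standard_normal((d, D))  # near d-dim structure
Z = s * rng.standard_normal((n, D))                            # Gaussian smoothing
# Estimate the entropy in the d-dimensional projection, not in ambient R^D.
print(gaussian_entropy(pca_project(X + Z, d)))
```

The projection discards the noise component orthogonal to the retained directions, a gap the paper's analysis must account for.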
arXiv Detail & Related papers (2023-05-08T13:51:48Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Precise Learning Curves and Higher-Order Scaling Limits for Dot Product
Kernel Regression [41.48538038768993]
We focus on the problem of kernel ridge regression for dot-product kernels.
We observe a peak in the learning curve whenever $m \approx d^r/r!$ for any integer $r$, leading to multiple sample-wise descent and nontrivial behavior at multiple scales.
arXiv Detail & Related papers (2022-05-30T04:21:31Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation so can be integrated with neural networks seamlessly.
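A minimal sketch of the randomization idea, shown here for the forward (sum-product) recursion on a chain: keep only $k$ uniformly sampled states per step and correct with an $S/k$ importance weight (the paper's RDP family is more general than this).

```python
import numpy as np

def logsumexp(a, axis=None):
    """Numerically stable log(sum(exp(a))) along an axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True)), axis=axis)

def randomized_forward(log_potentials, k, rng):
    """Log-partition estimate for a chain with S latent states, visiting only
    k sampled states per step instead of all S."""
    S = log_potentials[0].shape[0]
    idx = rng.choice(S, size=k, replace=False)   # states kept at the current step
    alpha = np.zeros(k)                          # log forward messages at those states
    for M in log_potentials:                     # M[i, j]: log-potential for i -> j
        new_idx = rng.choice(S, size=k, replace=False)
        scores = alpha[:, None] + M[np.ix_(idx, new_idx)]
        alpha = np.log(S / k) + logsumexp(scores, axis=0)  # subsampled transition sum
        idx = new_idx
    return np.log(S / k) + logsumexp(alpha)

rng = np.random.default_rng(0)
S, T, k = 2000, 10, 100
chain = [0.1 * rng.standard_normal((S, S)) for _ in range(T)]
print(randomized_forward(chain, k, rng))
```

Each transition sum is replaced by an unbiased subsampled estimate of the exponentiated sum, so per-step cost drops from $O(S^2)$ to $O(k^2)$, and the computation remains a composition of differentiable array operations, which is what makes it compatible with automatic differentiation.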
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Multipole Graph Neural Operator for Parametric Partial Differential
Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data in the desired structure.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z) - Tight Nonparametric Convergence Rates for Stochastic Gradient Descent
under the Noiseless Linear Model [0.0]
We analyze the convergence of single-pass, fixed step-size gradient descent on the least-square risk under this model.
As a special case, we analyze an online algorithm for estimating a real function on the unit interval from the noiseless observation of its value at randomly sampled points.
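A minimal sketch of the analyzed setting, assuming a finite Fourier feature map on the unit interval as the model (the paper works in an infinite-dimensional RKHS; the feature map, step size, and horizon here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Noiseless linear model: y = <theta_star, phi(x)> for a fixed feature map phi.
K = 20
def phi(x):
    ks = np.arange(1, K + 1)
    feats = np.concatenate(([1.0], np.cos(2 * np.pi * ks * x), np.sin(2 * np.pi * ks * x)))
    return feats / np.sqrt(K)

theta_star = rng.standard_normal(2 * K + 1)   # ground-truth function on [0, 1]
theta = np.zeros_like(theta_star)

step = 0.5                                    # fixed step size, single pass
for t in range(20000):
    x = rng.uniform()                         # value observed at a random point
    f = phi(x)
    y = theta_star @ f                        # noiseless observation
    theta -= step * (theta @ f - y) * f       # SGD step on the least-squares risk

print(np.linalg.norm(theta - theta_star))     # parameter error after one pass
```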
arXiv Detail & Related papers (2020-06-15T08:25:50Z) - Learning nonlinear dynamical systems from a single trajectory [102.60042167341956]
We introduce algorithms for learning nonlinear dynamical systems of the form $x_{t+1} = \sigma(\Theta^\star x_t) + \varepsilon_t$.
We give an algorithm that recovers the weight matrix $\Theta^\star$ from a single trajectory with optimal sample complexity and linear running time.
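A sketch of this setting under simplifying assumptions: a known $\tanh$ link and plain streaming gradient descent on the squared loss along the trajectory, standing in for the paper's actual algorithm and assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 5, 5000
sigma = np.tanh                                    # link function (assumed known)

Theta_star = rng.standard_normal((d, d))
Theta_star *= 0.9 / np.linalg.norm(Theta_star, 2)  # keep the dynamics stable

# One trajectory of x_{t+1} = sigma(Theta_star x_t) + eps_t.
X = np.zeros((T + 1, d))
for t in range(T):
    X[t + 1] = sigma(Theta_star @ X[t]) + 0.1 * rng.standard_normal(d)

# Streaming gradient descent on 0.5 * ||sigma(Theta x_t) - x_{t+1}||^2.
Theta, lr = np.zeros((d, d)), 0.1
for t in range(T):
    pred = sigma(Theta @ X[t])
    grad = np.outer((pred - X[t + 1]) * (1 - pred ** 2), X[t])  # tanh' = 1 - tanh^2
    Theta -= lr * grad

print(np.linalg.norm(Theta - Theta_star) / np.linalg.norm(Theta_star))
```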
arXiv Detail & Related papers (2020-04-30T10:42:48Z) - On Learning Rates and Schr\"odinger Operators [105.32118775014015]
We present a general theoretical analysis of the effect of the learning rate.
We find that the learning rate tends to zero for a broad class of non-neural functions.
arXiv Detail & Related papers (2020-04-15T09:52:37Z) - Naive Exploration is Optimal for Online LQR [49.681825576239355]
We show that the optimal regret scales as $\widetilde{\Theta}(\sqrt{d_{\mathbf{u}}^2 d_{\mathbf{x}} T})$, where $T$ is the number of time steps, $d_{\mathbf{u}}$ is the dimension of the input space, and $d_{\mathbf{x}}$ is the dimension of the system state.
Our lower bounds rule out the possibility of a $\mathrm{poly}(\log T)$-regret algorithm, which had been conjectured.
arXiv Detail & Related papers (2020-01-27T03:44:54Z)