Logarithmic Regret and Polynomial Scaling in Online Multi-step-ahead Prediction
- URL: http://arxiv.org/abs/2511.12467v1
- Date: Sun, 16 Nov 2025 05:49:44 GMT
- Title: Logarithmic Regret and Polynomial Scaling in Online Multi-step-ahead Prediction
- Authors: Jiachen Qian, Yang Zheng
- Abstract summary: We study the problem of online multi-step-ahead prediction for unknown linear systems. Using conditional distribution theory, we derive an optimal parameterization of the prediction policy as a linear function of future inputs, past inputs, and past outputs. We show that the online algorithm achieves logarithmic regret with respect to the optimal Kalman filter in the multi-step setting.
- Score: 4.42171635795004
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This letter studies the problem of online multi-step-ahead prediction for unknown linear stochastic systems. Using conditional distribution theory, we derive an optimal parameterization of the prediction policy as a linear function of future inputs, past inputs, and past outputs. Based on this characterization, we propose an online least-squares algorithm to learn the policy and analyze its regret relative to the optimal model-based predictor. We show that the online algorithm achieves logarithmic regret with respect to the optimal Kalman filter in the multi-step setting. Furthermore, with new proof techniques, we establish an almost-sure regret bound that does not rely on fixed failure probabilities for sufficiently large horizons $N$. Finally, our analysis also reveals that, while the regret remains logarithmic in $N$, its constant factor grows polynomially with the prediction horizon $H$, with the polynomial order set by the largest Jordan block of eigenvalue 1 in the system matrix.
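As a rough illustration of the idea behind the abstract (not the paper's algorithm), the sketch below fits a multi-step-ahead predictor by least squares on a fully observed scalar system, where the $H$-step output is exactly a linear function of the current output and the future inputs. The system parameters `a`, `b`, the noise level, and the horizon `H` are arbitrary choices for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, H, N = 0.9, 1.0, 3, 2000  # illustrative system and horizon

# Simulate a scalar system y_{t+1} = a*y_t + b*u_t + w_t
u = rng.normal(size=N)
y = np.zeros(N)
for t in range(N - 1):
    y[t + 1] = a * y[t] + b * u[t] + 0.1 * rng.normal()

# Unrolling the dynamics gives the linear parameterization
#   y_{t+H} = a^H y_t + sum_{j=0}^{H-1} a^{H-1-j} b u_{t+j} + noise,
# so regress y_{t+H} on [y_t, u_t, ..., u_{t+H-1}] by least squares.
T = N - H
Z = np.column_stack([y[:T]] + [u[j:T + j] for j in range(H)])
theta, *_ = np.linalg.lstsq(Z, y[H:], rcond=None)

# Coefficients implied by the true dynamics, for comparison
theta_true = np.array([a**H] + [a**(H - 1 - j) * b for j in range(H)])
print(np.round(theta, 3))
print(np.round(theta_true, 3))
```

With enough data the least-squares estimate approaches the model-based coefficients; the paper's setting is harder (partially observed, with past outputs in the regressor), but the same "predictor as a linear policy" structure is what the online least-squares algorithm learns.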
Related papers
- A Polynomial-time Algorithm for Online Sparse Linear Regression with Improved Regret Bound under Weaker Conditions [75.69959433669244]
We study the problem of online sparse linear regression (OSLR), where the algorithms are restricted to accessing only $k$ out of $d$ features per instance for prediction. We introduce a new polynomial-time algorithm, which significantly improves previous regret bounds.
arXiv Detail & Related papers (2025-10-31T05:02:33Z) - Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update [70.38810219913593]
We study the generalized linear bandit (GLB) problem, a contextual multi-armed bandit framework that extends the classical linear model by incorporating a non-linear link function. GLBs are widely applicable to real-world scenarios, but their non-linear nature introduces significant challenges in achieving both computational and statistical efficiency. We propose a jointly efficient algorithm that attains a nearly optimal regret bound with $\mathcal{O}(1)$ time and space complexities per round.
arXiv Detail & Related papers (2025-07-16T02:24:21Z) - Follow The Approximate Sparse Leader for No-Regret Online Sparse Linear Approximation [0.46040036610482665]
We consider the problem of online sparse linear approximation, where one predicts the best sparse approximation of a sequence of measurements as a linear combination of columns of a given measurement matrix. We propose Follow-The-Approximate-Sparse-Leader, an efficient online meta-policy to address this online problem.
arXiv Detail & Related papers (2025-01-01T10:50:35Z) - Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are required, via plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Online Contextual Decision-Making with a Smart Predict-then-Optimize Method [4.061135251278187]
We study an online contextual decision-making problem with resource constraints.
We propose an algorithm that mixes a prediction step based on the "Smart Predict-then-Optimize (SPO)" method with a dual update step based on mirror descent.
We prove regret bounds and demonstrate that the overall convergence rate of our method depends on the $\mathcal{O}(T^{-1/2})$ convergence of online mirror descent.
arXiv Detail & Related papers (2022-06-15T06:16:13Z) - Online estimation and control with optimal pathlength regret [52.28457815067461]
A natural goal when designing online learning algorithms is to bound the regret of the algorithm in terms of the temporal variation of the input sequence.
Data-dependent "pathlength" regret bounds have recently been obtained for a wide variety of online learning problems, including OCO and bandits.
arXiv Detail & Related papers (2021-10-24T22:43:15Z) - Upper Confidence Primal-Dual Reinforcement Learning for CMDP with Adversarial Loss [145.54544979467872]
We consider online learning for episodic constrained Markov decision processes (CMDPs).
We propose a new upper confidence primal-dual algorithm, which only requires the trajectories sampled from the transition model.
Our analysis incorporates a new high-probability drift analysis of Lagrange multiplier processes into the celebrated regret analysis of upper confidence reinforcement learning.
arXiv Detail & Related papers (2020-03-02T05:02:23Z) - Online Learning of the Kalman Filter with Logarithmic Regret [2.0305676256390934]
We show that it is possible to achieve a regret of the order of $\mathrm{polylog}(N)$ with high probability, where $N$ is the number of observations collected.
This is achieved using an online least-squares algorithm, which exploits the approximately linear relation between future observations and past observations.
arXiv Detail & Related papers (2020-02-12T18:31:31Z) - No-Regret Prediction in Marginally Stable Systems [37.178095559618654]
We consider the problem of online prediction in a marginally stable linear dynamical system.
By applying our techniques to learning an autoregressive filter, we also achieve logarithmic regret in the partially observed setting.
arXiv Detail & Related papers (2020-02-06T01:53:34Z) - Toward Optimal Adversarial Policies in the Multiplicative Learning System with a Malicious Expert [87.12201611818698]
We consider a learning system that combines experts' advice to predict a sequence of true outcomes.
It is assumed that one of the experts is malicious and aims to impose the maximum loss on the system.
We show that a simple greedy policy of always reporting a false prediction is optimal with an approximation ratio of $1+O(\sqrt{\frac{\ln N}{N}})$.
For the online setting where the malicious expert can adaptively make its decisions, we show that the optimal online policy can be efficiently computed by solving a dynamic program in $O(N^3)$ time.
arXiv Detail & Related papers (2020-01-02T18:04:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.