High-Probability Risk Bounds via Sequential Predictors
- URL: http://arxiv.org/abs/2308.07588v1
- Date: Tue, 15 Aug 2023 06:19:31 GMT
- Title: High-Probability Risk Bounds via Sequential Predictors
- Authors: Dirk van der Hoeven, Nikita Zhivotovskiy, Nicolò Cesa-Bianchi
- Abstract summary: We show that online to batch conversions applied to general online learning algorithms can bypass the limitation that regret bounds need not yield tight high-probability risk bounds.
We obtain nearly optimal high-probability risk bounds for several classical statistical estimation problems.
Our analysis relies on the fact that many online learning algorithms are improper, as they are not restricted to using predictors from a given reference class.
- Score: 20.741036493022442
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online learning methods yield sequential regret bounds under minimal
assumptions and provide in-expectation risk bounds for statistical learning.
However, despite the apparent advantage of online guarantees over their
statistical counterparts, recent findings indicate that in many important
cases, regret bounds may not guarantee tight high-probability risk bounds in
the statistical setting. In this work we show that online to batch conversions
applied to general online learning algorithms can bypass this limitation. Via a
general second-order correction to the loss function defining the regret, we
obtain nearly optimal high-probability risk bounds for several classical
statistical estimation problems, such as discrete distribution estimation,
linear regression, logistic regression, and conditional density estimation. Our
analysis relies on the fact that many online learning algorithms are improper,
as they are not restricted to using predictors from a given reference class. The
improper nature of our estimators enables significant improvements in the
dependencies on various problem parameters. Finally, we discuss some
computational advantages of our sequential algorithms over their existing batch
counterparts.
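To make the mechanism in the abstract concrete, the sketch below restates the standard online-to-batch conversion and one plausible (assumed) form of a second-order loss correction; the symbols \(\hat{f}\), \(\eta\), and the exact shape of the correction are illustrative choices, not the paper's construction.

```latex
% Standard online-to-batch conversion (generic background, not specific to
% this paper). An online learner outputs predictors f_1, ..., f_n, where f_t
% depends only on the first t-1 examples; for losses convex in the
% prediction, the averaged predictor \hat{f} = (1/n) \sum_t f_t satisfies an
% in-expectation excess-risk bound driven by the regret:
\[
  \mathbb{E}\big[R(\hat{f})\big] - \inf_{g \in \mathcal{F}} R(g)
  \;\le\; \frac{\mathbb{E}[\mathrm{Regret}_n]}{n},
  \qquad
  \mathrm{Regret}_n \;=\; \sum_{t=1}^{n} \ell\big(f_t(x_t), y_t\big)
  \;-\; \inf_{g \in \mathcal{F}} \sum_{t=1}^{n} \ell\big(g(x_t), y_t\big).
\]
% A second-order correction (assumed form, for illustration only) replaces
% each round's loss by a variance-penalized surrogate,
\[
  \tilde{\ell}_t(f) \;=\; \ell\big(f(x_t), y_t\big)
  \;+\; \eta \Big(\ell\big(f(x_t), y_t\big) - \ell\big(f_t(x_t), y_t\big)\Big)^{2},
\]
% so that a regret bound for \tilde{\ell}_t also controls the variance terms
% that distinguish high-probability risk bounds from in-expectation ones.
```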
Related papers
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation [62.69448336714418]
Temporal Difference (TD) learning, arguably the most widely used algorithm for policy evaluation, serves as a natural framework for estimating the value function of a policy from streaming data.
In this paper, we study the consistency properties of TD learning with Polyak-Ruppert averaging and linear function approximation, and obtain three significant improvements over existing results.
arXiv Detail & Related papers (2024-10-21T15:34:44Z) - Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Distribution-free risk assessment of regression-based machine learning algorithms [6.507711025292814]
We focus on regression algorithms and the risk-assessment task of computing the probability of the true label lying inside an interval defined around the model's prediction.
We solve the risk-assessment problem using the conformal prediction approach, which provides prediction intervals that are guaranteed to contain the true label with a given probability (a minimal split-conformal sketch is given after this list).
arXiv Detail & Related papers (2023-10-05T13:57:24Z) - Online Constraint Tightening in Stochastic Model Predictive Control: A Regression Approach [49.056933332667114]
No analytical solutions exist for chance-constrained optimal control problems.
We propose a data-driven approach for learning the constraint-tightening parameters online during control.
Our approach yields constraint-tightening parameters that tightly satisfy the chance constraints.
arXiv Detail & Related papers (2023-10-04T16:22:02Z) - Minimum-Delay Adaptation in Non-Stationary Reinforcement Learning via Online High-Confidence Change-Point Detection [7.685002911021767]
We introduce an algorithm that efficiently learns policies in non-stationary environments.
It analyzes a possibly infinite stream of data and computes, in real-time, high-confidence change-point detection statistics.
We show that this algorithm minimizes the delay until unforeseen changes to a context are detected, thereby allowing for rapid responses.
arXiv Detail & Related papers (2021-05-20T01:57:52Z) - Practical and Rigorous Uncertainty Bounds for Gaussian Process Regression [10.33782982051778]
We introduce new uncertainty bounds that are rigorous, yet practically useful at the same time.
In particular, the bounds can be explicitly evaluated and are much less conservative than state-of-the-art results.
We demonstrate these advantages and the usefulness of our results for learning-based control with numerical examples.
arXiv Detail & Related papers (2021-05-06T16:41:04Z) - Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression [91.3373131262391]
Uncertainty is the only certainty there is.
Traditionally, the direct regression formulation is considered and the uncertainty is modeled by modifying the output space to a certain family of probabilistic distributions.
How to model the uncertainty within the present-day technologies for regression remains an open issue.
arXiv Detail & Related papers (2021-03-25T06:56:09Z) - Nonconvex sparse regularization for deep neural networks and its optimality [1.9798034349981162]
Deep neural network (DNN) estimators can attain optimal convergence rates for regression and classification problems.
We propose a novel penalized estimation method for sparse DNNs.
We prove that the sparse-penalized estimator can adaptively attain minimax convergence rates for various nonparametric regression problems.
arXiv Detail & Related papers (2020-03-26T07:15:28Z) - Orthogonal Statistical Learning [49.55515683387805]
We provide non-asymptotic excess risk guarantees for statistical learning in a setting where the population risk depends on an unknown nuisance parameter.
We show that if the population risk satisfies a condition called Neyman orthogonality, the impact of the nuisance estimation error on the excess risk bound achieved by the meta-algorithm is of second order.
arXiv Detail & Related papers (2019-01-25T02:21:24Z)
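For the distribution-free risk assessment entry above, here is a minimal split-conformal sketch for regression intervals; the sklearn-style `model.predict` interface, the function name, and the symmetric absolute-residual score are generic assumptions rather than the cited paper's exact procedure.

```python
import numpy as np

def split_conformal_interval(model, X_cal, y_cal, x_new, alpha=0.1):
    """Generic split conformal prediction interval for regression.

    Assumes `model` was fit on a separate training split and exposes a
    sklearn-style predict(); returns an interval that contains the true
    label with probability at least 1 - alpha (marginally, under
    exchangeability of calibration and test points).
    """
    # Nonconformity scores: absolute residuals on the calibration split.
    residuals = np.abs(y_cal - model.predict(X_cal))
    n = len(residuals)
    # Finite-sample-corrected quantile level, capped at 1.
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q = np.quantile(residuals, level, method="higher")
    # Symmetric interval around the point prediction for the new input.
    pred = float(model.predict(np.atleast_2d(x_new))[0])
    return pred - q, pred + q
```

The coverage guarantee of this construction is exactly the risk-assessment quantity described in that entry: the probability that the true label lies inside an interval built around the model's prediction.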
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.