Likelihood Ratio Confidence Sets for Sequential Decision Making
- URL: http://arxiv.org/abs/2311.04402v1
- Date: Wed, 8 Nov 2023 00:10:21 GMT
- Title: Likelihood Ratio Confidence Sets for Sequential Decision Making
- Authors: Nicolas Emmenegger, Mojmír Mutný, Andreas Krause
- Abstract summary: We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
- Score: 51.66638486226482
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Certifiable, adaptive uncertainty estimates for unknown quantities are an
essential ingredient of sequential decision-making algorithms. Standard
approaches rely on problem-dependent concentration results and are limited to a
specific combination of parameterization, noise family, and estimator. In this
paper, we revisit the likelihood-based inference principle and propose to use
likelihood ratios to construct any-time valid confidence sequences without
requiring specialized treatment in each application scenario. Our method is
especially suitable for problems with well-specified likelihoods, and the
resulting sets always maintain the prescribed coverage in a model-agnostic
manner. The size of the sets depends on a choice of estimator sequence in the
likelihood ratio. We discuss how to provably choose the best sequence of
estimators and shed light on connections to online convex optimization with
algorithms such as Follow-the-Regularized-Leader. To counteract the initially
large bias of the estimators, we propose a reweighting scheme that also opens
up deployment in non-parametric settings such as RKHS function classes. We
provide a non-asymptotic analysis of the likelihood ratio confidence sets size
for generalized linear models, using insights from convex duality and online
learning. We showcase the practical strength of our method on generalized
linear bandit problems, survival analysis, and bandits with various additive
noise distributions.
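To make the construction concrete, the following is a minimal Python sketch of a likelihood-ratio confidence sequence for a Bernoulli parameter, using a regularized running MLE as a simple stand-in for the FTRL-style estimator sequence mentioned above. The paper's reweighting scheme, non-parametric extensions, and GLM analysis are not reproduced; all constants and names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.05                         # miscoverage level
theta_star = 0.3                     # true Bernoulli parameter (unknown to the learner)
grid = np.linspace(0.01, 0.99, 99)   # candidate parameters

log_ratio = np.zeros_like(grid)      # running log likelihood ratio against each candidate
successes, n = 0, 0

def log_lik(theta, y):
    """Bernoulli log-likelihood."""
    return y * np.log(theta) + (1 - y) * np.log(1 - theta)

for t in range(1000):
    # Predictable estimator: regularized running MLE computed before seeing y_t
    # (a simple stand-in for the FTRL-style estimator sequence in the paper).
    theta_hat = (successes + 1.0) / (n + 2.0)

    y = rng.binomial(1, theta_star)

    # Accumulate the log of  prod_s p_{theta_hat_s}(y_s) / p_theta(y_s).
    log_ratio += log_lik(theta_hat, y) - log_lik(grid, y)

    successes += y
    n += 1

    # Anytime-valid confidence set (valid at every round simultaneously):
    # keep candidates whose likelihood ratio stays below 1/alpha (Ville's inequality).
    conf_set = grid[log_ratio <= np.log(1.0 / alpha)]

print(f"after {n} rounds: confidence set spans [{conf_set.min():.2f}, {conf_set.max():.2f}]")
```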
Related papers
- From Conformal Predictions to Confidence Regions [1.4272411349249627]
We introduce CCR, which combines conformal prediction intervals for the model outputs to establish confidence regions for the model parameters.
We present coverage guarantees that hold under minimal assumptions on the noise and are valid in the finite-sample regime.
Our approach is applicable to both split conformal predictions and black-box methodologies including full or cross-conformal approaches.
arXiv Detail & Related papers (2024-05-28T21:33:12Z)
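As a toy illustration of the idea in the entry above (hypothetical data and interval widths, not the paper's CCR procedure): per-point prediction intervals for a linear model translate into linear constraints on the parameter vector, and intersecting them yields a candidate confidence region.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 50, 2
X = rng.normal(size=(n, d))
theta_true = np.array([1.0, -2.0])
y = X @ theta_true + rng.normal(scale=0.5, size=n)

# Suppose an interval procedure returned [lo_i, hi_i] for each mean x_i^T theta.
# Here the intervals are faked with a fixed half-width, purely for illustration.
half_width = 1.5
lo, hi = y - half_width, y + half_width

def in_region(theta):
    """theta is in the region iff every constraint lo_i <= x_i^T theta <= hi_i holds."""
    preds = X @ theta
    return bool(np.all((preds >= lo) & (preds <= hi)))

print(in_region(theta_true))            # typically True
print(in_region(np.array([5.0, 5.0])))  # typically False
```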
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z)
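A generic control-variate sketch loosely related to the entry above (not necessarily the estimator characterized in the paper): because importance weights average to one under the logging policy, subtracting a baseline times (w - 1) keeps the IPS estimate unbiased, and the empirical variance-minimizing baseline has a closed form.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Synthetic logged data: propensities of the taken action under both policies, and rewards.
p_log = np.full(n, 0.5)
p_tgt = rng.uniform(0.2, 0.8, size=n)
r = rng.binomial(1, 0.4, size=n).astype(float)

w = p_tgt / p_log                      # importance weights, E[w] = 1 under the logging policy

def corrected_estimate(beta):
    """Baseline-corrected IPS; unbiased for any fixed beta since E[w - 1] = 0."""
    return np.mean(w * r - beta * (w - 1.0))

# Closed-form variance-minimizing baseline: beta* = Cov(w*r, w) / Var(w).
beta_star = np.cov(w * r, w, ddof=0)[0, 1] / np.var(w)

print("vanilla IPS:   ", corrected_estimate(0.0))
print("with baseline: ", corrected_estimate(beta_star), " beta* =", beta_star)
```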
- Finite Sample Confidence Regions for Linear Regression Parameters Using Arbitrary Predictors [1.6860963320038902]
We explore a novel methodology for constructing confidence regions for parameters of linear models, using predictions from any arbitrary predictor.
The derived confidence regions can be cast as constraints within a Mixed Linear Programming framework, enabling optimisation of linear objectives.
Unlike previous methods, the confidence region can be empty, which can be used for hypothesis testing.
arXiv Detail & Related papers (2024-01-27T00:15:48Z)
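A simplified sketch of the constraint-based use of such regions (an ordinary LP with a toy polyhedral region; the paper's mixed programming formulation and its actual region construction are not reproduced): optimize a linear objective over a parameter region expressed as constraints.

```python
import numpy as np
from scipy.optimize import linprog

# A toy polyhedral confidence region { theta : A @ theta <= b } in 2D.
A = np.array([[ 1.0,  0.0],
              [-1.0,  0.0],
              [ 0.0,  1.0],
              [ 0.0, -1.0],
              [ 1.0,  1.0]])
b = np.array([2.0, 1.0, 3.0, 0.5, 4.0])

# Largest value of c^T theta over the region (linprog minimizes, so negate c).
c = np.array([1.0, 0.5])
res = linprog(-c, A_ub=A, b_ub=b, bounds=[(None, None), (None, None)])

print("maximizer:", res.x, "max objective:", c @ res.x)
```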
- Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
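A generic tabular sketch of an uncertainty Bellman recursion (not the exact equation proposed in the paper): local uncertainty is propagated through the discounted transition dynamics by fixed-point iteration.

```python
import numpy as np

rng = np.random.default_rng(3)
n_states, gamma = 5, 0.9

# Toy transition matrix P (rows sum to 1) and a local per-state uncertainty term.
P = rng.dirichlet(np.ones(n_states), size=n_states)
u_loc = rng.uniform(0.0, 1.0, size=n_states)

# Fixed-point iteration on U = u_loc + gamma^2 * P @ U.
U = np.zeros(n_states)
for _ in range(1000):
    U_next = u_loc + gamma**2 * P @ U
    if np.max(np.abs(U_next - U)) < 1e-10:
        break
    U = U_next

print("propagated value uncertainty per state:", np.round(U, 3))
```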
- Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error, we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z)
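One common way to make a coverage term differentiable, sketched below with hypothetical data (the paper's exact relaxation and pipeline are not reproduced), is to replace the hard indicator "y falls inside the predicted interval" with a sigmoid relaxation and penalize the gap to the nominal level; the resulting penalty can be added to a training objective and backpropagated.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_coverage(y, lo, hi, temperature=10.0):
    """Smooth relaxation of the indicator lo <= y <= hi (sharper as temperature grows)."""
    return sigmoid(temperature * (y - lo)) * sigmoid(temperature * (hi - y))

def calibration_penalty(y, lo, hi, target=0.9):
    """Squared gap between relaxed empirical coverage and the nominal level."""
    return (np.mean(soft_coverage(y, lo, hi)) - target) ** 2

rng = np.random.default_rng(4)
y = rng.normal(size=200)
lo = y - 1.0 + rng.normal(scale=0.3, size=200)
hi = y + 1.0 + rng.normal(scale=0.3, size=200)

# In an actual pipeline: total_loss = data_fit_loss + lam * calibration_penalty(...).
print("relaxed coverage:", np.mean(soft_coverage(y, lo, hi)))
print("calibration penalty:", calibration_penalty(y, lo, hi))
```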
- Margin theory for the scenario-based approach to robust optimization in high dimension [0.0]
This paper deals with the scenario approach to robust optimization.
This relies on a random sampling of the possibly infinite number of constraints induced by uncertainties in a problem.
arXiv Detail & Related papers (2023-03-07T13:33:46Z)
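A minimal scenario-program sketch (toy uncertainty model, not the paper's setting): sample finitely many realizations of the uncertain constraint and enforce all of them in an otherwise standard linear program.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(5)
n_scenarios = 200

# Uncertain constraint a(delta)^T x <= 1 with a(delta) = [1 + delta_1, 1 + delta_2].
deltas = rng.uniform(-0.3, 0.3, size=(n_scenarios, 2))
A_ub = 1.0 + deltas                 # one sampled constraint row per scenario
b_ub = np.ones(n_scenarios)

# Scenario program: maximize x_1 + x_2 subject to every sampled constraint (and x >= 0).
res = linprog(c=[-1.0, -1.0], A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])

print("scenario solution:", res.x, "objective:", -res.fun)
```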
- Integrated Conditional Estimation-Optimization [6.037383467521294]
Many real-world optimization problems involve uncertain parameters with probability distributions that can be estimated using contextual feature information.
In contrast to the standard approach of estimating the distribution of uncertain parameters, we propose an integrated conditional estimation approach.
We show that our ICEO approach is theoretically consistent under moderate conditions.
arXiv Detail & Related papers (2021-10-24T04:49:35Z)
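A toy decision-focused sketch of the contrast described in the entry above (hypothetical newsvendor costs, not the ICEO algorithm itself): instead of fitting a demand model by least squares and then optimizing, the model parameters are chosen to directly minimize the realized decision cost.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
n = 300
z = rng.uniform(0.0, 2.0, size=n)                        # contextual feature
demand = 5.0 + 3.0 * z + rng.normal(scale=1.0, size=n)   # uncertain parameter

def decision_cost(order, d, c_over=1.0, c_under=4.0):
    """Newsvendor-style cost of ordering `order` when demand `d` realizes."""
    return c_over * np.maximum(order - d, 0.0) + c_under * np.maximum(d - order, 0.0)

def avg_cost(w):
    """Average downstream cost when the order equals the linear prediction w0 + w1*z."""
    return np.mean(decision_cost(w[0] + w[1] * z, demand))

# Standard two-stage approach: least-squares fit, then order the point prediction.
w_ls = np.polyfit(z, demand, 1)[::-1]

# Integrated approach: fit the same linear model by minimizing the decision cost itself.
w_int = minimize(avg_cost, x0=w_ls, method="Nelder-Mead").x

print("two-stage cost :", avg_cost(w_ls))
print("integrated cost:", avg_cost(w_int))
```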
- High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
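The clipping operation at the heart of such methods is simple (the paper's specific stepsize rules are not reproduced here); below is a sketch of clipped SGD on a toy nonsmooth objective with heavy-tailed gradient noise.

```python
import numpy as np

def clip(g, lam):
    """Rescale gradient g so its norm never exceeds lam."""
    norm = np.linalg.norm(g)
    return g * min(1.0, lam / norm) if norm > 0 else g

rng = np.random.default_rng(7)
x = np.zeros(2)
x_opt = np.array([1.0, -1.0])

for t in range(1, 2001):
    # Stochastic subgradient of f(x) = ||x - x_opt||_1 with heavy-tailed (Student-t) noise.
    g = np.sign(x - x_opt) + rng.standard_t(df=2, size=2)
    x = x - (0.5 / np.sqrt(t)) * clip(g, lam=1.0)

print("final iterate:", np.round(x, 3))
```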
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
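A naive, non-amortized CNML sketch for binary logistic regression (ACNML's amortization of the inner optimization is not reproduced): each candidate label of the test point is appended to the training data, the model is refit, and the resulting predictive probabilities are normalized.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
n = 200
X = rng.normal(size=(n, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)
x_test = np.array([[3.0, 0.0]])   # a point far from the training data

def cnml_probs(X, y, x_test, labels=(0, 1)):
    """Naive CNML: refit with each candidate label appended, then normalize."""
    scores = []
    for lab in labels:
        X_aug = np.vstack([X, x_test])
        y_aug = np.append(y, lab)
        model = LogisticRegression(C=1.0, max_iter=1000).fit(X_aug, y_aug)
        scores.append(model.predict_proba(x_test)[0, lab])
    scores = np.array(scores)
    return scores / scores.sum()

plain = LogisticRegression(C=1.0, max_iter=1000).fit(X, y).predict_proba(x_test)[0]
print("plain MLE probs:", np.round(plain, 3))
print("CNML probs:     ", np.round(cnml_probs(X, y, x_test), 3))  # typically less overconfident
```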
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.