Risk level dependent Minimax Quantile lower bounds for Interactive Statistical Decision Making
- URL: http://arxiv.org/abs/2510.05808v1
- Date: Tue, 07 Oct 2025 11:25:13 GMT
- Title: Risk level dependent Minimax Quantile lower bounds for Interactive Statistical Decision Making
- Authors: Raghav Bongole, Amirreza Zamani, Tobias J. Oechtering, Mikael Skoglund
- Abstract summary: Minimax risk and regret focus on expectation, missing rare failures that are critical in safety-critical bandits and reinforcement learning. Three strands of prior work motivate this study: minimax-quantile bounds restricted to non-interactive estimation; unified interactive analyses that focus on expected risk rather than risk-level-specific quantile bounds; and high-probability bandit bounds that still lack a quantile-specific toolkit for general interactive protocols.
- Score: 38.10483249859454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Minimax risk and regret focus on expectation, missing rare failures that are critical in safety-critical bandits and reinforcement learning. Minimax quantiles capture these tails. Three strands of prior work motivate this study: minimax-quantile bounds restricted to non-interactive estimation; unified interactive analyses that focus on expected risk rather than risk-level-specific quantile bounds; and high-probability bandit bounds that still lack a quantile-specific toolkit for general interactive protocols. To close this gap, within the interactive statistical decision-making framework, we develop high-probability Fano and Le Cam tools and derive risk-level-explicit minimax-quantile bounds, including a quantile-to-expectation conversion and a tight link between strict and lower minimax quantiles. Instantiating these results for the two-armed Gaussian bandit immediately recovers optimal-rate bounds.
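To make the abstract's central objects concrete, here is a minimal sketch in the spirit of the paper, not a quote from it: the notation (protocol class, loss, quantile symbol) and the exact form of the definitions are assumptions. The second display is the quantile-to-expectation conversion mentioned above, read here as nothing more than Markov's inequality applied to a nonnegative loss.

```latex
% Hedged sketch: the symbols \mathcal{A} (interactive protocol), P (environment),
% and L (nonnegative loss/regret) are illustrative, not the paper's notation.
% A plausible form of the strict minimax alpha-quantile of the risk:
\[
  q_\alpha^{*} \;=\; \inf_{\mathcal{A}} \, \sup_{P \in \mathcal{P}} \,
  \inf\bigl\{ t \ge 0 : \Pr_{P,\mathcal{A}}\bigl[ L(\mathcal{A}, P) > t \bigr] \le \alpha \bigr\}.
\]
% Quantile-to-expectation conversion: for any t with Pr[L >= t] >= alpha,
% Markov's inequality (E[L] >= t * Pr[L >= t]) gives
\[
  \mathbb{E}_{P,\mathcal{A}}\bigl[ L(\mathcal{A}, P) \bigr]
  \;\ge\; t \cdot \Pr_{P,\mathcal{A}}\bigl[ L(\mathcal{A}, P) \ge t \bigr]
  \;\ge\; \alpha \, t,
\]
% so any lower bound on the alpha-quantile yields a lower bound on the
% expected minimax risk, scaled by the risk level alpha.
```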
Related papers
- MultiRisk: Multiple Risk Control via Iterative Score Thresholding [40.193623095603265]
We formalize the problem of enforcing multiple risk constraints with user-defined priorities.
We introduce two efficient dynamic programming algorithms that leverage this sequential structure.
We show that our algorithm can control each individual risk at close to the target level.
arXiv Detail & Related papers (2025-12-31T03:25:30Z)
- Information-Theoretic Minimax Regret Bounds for Reinforcement Learning based on Duality [33.907054045921306]
We study an agent acting in an unknown environment, where the goal is to find a robust policy.
We consider agents minimizing the maximum regret over different environment parameters, leading to the study of minimax regret.
This research focuses on deriving information-theoretic bounds for minimax regret in Markov Decision Processes.
arXiv Detail & Related papers (2024-10-21T13:45:02Z)
- Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability [71.82666334363174]
We develop a unifying framework for information-theoretic lower bounds in statistical estimation and interactive decision making.
We propose a new lower bound approach called the interactive Fano method.
arXiv Detail & Related papers (2024-10-07T15:14:58Z)
- Best Arm Identification with Minimal Regret [55.831935724659175]
This problem elegantly amalgamates regret minimization and best arm identification (BAI).
The agent's goal is to identify the best arm with a prescribed confidence level while keeping regret minimal.
The proposed Double KL-UCB algorithm achieves optimality as the confidence level tends to zero (a generic KL-UCB-style index sketch, for reference only, appears after this list).
arXiv Detail & Related papers (2024-09-27T16:46:02Z)
- High-probability minimax lower bounds [2.5993680263955947]
We introduce the notion of a minimax quantile, and seek to articulate its dependence on the quantile level.
We develop high-probability variants of the classical Le Cam and Fano methods, as well as a technique to convert local minimax risk lower bounds to lower bounds on minimax quantiles.
arXiv Detail & Related papers (2024-06-19T11:15:01Z)
- Minimax Linear Regression under the Quantile Risk [31.277788690403522]
We study the problem of designing minimax procedures in linear regression under the quantile risk.
We prove a matching upper bound on the worst-case quantile risk of a variant of the recently proposed min-max regression procedure.
arXiv Detail & Related papers (2024-06-17T23:24:14Z)
- A Unifying Theory of Thompson Sampling for Continuous Risk-Averse Bandits [91.3755431537592]
This paper unifies the analysis of risk-averse Thompson sampling algorithms for the multi-armed bandit problem.
Using the contraction principle in the theory of large deviations, we prove novel concentration bounds for continuous risk functionals.
We show that a wide class of risk functionals as well as "nice" functions of them satisfy the continuity condition.
arXiv Detail & Related papers (2021-08-25T17:09:01Z)
- Minimax Off-Policy Evaluation for Multi-Armed Bandits [58.7013651350436]
We study the problem of off-policy evaluation in the multi-armed bandit model with bounded rewards.
We develop minimax rate-optimal procedures under three settings.
arXiv Detail & Related papers (2021-01-19T18:55:29Z)
- On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization [55.937424268654645]
We consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm.
We provide a novel proof technique for deriving lower bounds on the regret, with benefits including simplicity, versatility, and an improved dependence on the error probability.
arXiv Detail & Related papers (2020-08-20T03:48:14Z)
- Constrained regret minimization for multi-criterion multi-armed bandits [5.349852254138086]
We study the problem of regret minimization over a given time horizon, subject to a risk constraint.
We propose a Risk-Constrained Lower Confidence Bound algorithm that guarantees logarithmic regret.
We prove lower bounds on the performance of any risk-constrained regret minimization algorithm.
arXiv Detail & Related papers (2020-06-17T04:23:18Z)
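Several entries above lean on confidence-bound indices (e.g., the Double KL-UCB algorithm in "Best Arm Identification with Minimal Regret", referenced earlier in this list). As background only, the following is a minimal, self-contained sketch of a standard KL-UCB-style index for Bernoulli rewards, computed by bisection. It is not the algorithm of any paper listed here; the exploration level log(t) is a common textbook choice, and the function names are illustrative.

```python
import math


def kl_bernoulli(p: float, q: float) -> float:
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_ucb_index(mean: float, pulls: int, t: int, tol: float = 1e-6) -> float:
    """Largest q >= mean such that pulls * KL(mean, q) <= log(t), found by bisection.

    This is the classical KL-UCB index shape; the exploration level log(t)
    is illustrative and differs across the papers cited above.
    """
    if pulls == 0:
        return 1.0  # unexplored arms receive the maximal optimistic index
    level = math.log(max(t, 2)) / pulls
    lo, hi = mean, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if kl_bernoulli(mean, mid) <= level:
            lo = mid
        else:
            hi = mid
    return lo


# Tiny usage example: an arm pulled 50 times with empirical mean 0.3 at round 200.
print(round(kl_ucb_index(0.3, 50, 200), 4))
```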