Risk level dependent Minimax Quantile lower bounds for Interactive Statistical Decision Making
- URL: http://arxiv.org/abs/2510.05808v1
- Date: Tue, 07 Oct 2025 11:25:13 GMT
- Title: Risk level dependent Minimax Quantile lower bounds for Interactive Statistical Decision Making
- Authors: Raghav Bongole, Amirreza Zamani, Tobias J. Oechtering, Mikael Skoglund
- Abstract summary: Minimax risk and regret focus on expectation, missing rare failures that are critical in safety-critical bandits and reinforcement learning. Three strands of prior work motivate this study: minimax-quantile bounds restricted to non-interactive estimation; unified interactive analyses that focus on expected risk rather than risk-level-specific quantile bounds; and high-probability bandit bounds that still lack a quantile-specific toolkit for general interactive protocols.
- Score: 38.10483249859454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Minimax risk and regret focus on expectation, missing rare failures that are critical in safety-critical bandits and reinforcement learning. Minimax quantiles capture these tails. Three strands of prior work motivate this study: minimax-quantile bounds restricted to non-interactive estimation; unified interactive analyses that focus on expected risk rather than risk-level-specific quantile bounds; and high-probability bandit bounds that still lack a quantile-specific toolkit for general interactive protocols. To close this gap, within the interactive statistical decision-making framework, we develop high-probability Fano and Le Cam tools and derive risk-level-explicit minimax-quantile bounds, including a quantile-to-expectation conversion and a tight link between strict and lower minimax quantiles. Instantiating these results for the two-armed Gaussian bandit immediately recovers optimal-rate bounds.
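To make the abstract's central objects concrete, here is a minimal sketch in the spirit of the paper, not a quote from it: the notation (protocol class, loss, quantile symbol) and the exact form of the definitions are assumptions. The second display is the quantile-to-expectation conversion mentioned above, read here as nothing more than Markov's inequality applied to a nonnegative loss.

```latex
% Hedged sketch: the symbols \mathcal{A} (interactive protocol), P (environment),
% and L (nonnegative loss/regret) are illustrative, not the paper's notation.
% A plausible form of the strict minimax alpha-quantile of the risk:
\[
  q_\alpha^{*} \;=\; \inf_{\mathcal{A}} \, \sup_{P \in \mathcal{P}} \,
  \inf\bigl\{ t \ge 0 : \Pr_{P,\mathcal{A}}\bigl[ L(\mathcal{A}, P) > t \bigr] \le \alpha \bigr\}.
\]
% Quantile-to-expectation conversion: for any t with Pr[L >= t] >= alpha,
% Markov's inequality (E[L] >= t * Pr[L >= t]) gives
\[
  \mathbb{E}_{P,\mathcal{A}}\bigl[ L(\mathcal{A}, P) \bigr]
  \;\ge\; t \cdot \Pr_{P,\mathcal{A}}\bigl[ L(\mathcal{A}, P) \ge t \bigr]
  \;\ge\; \alpha \, t,
\]
% so any lower bound on the alpha-quantile yields a lower bound on the
% expected minimax risk, scaled by the risk level alpha.
```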
Related papers
- MultiRisk: Multiple Risk Control via Iterative Score Thresholding [40.193623095603265]
We formalize the problem of enforcing multiple risk constraints with user-defined priorities.
We introduce two efficient dynamic programming algorithms that leverage this sequential structure.
We show that our algorithm can control each individual risk at close to the target level.
arXiv Detail & Related papers (2025-12-31T03:25:30Z)
- Information-Theoretic Minimax Regret Bounds for Reinforcement Learning based on Duality [33.907054045921306]
We study an agent acting in an unknown environment, where the goal is to find a robust policy.
We consider agents minimizing the maximum regret over different environment parameters, leading to the study of minimax regret.
This research focuses on deriving information-theoretic bounds for minimax regret in Markov Decision Processes.
arXiv Detail & Related papers (2024-10-21T13:45:02Z)
- Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability [71.82666334363174]
We develop a unifying framework for information-theoretic lower bounds in statistical estimation and interactive decision making.
We propose a new lower bound approach called the interactive Fano method.
arXiv Detail & Related papers (2024-10-07T15:14:58Z)
- Best Arm Identification with Minimal Regret [55.831935724659175]
This problem elegantly amalgamates regret minimization and best arm identification (BAI).
The agent's goal is to identify the best arm with a prescribed confidence level while keeping regret minimal.
The proposed Double KL-UCB algorithm achieves optimality as the confidence level tends to zero (a generic KL-UCB-style index sketch, for reference only, appears after this list).
arXiv Detail & Related papers (2024-09-27T16:46:02Z)
- High-probability minimax lower bounds [2.5993680263955947]
We introduce the notion of a minimax quantile, and seek to articulate its dependence on the quantile level.
We develop high-probability variants of the classical Le Cam and Fano methods, as well as a technique to convert local minimax risk lower bounds to lower bounds on minimax quantiles.
arXiv Detail & Related papers (2024-06-19T11:15:01Z)
- Minimax Linear Regression under the Quantile Risk [31.277788690403522]
We study the problem of designing minimax procedures in linear regression under the quantile risk.
We prove a matching upper bound on the worst-case quantile risk of a variant of the recently proposed min-max regression procedure.
arXiv Detail & Related papers (2024-06-17T23:24:14Z)
- A Unifying Theory of Thompson Sampling for Continuous Risk-Averse Bandits [91.3755431537592]
This paper unifies the analysis of risk-averse Thompson sampling algorithms for the multi-armed bandit problem.
Using the contraction principle in the theory of large deviations, we prove novel concentration bounds for continuous risk functionals.
We show that a wide class of risk functionals as well as "nice" functions of them satisfy the continuity condition.
arXiv Detail & Related papers (2021-08-25T17:09:01Z)
- Minimax Off-Policy Evaluation for Multi-Armed Bandits [58.7013651350436]
We study the problem of off-policy evaluation in the multi-armed bandit model with bounded rewards.
We develop minimax rate-optimal procedures under three settings.
arXiv Detail & Related papers (2021-01-19T18:55:29Z)
- On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization [55.937424268654645]
We consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm.
We provide a novel proof technique for deriving lower bounds on the regret, with benefits including simplicity, versatility, and an improved dependence on the error probability.
arXiv Detail & Related papers (2020-08-20T03:48:14Z)
- Constrained regret minimization for multi-criterion multi-armed bandits [5.349852254138086]
We study the problem of regret minimization over a given time horizon, subject to a risk constraint.
We propose a Risk-Constrained Lower Confidence Bound algorithm that guarantees logarithmic regret.
We prove lower bounds on the performance of any risk-constrained regret minimization algorithm.
arXiv Detail & Related papers (2020-06-17T04:23:18Z)
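Several entries above lean on confidence-bound indices (e.g., the Double KL-UCB algorithm in "Best Arm Identification with Minimal Regret", referenced earlier in this list). As background only, the following is a minimal, self-contained sketch of a standard KL-UCB-style index for Bernoulli rewards, computed by bisection. It is not the algorithm of any paper listed here; the exploration level log(t) is a common textbook choice, and the function names are illustrative.

```python
import math


def kl_bernoulli(p: float, q: float) -> float:
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_ucb_index(mean: float, pulls: int, t: int, tol: float = 1e-6) -> float:
    """Largest q >= mean such that pulls * KL(mean, q) <= log(t), found by bisection.

    This is the classical KL-UCB index shape; the exploration level log(t)
    is illustrative and differs across the papers cited above.
    """
    if pulls == 0:
        return 1.0  # unexplored arms receive the maximal optimistic index
    level = math.log(max(t, 2)) / pulls
    lo, hi = mean, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if kl_bernoulli(mean, mid) <= level:
            lo = mid
        else:
            hi = mid
    return lo


# Tiny usage example: an arm pulled 50 times with empirical mean 0.3 at round 200.
print(round(kl_ucb_index(0.3, 50, 200), 4))
```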