Effective Dimension in Bandit Problems under Censorship
- URL: http://arxiv.org/abs/2302.06916v1
- Date: Tue, 14 Feb 2023 09:03:35 GMT
- Title: Effective Dimension in Bandit Problems under Censorship
- Authors: Gauthier Guinet, Saurabh Amin, Patrick Jaillet
- Abstract summary: We study both multi-armed and contextual bandit problems in censored environments.
Our goal is to estimate the performance loss due to censorship in the context of classical algorithms designed for uncensored environments.
- Score: 22.269565708490468
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study both multi-armed and contextual bandit problems in
censored environments. Our goal is to estimate the performance loss due to
censorship in the context of classical algorithms designed for uncensored
environments. Our main contributions include the introduction of a broad class
of censorship models and their analysis in terms of the effective dimension of
the problem -- a natural measure of its underlying statistical complexity and
main driver of the regret bound. In particular, the effective dimension allows
us to maintain the structure of the original problem at first order, while
embedding it in a bigger space, and thus naturally leads to results analogous
to uncensored settings. Our analysis involves a continuous generalization of
the Elliptical Potential Inequality, which we believe is of independent
interest. We also discover an interesting property of decision-making under
censorship: a transient phase during which initial misspecification of
censorship is self-corrected at an extra cost, followed by a stationary phase
that reflects the inherent slowdown of learning governed by the effective
dimension. Our results are useful for applications of sequential
decision-making models where the feedback received depends on strategic
uncertainty (e.g., agents' willingness to follow a recommendation) and/or
random uncertainty (e.g., loss or delay in arrival of information).
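For context, a standard discrete form of the Elliptical Potential Lemma from the linear bandit literature is sketched below (a reference statement in conventional notation, not necessarily the paper's; the paper proves a continuous generalization of this type of bound). With \(x_1, \dots, x_T \in \mathbb{R}^d\), \(\|x_t\|_2 \le L\), and \(V_t = \lambda I + \sum_{s=1}^{t} x_s x_s^\top\) for some \(\lambda > 0\),
\[
\sum_{t=1}^{T} \min\!\left(1,\ \|x_t\|_{V_{t-1}^{-1}}^{2}\right) \;\le\; 2 \log \frac{\det V_T}{\det V_0} \;\le\; 2 d \log\!\left(1 + \frac{T L^{2}}{d \lambda}\right).
\]
The ambient dimension \(d\) on the right-hand side is the quantity that, per the abstract, is replaced by an effective dimension when the censored problem is embedded in a bigger space.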
Related papers
- Criticality and Safety Margins for Reinforcement Learning [53.10194953873209]
We seek to define a criticality framework with both a quantifiable ground truth and a clear significance to users.
We introduce true criticality as the expected drop in reward when an agent deviates from its policy for n consecutive random actions.
We also introduce the concept of proxy criticality, a low-overhead metric that has a statistically monotonic relationship to true criticality.
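A plausible formalization of true criticality, in hypothetical notation of ours rather than the paper's (using a discounted return for concreteness): for a policy \(\pi\) with value function \(V^{\pi}\),
\[
\mathrm{Crit}_n(s) \;=\; V^{\pi}(s) \;-\; \mathbb{E}\!\left[\sum_{t \ge 0} \gamma^{t} r_t \,\middle|\, s_0 = s,\ a_0, \dots, a_{n-1} \sim \mathrm{Unif}(\mathcal{A}),\ a_t \sim \pi \text{ for } t \ge n\right],
\]
i.e., the expected drop in return from taking \(n\) uniformly random actions before resuming \(\pi\).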
arXiv Detail & Related papers (2024-09-26T21:00:45Z)
- Generalization Error Bounds for Learning under Censored Feedback [15.367801388932145]
Generalization error bounds from learning theory provide statistical guarantees on how well an algorithm will perform on previously unseen data.
We characterize the impacts of data non-IIDness due to censored feedback on such bounds.
We show that existing generalization error bounds fail to correctly capture the model's generalization guarantees.
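For contrast, the classical guarantee that censored feedback undermines is the standard Hoeffding bound (a textbook statement, included here only for context): for a fixed hypothesis \(h\) with loss in \([0,1]\) and \(n\) IID samples, with probability at least \(1 - \delta\),
\[
\left| R(h) - \widehat{R}_n(h) \right| \;\le\; \sqrt{\frac{\log(2/\delta)}{2n}}.
\]
Censored feedback makes the observed samples non-IID, which is precisely the assumption this guarantee relies on.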
arXiv Detail & Related papers (2024-04-14T13:17:32Z)
- dugMatting: Decomposed-Uncertainty-Guided Matting [83.71273621169404]
We propose a decomposed-uncertainty-guided matting algorithm, which explores the explicitly decomposed uncertainties to efficiently and effectively improve the results.
The proposed matting framework relieves users of the requirement to determine the interaction areas, using simple and efficient labeling.
arXiv Detail & Related papers (2023-06-02T11:19:50Z)
- Algorithmic Censoring in Dynamic Learning Systems [6.2952076725399975]
We formalize censoring, demonstrate how it can arise, and highlight difficulties in detection.
We consider safeguards against censoring: recourse and randomized exploration.
The resulting techniques allow examples from censored groups to enter the training data and correct the model.
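A minimal sketch of the randomized-exploration safeguard in a loan-approval-style setting (the function names, the epsilon parameter, and the feedback loop are our illustrative assumptions, not the paper's specification):

```python
import random

def decide(model_score, threshold=0.5, epsilon=0.05):
    """Accept/reject with randomized exploration.

    Rejected individuals never produce labels, so the model is censored on
    exactly the region it rejects. With probability epsilon we accept
    regardless of the score, letting examples from otherwise-censored
    groups enter the training data and correct the model.
    """
    if random.random() < epsilon:
        return True  # exploratory accept: outcome will be observed
    return model_score >= threshold  # standard thresholded decision

def collect_labels(applicants, score_fn, observe_outcome):
    """One feedback round: only accepted applicants yield labels."""
    labeled = []
    for x in applicants:
        if decide(score_fn(x)):
            labeled.append((x, observe_outcome(x)))
        # rejected applicants are censored: no label is collected
    return labeled
```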
arXiv Detail & Related papers (2023-05-15T21:42:22Z)
- From Contextual Data to Newsvendor Decisions: On the Actual Performance of Data-Driven Algorithms [2.9603743540540357]
We study how the relevance and quantity of past data affects the performance of a data-driven policy.
We consider a setting in which past demands observed under "close by" contexts come from "close by" distributions.
arXiv Detail & Related papers (2023-02-16T17:03:39Z)
- CertainNet: Sampling-free Uncertainty Estimation for Object Detection [65.28989536741658]
Estimating the uncertainty of a neural network plays a fundamental role in safety-critical settings.
In this work, we propose a novel sampling-free uncertainty estimation method for object detection.
We call it CertainNet, and it is the first to provide separate uncertainties for each output signal: objectness, class, location and size.
arXiv Detail & Related papers (2021-10-04T17:59:31Z)
- A fuzzy-rough uncertainty measure to discover bias encoded explicitly or implicitly in features of structured pattern classification datasets [0.0]
We study the existence of bias encoded implicitly in non-protected features as defined by the correlation between protected and unprotected attributes.
We conduct a sensitivity analysis to determine the fuzzy operators and distance function that best capture change in the boundary regions.
arXiv Detail & Related papers (2021-08-20T10:27:32Z)
- Embracing Uncertainty: Decoupling and De-bias for Robust Temporal Grounding [23.571580627202405]
Temporal grounding aims to localize temporal boundaries within untrimmed videos by language queries.
It faces the challenge of two types of inevitable human uncertainties: query uncertainty and label uncertainty.
We propose a novel DeNet (Decoupling and De-bias) to embrace human uncertainty.
arXiv Detail & Related papers (2021-03-31T07:00:56Z)
- Confidence-Budget Matching for Sequential Budgeted Learning [69.77435313099366]
We formalize decision-making problems with a querying budget.
We consider multi-armed bandits, linear bandits, and reinforcement learning problems.
We show that CBM-based algorithms perform well in the presence of adversity.
arXiv Detail & Related papers (2021-02-05T19:56:31Z)
- Robustness Threats of Differential Privacy [70.818129585404]
We experimentally demonstrate that networks trained with differential privacy can, in some settings, be even more vulnerable than their non-private counterparts.
We study how the main ingredients of differentially private neural networks training, such as gradient clipping and noise addition, affect the robustness of the model.
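For reference, a minimal NumPy sketch of the two ingredients this entry names, per-example gradient clipping and Gaussian noise addition, in the style of DP-SGD (hyperparameter values are illustrative, not taken from the paper):

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.0):
    """One DP-SGD update: clip each example's gradient, average, add noise."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))  # clipping
    mean_grad = np.mean(clipped, axis=0)
    # Gaussian noise calibrated to the clipping bound (the sensitivity)
    noise = np.random.normal(0.0, noise_multiplier * clip_norm / len(clipped),
                             size=mean_grad.shape)
    return params - lr * (mean_grad + noise)
```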
arXiv Detail & Related papers (2020-12-14T18:59:24Z)
- Inverse Active Sensing: Modeling and Understanding Timely Decision-Making [111.07204912245841]
We develop a framework for the general setting of evidence-based decision-making under endogenous, context-dependent time pressure.
We demonstrate how it enables modeling intuitive notions of surprise, suspense, and optimality in decision strategies.
arXiv Detail & Related papers (2020-06-25T02:30:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.