Related papers: Adaptive Combinatorial Allocation

Adaptive Combinatorial Allocation

URL: http://arxiv.org/abs/2011.02330v1
Date: Wed, 4 Nov 2020 15:02:59 GMT
Title: Adaptive Combinatorial Allocation
Authors: Maximilian Kasy and Alexander Teytelboym
Abstract summary: We consider settings where an allocation has to be chosen repeatedly, returns are unknown but can be learned, and decisions are subject to constraints. Our model covers two-sided and one-sided matching, even with complex constraints.
Score: 77.86290991564829
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We consider settings where an allocation has to be chosen repeatedly, returns are unknown but can be learned, and decisions are subject to constraints. Our model covers two-sided and one-sided matching, even with complex constraints. We propose an approach based on Thompson sampling. Our main result is a prior-independent finite-sample bound on the expected regret for this algorithm. Although the number of allocations grows exponentially in the number of participants, the bound does not depend on this number. We illustrate the performance of our algorithm using data on refugee resettlement in the United States.

Related papers

Quantile Multi-Armed Bandits with 1-bit Feedback [40.22079522118723]
We study a variant of best-arm identification involving elements of risk sensitivity and communication constraints. We propose an algorithm that utilizes noisy binary search as a subroutine, allowing the learner to estimate quantile rewards through 1-bit feedback.
arXiv Detail & Related papers (2025-02-10T17:03:33Z)
Rising Rested Bandits: Lower Bounds and Efficient Algorithms [15.390680055166769]
This paper is in the field of sequential Multi-Armed Bandits (MABs)<n>We study a particular case of the rested bandits in which the arms' expected reward is monotonically non-decreasing and concave.<n>We empirically compare our algorithms with state-of-the-art methods for non-stationary MABs over several synthetically generated tasks and an online model selection problem for a real-world dataset.
arXiv Detail & Related papers (2024-11-06T22:00:46Z)
Sample size planning for conditional counterfactual mean estimation with a K-armed randomized experiment [0.0]
We show how to determine a sufficiently large sample size for a $K$-armed randomized experiment. Using policy trees to learn sub-groups, we evaluate our nominal guarantees on a large publicly-available randomized experiment test data set.
arXiv Detail & Related papers (2024-03-06T20:37:29Z)
Bayesian decision-making under misspecified priors with applications to meta-learning [64.38020203019013]
Thompson sampling and other sequential decision-making algorithms are popular approaches to tackle explore/exploit trade-offs in contextual bandits. We show that performance degrades gracefully with misspecified priors.
arXiv Detail & Related papers (2021-07-03T23:17:26Z)
Root-finding Approaches for Computing Conformal Prediction Set [18.405645120971496]
Conformal prediction constructs a confidence region for an unobserved response of a feature vector based on previous identically distributed and exchangeable observations. We exploit the fact that, emphoften, conformal prediction sets are intervals whose boundaries can be efficiently approximated by classical root-finding software.
arXiv Detail & Related papers (2021-04-14T06:41:12Z)
Batched Neural Bandits [107.5072688105936]
BatchNeuralUCB combines neural networks with optimism to address the exploration-exploitation tradeoff. We prove that BatchNeuralUCB achieves the same regret as the fully sequential version while reducing the number of policy updates considerably.
arXiv Detail & Related papers (2021-02-25T17:36:44Z)
Regret Bound Balancing and Elimination for Model Selection in Bandits and RL [34.15978290083909]
We propose a simple model selection approach for algorithms in bandit and reinforcement learning problems. We prove that the total regret of this approach is bounded by the best valid candidate regret bound times a multiplicative factor. Unlike recent efforts in model selection for linear bandits, our approach is versatile enough to also cover cases where the context information is generated by an adversarial environment.
arXiv Detail & Related papers (2020-12-24T00:53:42Z)
Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme [16.1694012177079]
We study the best-arm identification problem in multi-armed bandits with, potentially private rewards. The goal is to identify the arm with the highest quantile at a fixed, prescribed level. We show that our algorithm is $delta$-PAC and we characterize its sample complexity.
arXiv Detail & Related papers (2020-06-11T20:23:43Z)
The Price of Incentivizing Exploration: A Characterization via Thompson Sampling and Sample Complexity [83.81297078039836]
We consider incentivized exploration: a version of multi-armed bandits where the choice of arms is controlled by self-interested agents. We focus on the price of incentives: the loss in performance, broadly construed, incurred for the sake of incentive-compatibility.
arXiv Detail & Related papers (2020-02-03T04:58:51Z)
Optimal Clustering from Noisy Binary Feedback [75.17453757892152]
We study the problem of clustering a set of items from binary user feedback. We devise an algorithm with a minimal cluster recovery error rate. For adaptive selection, we develop an algorithm inspired by the derivation of the information-theoretical error lower bounds.
arXiv Detail & Related papers (2019-10-14T09:18:26Z)
The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling called the em Simulator. We prove the first instance-based lower bounds the top-k problem which incorporate the appropriate log-factors. Our new analysis inspires a simple and near-optimal for the best-arm and top-k identification, the first em practical of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.