Ranked Prioritization of Groups in Combinatorial Bandit Allocation
- URL: http://arxiv.org/abs/2205.05659v1
- Date: Wed, 11 May 2022 17:40:29 GMT
- Title: Ranked Prioritization of Groups in Combinatorial Bandit Allocation
- Authors: Lily Xu, Arpita Biswas, Fei Fang, Milind Tambe
- Abstract summary: We propose a novel bandit objective that trades off between reward maximization and ranked prioritization over species.
We show this objective can be expressed as a weighted linear sum of Lipschitz-continuous reward functions.
- Score: 62.24280332575472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Preventing poaching through ranger patrols protects endangered wildlife,
directly contributing to the UN Sustainable Development Goal 15 of life on
land. Combinatorial bandits have been used to allocate limited patrol
resources, but existing approaches overlook the fact that each location is home
to multiple species in varying proportions, so a patrol benefits each species
to differing degrees. When some species are more vulnerable, we ought to offer
more protection to these animals; unfortunately, existing combinatorial bandit
approaches do not offer a way to prioritize important species. To bridge this
gap, (1) we propose a novel combinatorial bandit objective that trades off
between reward maximization and prioritization over species,
which we call ranked prioritization. We show this objective can be expressed as
a weighted linear sum of Lipschitz-continuous reward functions. (2) We provide
RankedCUCB, an algorithm to select combinatorial actions that optimize our
prioritization-based objective, and prove that it achieves asymptotic
no-regret. (3) We demonstrate empirically that RankedCUCB leads to up to 38%
improvement in outcomes for endangered species using real-world wildlife
conservation data. Along with adapting to other challenges such as preventing
illegal logging and overfishing, our no-regret algorithm addresses the general
combinatorial bandit problem with a weighted linear objective.
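The abstract describes selecting combinatorial actions that optimize a weighted linear sum of per-arm reward estimates, where weights encode a ranked prioritization over species. A minimal CUCB-style sketch of that idea follows; the function names, the confidence-bonus form, and the group-share/weight matrices are illustrative assumptions, not the paper's actual RankedCUCB implementation.

```python
import numpy as np

def ucb_scores(means, counts, t, c=1.0):
    """Upper confidence bounds on per-arm reward estimates."""
    bonus = c * np.sqrt(np.log(max(t, 2)) / np.maximum(counts, 1))
    return means + bonus

def select_action(means, counts, group_share, group_weights, budget, t):
    """Pick `budget` arms maximizing a weighted linear sum of UCBs.

    group_share[i, g]: fraction of arm i's reward accruing to group g.
    group_weights[g]: ranked-prioritization weight of group g.
    """
    ucb = ucb_scores(means, counts, t)
    # Scale each arm's UCB by how much it benefits prioritized groups.
    weighted = ucb * (group_share @ group_weights)
    # Greedy top-k is optimal here because the objective is additive
    # across the arms chosen in the combinatorial action.
    return np.argsort(weighted)[-budget:]

rng = np.random.default_rng(0)
means = rng.uniform(size=5)
counts = np.ones(5)
share = rng.dirichlet(np.ones(3), size=5)   # 5 arms, 3 species groups
weights = np.array([0.6, 0.3, 0.1])         # descending priority weights
action = select_action(means, counts, share, weights, budget=2, t=10)
print(sorted(action.tolist()))
```

Under a budget constraint and an additive objective, the greedy top-k selection shown here maximizes the weighted sum exactly; the paper's regret analysis additionally relies on the Lipschitz continuity of the underlying reward functions.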
Related papers
- Causal Bandits: The Pareto Optimal Frontier of Adaptivity, a Reduction to Linear Bandits, and Limitations around Unknown Marginals [28.94461817548213]
We prove upper and matching lower bounds on the possible trade-offs in the performance of learning in conditionally benign and arbitrary environments.
We are the first to obtain instance-dependent bounds for causal bandits, by reducing the problem to the linear bandit setting.
arXiv Detail & Related papers (2024-07-01T04:12:15Z)
- Robust Lipschitz Bandits to Adversarial Corruptions [61.85150061213987]
Lipschitz bandit is a variant of bandits that deals with a continuous arm set defined on a metric space.
In this paper, we introduce a new problem of Lipschitz bandits in the presence of adversarial corruptions.
Our work presents the first line of robust Lipschitz bandit algorithms that can achieve sub-linear regret under both types of adversary.
arXiv Detail & Related papers (2023-05-29T18:16:59Z)
- Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences [28.79598714109439]
We study the problem of $K$-armed dueling bandits for both stochastic and adversarial environments.
We first propose a novel reduction from any (general) dueling bandits to multi-armed bandits.
Our algorithm is also the first to achieve an optimal $O\big(\sum_{i=1}^{K} \frac{\log T}{\Delta_i}\big)$ regret bound against the Condorcet-winner benchmark.
arXiv Detail & Related papers (2022-02-14T13:37:23Z)
- Bias-Robust Bayesian Optimization via Dueling Bandit [57.82422045437126]
We consider Bayesian optimization in settings where observations can be adversarially biased.
We propose a novel approach for dueling bandits based on information-directed sampling (IDS).
Thereby, we obtain the first efficient kernelized algorithm for dueling bandits that comes with cumulative regret guarantees.
arXiv Detail & Related papers (2021-05-25T10:08:41Z)
- Expanding boundaries of Gap Safe screening [0.0]
A powerful strategy to boost the performance of algorithms is known as safe screening.
We extend the existing Gap Safe screening framework by relaxing the global strong-concavity assumption on the dual cost function.
The proposed general framework is exemplified by some notable particular cases: the logistic function, and the β = 1.5 and Kullback-Leibler divergences.
arXiv Detail & Related papers (2021-02-22T09:23:31Z)
- Dual-Mandate Patrols: Multi-Armed Bandits for Green Security [67.29846393678808]
Conservation efforts in green security domains to protect wildlife and forests are constrained by the limited availability of defenders.
We formulate the problem as a multi-armed bandit, where each action represents a patrol strategy.
We show that our algorithm, LIZARD, improves performance on real-world poaching data from Cambodia.
arXiv Detail & Related papers (2020-09-14T16:40:44Z)
- Exploration in two-stage recommender systems [79.50534282841618]
Two-stage recommender systems are widely adopted in industry due to their scalability and maintainability.
A key challenge of this setup is that optimal performance of each stage in isolation does not imply optimal global performance.
We propose a method of synchronising the exploration strategies between the ranker and the nominators.
arXiv Detail & Related papers (2020-09-01T16:52:51Z)
- Adaptive Discretization for Adversarial Lipschitz Bandits [85.39106976861702]
Lipschitz bandits is a prominent version of multi-armed bandits that studies large, structured action spaces.
A central theme here is the adaptive discretization of the action space, which gradually "zooms in" on the more promising regions.
We provide the first algorithm for adaptive discretization in the adversarial version, and derive instance-dependent regret bounds.
arXiv Detail & Related papers (2020-06-22T16:06:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.