Fairness for Workers Who Pull the Arms: An Index Based Policy for
Allocation of Restless Bandit Tasks
- URL: http://arxiv.org/abs/2303.00799v1
- Date: Wed, 1 Mar 2023 19:59:42 GMT
- Title: Fairness for Workers Who Pull the Arms: An Index Based Policy for
Allocation of Restless Bandit Tasks
- Authors: Arpita Biswas, Jackson A. Killian, Paula Rodriguez Diaz, Susobhan
Ghosh, Milind Tambe
- Abstract summary: We consider a novel RMAB setting, called multi-worker restless bandits (MWRMAB) with heterogeneous workers.
The goal is to plan an intervention schedule that maximizes the expected reward while satisfying budget constraints on each worker.
Our contributions are two-fold: (1) we provide a multi-worker extension of the Whittle index to tackle heterogeneous costs and per-worker budget and (2) we develop an index-based scheduling policy to achieve fairness.
- Score: 30.323831598899773
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motivated by applications such as machine repair, project monitoring, and
anti-poaching patrol scheduling, we study intervention planning of stochastic
processes under resource constraints. This planning problem has previously been
modeled as restless multi-armed bandits (RMAB), where each arm is an
intervention-dependent Markov Decision Process. However, the existing
literature assumes all intervention resources belong to a single uniform pool,
limiting their applicability to real-world settings where interventions are
carried out by a set of workers, each with their own costs, budgets, and
intervention effects. In this work, we consider a novel RMAB setting, called
multi-worker restless bandits (MWRMAB) with heterogeneous workers. The goal is
to plan an intervention schedule that maximizes the expected reward while
satisfying budget constraints on each worker as well as fairness in terms of
the load assigned to each worker. Our contributions are two-fold: (1) we
provide a multi-worker extension of the Whittle index to tackle heterogeneous
costs and per-worker budget and (2) we develop an index-based scheduling policy
to achieve fairness. Further, we evaluate our method on various cost structures
and show that our method significantly outperforms other baselines in terms of
fairness without sacrificing much in reward accumulated.
Related papers
- Survival Multiarmed Bandits with Bootstrapping Methods [0.0]
The Survival Multiarmed Bandits (S-MAB) problem is an extension which constrains an agent to a budget related to observed rewards.
This paper presents a framework that addresses such a dual goal using an objective function balanced by a ruin aversion component.
arXiv Detail & Related papers (2024-10-21T20:21:10Z) - Stochastic Bandits for Egalitarian Assignment [58.33714486693828]
We study EgalMAB, an egalitarian assignment problem in the context of multi-armed bandits.
We design and analyze a UCB-based policy EgalUCB and establish upper bounds on the cumulative regret.
arXiv Detail & Related papers (2024-10-08T09:49:47Z) - Active Fine-Tuning of Generalist Policies [54.65568433408307]
We propose AMF (Active Multi-task Fine-tuning) to maximize multi-task policy performance under a limited demonstration budget.
We derive performance guarantees for AMF under regularity assumptions and demonstrate its empirical effectiveness in complex and high-dimensional environments.
arXiv Detail & Related papers (2024-10-07T13:26:36Z) - Principal-Agent Reward Shaping in MDPs [50.914110302917756]
Principal-agent problems arise when one party acts on behalf of another, leading to conflicts of interest.
We study a two-player Stack game where the principal and the agent have different reward functions, and the agent chooses an MDP policy for both players.
Our results establish trees and deterministic decision processes with a finite horizon.
arXiv Detail & Related papers (2023-12-30T18:30:44Z) - Auction-Based Scheduling [2.3326951882644553]
Auction-based scheduling is a modular framework for multi-objective decision-making problems.
Each objective is fulfilled using a separate policy, and the policies can be independently created, modified, and replaced.
We present decentralized algorithms to synthesize a pair of policies, their initially allocated budgets, and bidding strategies.
arXiv Detail & Related papers (2023-10-18T08:38:42Z) - Limited Resource Allocation in a Non-Markovian World: The Case of
Maternal and Child Healthcare [27.812174610119452]
We consider the problem of scheduling interventions in low resource settings to increase adherence and/or engagement.
Past works have successfully developed several classes of Restless Multi-armed Bandit (RMAB) based solutions for this problem.
We demonstrate significant deviations from the Markov assumption on real-world data on a maternal health awareness program from our partner NGO, ARMMAN.
To tackle the generalised non-Markovian RMAB setting we (i) model each participant's trajectory as a time-series, (ii) leverage the power of time-series forecasting models to predict future states, and (iii) propose the Time
arXiv Detail & Related papers (2023-05-22T02:26:29Z) - Planning Multiple Epidemic Interventions with Reinforcement Learning [7.51289645756884]
An optimal plan will curb an epidemic with minimal loss of life, disease burden, and economic cost.
Finding an optimal plan is an intractable computational problem in realistic settings.
We apply state-of-the-art actor-critic reinforcement learning algorithms to search for plans that minimize overall costs.
arXiv Detail & Related papers (2023-01-30T11:51:24Z) - Multi-Agent Neural Rewriter for Vehicle Routing with Limited Disclosure
of Costs [65.23158435596518]
Solving the multi-vehicle routing problem as a team Markov game with partially observable costs.
Our multi-agent reinforcement learning approach, the so-called multi-agent Neural Rewriter, builds on the single-agent Neural Rewriter to solve the problem by iteratively rewriting solutions.
arXiv Detail & Related papers (2022-06-13T09:17:40Z) - Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal
Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination.
We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z) - VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit
Feedback [104.06766271716774]
We study a multi-round welfare-maximising mechanism design problem in instances where agents do not know their values.
We first define three notions of regret for the welfare, the individual utilities of each agent and that of the mechanism.
Our framework also provides flexibility to control the pricing scheme so as to trade-off between the agent and seller regrets.
arXiv Detail & Related papers (2020-04-19T18:00:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.