Towards Soft Fairness in Restless Multi-Armed Bandits
- URL: http://arxiv.org/abs/2207.13343v1
- Date: Wed, 27 Jul 2022 07:56:32 GMT
- Title: Towards Soft Fairness in Restless Multi-Armed Bandits
- Authors: Dexun Li, Pradeep Varakantham
- Abstract summary: Restless multi-armed bandits (RMAB) is a framework for allocating limited resources under uncertainty.
To avoid starvation in the executed interventions across individuals/regions/communities, we first provide a soft fairness constraint.
We then provide an approach to enforce the soft fairness constraint in RMABs.
- Score: 8.140037969280716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Restless multi-armed bandits (RMAB) is a framework for allocating limited
resources under uncertainty. It is an extremely useful model for monitoring
beneficiaries and executing timely interventions to ensure maximum benefit in
public health settings (e.g., ensuring patients take medicines in tuberculosis
settings, ensuring pregnant mothers listen to automated calls about good
pregnancy practices). Due to the limited resources, typically certain
communities or regions are starved of interventions that can have follow-on
effects. To avoid starvation in the executed interventions across
individuals/regions/communities, we first provide a soft fairness constraint
and then provide an approach to enforce the soft fairness constraint in RMABs.
The soft fairness constraint requires that an algorithm never probabilistically
favor one arm over another if the long-term cumulative reward of choosing the
latter arm is higher. Our approach incorporates softmax based value iteration
method in the RMAB setting to design selection algorithms that manage to
satisfy the proposed fairness constraint. Our method, referred to as SoftFair,
also provides theoretical performance guarantees and is asymptotically optimal.
Finally, we demonstrate the utility of our approaches on simulated benchmarks
and show that the soft fairness constraint can be handled without a significant
sacrifice on value.
Related papers
- Optimal Multi-Fidelity Best-Arm Identification [65.23078799972188]
In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible.
We study multi-fidelity best-arm identification, in which the can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost.
Several methods have been proposed for tackling this problem, but their optimality remain elusive, notably due to loose lower bounds on the total cost needed to identify the best arm.
arXiv Detail & Related papers (2024-06-05T08:02:40Z) - Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
In particular, we present sred, a truly adaptive algorithm that can reject arms in it any round based on the observed empirical gaps between the rewards of various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z) - f-FERM: A Scalable Framework for Robust Fair Empirical Risk Minimization [9.591164070876689]
This paper presents a unified optimization framework for fair empirical risk based on f-divergence measures (f-FERM)
In addition, our experiments demonstrate the superiority of fairness-accuracy tradeoffs offered by f-FERM for almost all batch sizes.
Our extension is based on a distributionally robust optimization reformulation of f-FERM objective under $L_p$ norms as uncertainty sets.
arXiv Detail & Related papers (2023-12-06T03:14:16Z) - Optimal and Fair Encouragement Policy Evaluation and Learning [11.712023983596914]
We study causal identification, statistical variance-reduced estimation, and robust estimation of optimal treatment rules.
We develop a two-stage algorithm for solving over parametrized policy classes under general constraints to obtain variance-sensitive regret bounds.
arXiv Detail & Related papers (2023-09-12T20:45:30Z) - Implicitly normalized forecaster with clipping for linear and non-linear
heavy-tailed multi-armed bandits [85.27420062094086]
Implicitly Normalized Forecaster (INF) is considered an optimal solution for adversarial multi-armed bandit (MAB) problems.
We propose a new version of INF called the Implicitly Normalized Forecaster with clipping (INFclip) for MAB problems with heavy-tailed settings.
We demonstrate that INFclip is optimal for linear heavy-tailed MAB problems and works well for non-linear ones.
arXiv Detail & Related papers (2023-05-11T12:00:43Z) - Efficient Resource Allocation with Fairness Constraints in Restless
Multi-Armed Bandits [8.140037969280716]
Restless Multi-Armed Bandits (RMAB) is an apt model to represent decision-making problems in public health interventions.
In this paper, we are interested in ensuring that RMAB decision making is also fair to different arms while maximizing expected value.
arXiv Detail & Related papers (2022-06-08T13:28:29Z) - Finite-Time Regret of Thompson Sampling Algorithms for Exponential
Family Multi-Armed Bandits [88.21288104408556]
We study the regret of Thompson sampling (TS) algorithms for exponential family bandits, where the reward distribution is from a one-dimensional exponential family.
We propose a Thompson sampling, termed Expulli, which uses a novel sampling distribution to avoid the under-estimation of the optimal arm.
arXiv Detail & Related papers (2022-06-07T18:08:21Z) - Planning to Fairly Allocate: Probabilistic Fairness in the Restless
Bandit Setting [30.120134596715154]
We introduce ProbFair, a probabilistically fair policy that maximizes total expected reward and satisfies a budget constraint.
We evaluate our algorithm on a real-world application, where interventions support continuous positive airway pressure (CPAP) therapy adherence among patients.
arXiv Detail & Related papers (2021-06-14T18:01:08Z) - Collapsing Bandits and Their Application to Public Health Interventions [45.45852113386041]
Collpasing Bandits is a new restless multi-armed bandit (RMAB) setting in which each arm follows a binary-state Markovian process.
We build on the Whittle index technique for RMABs to derive conditions under which the Collapsing Bandits problem is indexable.
Our algorithm achieves a 3-order-of-magnitude speedup compared to state-of-the-art RMAB techniques.
arXiv Detail & Related papers (2020-07-05T00:33:30Z) - Robustness Guarantees for Mode Estimation with an Application to Bandits [131.21717367564963]
We introduce a theory for multi-armed bandits where the values are the modes of the reward distributions instead of the mean.
We show in simulations that our algorithms are robust to perturbation of the arms by adversarial noise sequences.
arXiv Detail & Related papers (2020-03-05T21:29:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.