Limited Resource Allocation in a Non-Markovian World: The Case of
Maternal and Child Healthcare
- URL: http://arxiv.org/abs/2305.12640v1
- Date: Mon, 22 May 2023 02:26:29 GMT
- Title: Limited Resource Allocation in a Non-Markovian World: The Case of
Maternal and Child Healthcare
- Authors: Panayiotis Danassis, Shresth Verma, Jackson A. Killian, Aparna Taneja,
Milind Tambe
- Abstract summary: We consider the problem of scheduling interventions in low resource settings to increase adherence and/or engagement.
Past works have successfully developed several classes of Restless Multi-armed Bandit (RMAB) based solutions for this problem.
We demonstrate significant deviations from the Markov assumption on real-world data on a maternal health awareness program from our partner NGO, ARMMAN.
To tackle the generalised non-Markovian RMAB setting we (i) model each participant's trajectory as a time-series, (ii) leverage the power of time-series forecasting models to predict future states, and (iii) propose the Time-series Arm Ranking Index (TARI) policy.
- Score: 27.812174610119452
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The success of many healthcare programs depends on participants' adherence.
We consider the problem of scheduling interventions in low resource settings
(e.g., placing timely support calls from health workers) to increase adherence
and/or engagement. Past works have successfully developed several classes of
Restless Multi-armed Bandit (RMAB) based solutions for this problem.
Nevertheless, all past RMAB approaches assume that the participants' behaviour
follows the Markov property. We demonstrate significant deviations from the
Markov assumption on real-world data on a maternal health awareness program
from our partner NGO, ARMMAN. Moreover, we extend RMABs to continuous state
spaces, a previously understudied area. To tackle the generalised non-Markovian
RMAB setting we (i) model each participant's trajectory as a time-series, (ii)
leverage the power of time-series forecasting models to learn complex patterns
and dynamics to predict future states, and (iii) propose the Time-series Arm
Ranking Index (TARI) policy, a novel algorithm that selects the RMAB arms that
will benefit the most from an intervention, given our future state predictions.
We evaluate our approach on both synthetic data and a secondary analysis of
real data from ARMMAN, and demonstrate a significant increase in engagement
compared to the SOTA deployed Whittle index solution. This translates to 16.3
hours of additional content listened to, 90.8% more engagement drops prevented,
and more than twice as many high dropout-risk beneficiaries reached.
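
As a concrete illustration of the recipe in points (i)-(iii) above (not the authors' implementation), the following minimal Python sketch forecasts each arm's next continuous engagement state from its trajectory, scores arms by the predicted benefit of intervening, and selects the top-k under a budget. The exponential-smoothing forecaster and the `intervention_lift` model are hypothetical stand-ins; the actual TARI index is defined in the paper.

```python
import numpy as np

def forecast_next_state(history):
    """Toy stand-in for a learned time-series forecaster: simple
    exponential smoothing of a continuous engagement trajectory."""
    alpha, level = 0.5, history[0]
    for x in history[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def tari_like_selection(histories, intervention_lift, budget):
    """Illustrative arm selection: score each arm by the predicted
    engagement gain from intervening now, then pick the top-k.
    `intervention_lift` is a hypothetical model of how a support
    call shifts the forecast; the real TARI index is in the paper."""
    scores = []
    for h in histories:
        passive = forecast_next_state(h)
        active = min(1.0, passive + intervention_lift(h))
        scores.append(active - passive)
    return np.argsort(scores)[::-1][:budget]

rng = np.random.default_rng(0)
# Continuous engagement states in [0, 1], one trajectory per arm.
histories = [rng.uniform(0.0, 1.0, size=12) for _ in range(100)]
# Hypothetical lift: arms with declining engagement benefit most.
lift = lambda h: 0.3 * max(0.0, h[:6].mean() - h[6:].mean())
print("arms to call:", tari_like_selection(histories, lift, budget=10))
```

Any forecaster that maps a trajectory to a predicted next state could be dropped in for the smoothing step; the ranking logic is unchanged.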
Related papers
- Efficient Public Health Intervention Planning Using Decomposition-Based Decision-Focused Learning [33.14258196945301]
We show how to exploit the structure of Restless Multi-Armed Bandits (RMABs) to speed up intervention planning.
We use real-world data from an Indian NGO, ARMMAN, to show that our approach is up to two orders of magnitude faster than the state-of-the-art approach.
arXiv Detail & Related papers (2024-03-08T21:31:00Z)
- A Bayesian Approach to Online Learning for Contextual Restless Bandits with Applications to Public Health [36.83063109531146]
We present Bayesian Learning for Contextual RMABs (BCoR), an online RL approach for RMABs with unknown underlying transition dynamics.
BCoR's key strength is the ability to leverage shared information within and between arms to learn the unknown RMAB transition dynamics quickly.
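
The summary above leaves the mechanism abstract; one common way to share information across arms while learning unknown transition dynamics online is Thompson sampling with pooled posteriors. The toy sketch below is an assumption-laden stand-in, not BCoR itself: the pooled-count Beta posterior and the budgeted Thompson step are illustrative choices, not the paper's hierarchical model.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, BUDGET = 20, 50, 3
# True (unknown) probability that a beneficiary stays engaged,
# without (action 0) and with (action 1) an intervention.
true_p = {0: rng.uniform(0.3, 0.6, N), 1: rng.uniform(0.5, 0.9, N)}

# Beta posteriors: a shared (population) level plus a per-arm level.
# Pooling counts is a crude stand-in for BCoR's information sharing.
shared = {a: np.array([1.0, 1.0]) for a in (0, 1)}
per_arm = {a: np.ones((N, 2)) for a in (0, 1)}

for t in range(T):
    # Thompson sampling: draw plausible engagement probabilities.
    sampled = {a: rng.beta(per_arm[a][:, 0] + shared[a][0],
                           per_arm[a][:, 1] + shared[a][1])
               for a in (0, 1)}
    # Intervene on the arms with the largest sampled gain.
    actions = np.zeros(N, dtype=int)
    actions[np.argsort(sampled[1] - sampled[0])[::-1][:BUDGET]] = 1
    # Observe outcomes and update both posterior levels.
    for i, a in enumerate(actions):
        engaged = rng.random() < true_p[a][i]
        per_arm[a][i, 0 if engaged else 1] += 1.0
        shared[a][0 if engaged else 1] += 1.0

print("posterior mean with intervention:",
      np.round(per_arm[1][:, 0] / per_arm[1].sum(axis=1), 2))
```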
arXiv Detail & Related papers (2024-02-07T15:11:37Z)
- Optimal Best Arm Identification with Fixed Confidence in Restless Bandits [66.700654953613]
We study best arm identification in a restless multi-armed bandit setting with finitely many arms.
The discrete-time data generated by each arm forms a homogeneous Markov chain taking values in a common, finite state space.
It is demonstrated that tracking the long-term behavior of a certain Markov decision process and its state-action visitation proportions are the key ingredients in analyzing the converse and achievability bounds.
arXiv Detail & Related papers (2023-10-20T10:04:05Z)
- Fairness for Workers Who Pull the Arms: An Index Based Policy for Allocation of Restless Bandit Tasks [30.323831598899773]
We consider a novel RMAB setting, called multi-worker restless bandits (MWRMAB) with heterogeneous workers.
The goal is to plan an intervention schedule that maximizes the expected reward while satisfying budget constraints on each worker.
Our contributions are two-fold: (1) we provide a multi-worker extension of the Whittle index to tackle heterogeneous costs and per-worker budget and (2) we develop an index-based scheduling policy to achieve fairness.
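
As a rough illustration of what an index-based schedule with heterogeneous costs and per-worker budgets could look like (the paper's actual index and its fairness guarantees are more involved), the sketch below greedily assigns high-index arms to the cheapest worker that can still afford them; all indices and costs here are made-up numbers.

```python
import numpy as np

def schedule(indices, costs, budgets):
    """Greedy index-based assignment: repeatedly give the highest-index
    unassigned arm to the cheapest worker that can still afford it.

    indices: (n_arms,) Whittle-style index per arm (illustrative).
    costs:   (n_workers, n_arms) heterogeneous pulling costs.
    budgets: (n_workers,) per-worker budget.
    """
    budgets = budgets.astype(float).copy()
    assignment = {}
    for arm in np.argsort(indices)[::-1]:
        feasible = np.where(costs[:, arm] <= budgets)[0]
        if feasible.size == 0:
            continue  # no worker can afford this arm this round
        w = feasible[np.argmin(costs[feasible, arm])]
        assignment[int(arm)] = int(w)
        budgets[w] -= costs[w, arm]
    return assignment

rng = np.random.default_rng(2)
idx = rng.uniform(size=8)          # stand-in Whittle-like indices
costs = rng.uniform(1, 3, (2, 8))  # 2 workers, heterogeneous costs
print(schedule(idx, costs, budgets=np.array([4.0, 4.0])))
```

The paper's policy additionally enforces fairness across workers, which this single greedy pass does not attempt.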
arXiv Detail & Related papers (2023-03-01T19:59:42Z)
- Efficient Resource Allocation with Fairness Constraints in Restless Multi-Armed Bandits [8.140037969280716]
Restless Multi-Armed Bandits (RMAB) is an apt model to represent decision-making problems in public health interventions.
In this paper, we are interested in ensuring that RMAB decision making is also fair to different arms while maximizing expected value.
arXiv Detail & Related papers (2022-06-08T13:28:29Z)
- Multi-Armed Bandit Problem with Temporally-Partitioned Rewards: When Partial Feedback Counts [53.579515853222986]
We study a novel bandit setting, namely the Multi-Armed Bandit with Temporally-Partitioned Rewards (TP-MAB).
This setting is a natural extension of delayed-feedback bandits to the case in which rewards may be dilated over a finite-time span after the pull.
We provide two algorithms to address TP-MAB problems, namely, TP-UCB-FR and TP-UCB-EW.
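
The exact confidence terms of TP-UCB-FR and TP-UCB-EW are in the paper; the hedged sketch below only demonstrates the setting's core mechanics, with each pull's reward spread evenly over the next Z rounds and a vanilla UCB index computed from the reward mass collected so far. Z, the even split, and the Bernoulli base reward are all assumed specifics.

```python
import numpy as np

rng = np.random.default_rng(3)
K, T, Z = 3, 2000, 5                 # arms, horizon, reward span
true_mean = np.array([0.3, 0.5, 0.7])

pulls = np.zeros(K)
collected = np.zeros(K)              # reward mass received so far
pending = []                         # (arrival_round, arm, partial_reward)

for t in range(1, T + 1):
    # Deliver the reward pieces that arrive this round.
    for _, k, r in [p for p in pending if p[0] == t]:
        collected[k] += r
    pending = [p for p in pending if p[0] > t]

    # Vanilla UCB over the reward mass collected so far; TP-UCB-FR/EW
    # treat the still-pending mass more carefully than this.
    n = np.maximum(pulls, 1.0)
    ucb = np.where(pulls > 0,
                   collected / n + np.sqrt(2.0 * np.log(t) / n),
                   np.inf)
    k = int(np.argmax(ucb))
    pulls[k] += 1
    # The pull's Bernoulli reward is split evenly over the next Z rounds.
    total = rng.binomial(1, true_mean[k])
    for d in range(1, Z + 1):
        pending.append((t + d, k, total / Z))

print("pulls per arm:", pulls.astype(int))
```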
arXiv Detail & Related papers (2022-06-01T15:56:59Z)
- Best Arm Identification in Restless Markov Multi-Armed Bandits [85.55466536537293]
We study the problem of identifying the best arm in a multi-armed bandit environment.
A decision entity wishes to find the index of the best arm as quickly as possible, subject to an upper bound error probability.
We show that this policy achieves an upper bound that depends on $R$ and is monotonically non-increasing as $R \to \infty$.
arXiv Detail & Related papers (2022-03-29T04:58:04Z)
- Addressing the Long-term Impact of ML Decisions via Policy Regret [49.92903850297013]
We study a setting in which the reward from each arm evolves every time the decision-maker pulls that arm.
We argue that an acceptable sequential allocation of opportunities must take an arm's potential for growth into account.
We present an algorithm with provably sub-linear policy regret for sufficiently long time horizons.
arXiv Detail & Related papers (2021-06-02T17:38:10Z)
- Approximate Bayesian Computation for an Explicit-Duration Hidden Markov Model of COVID-19 Hospital Trajectories [55.786207368853084]
We address the problem of modeling constrained hospital resources in the midst of the COVID-19 pandemic.
For broad applicability, we focus on the common yet challenging scenario where patient-level data for a region of interest are not available.
We propose an aggregate count explicit-duration hidden Markov model, nicknamed the ACED-HMM, with an interpretable, compact parameterization.
arXiv Detail & Related papers (2021-04-28T15:32:42Z)
- Collapsing Bandits and Their Application to Public Health Interventions [45.45852113386041]
Collapsing Bandits is a new restless multi-armed bandit (RMAB) setting in which each arm follows a binary-state Markovian process.
We build on the Whittle index technique for RMABs to derive conditions under which the Collapsing Bandits problem is indexable.
Our algorithm achieves a 3-order-of-magnitude speedup compared to state-of-the-art RMAB techniques.
arXiv Detail & Related papers (2020-07-05T00:33:30Z)
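
Since several entries above build on the Whittle index (including the deployed baseline in the main paper), a self-contained sketch of how that index is typically computed for a binary-state arm may be useful: binary-search the passive subsidy at which acting and staying passive are equally attractive, with values obtained by value iteration. The transition matrices below are made-up numbers, indexability is simply assumed, and real Collapsing Bandits operate over belief states rather than observed states.

```python
import numpy as np

def whittle_index(P_passive, P_active, reward, gamma=0.95,
                  lo=-2.0, hi=2.0, tol=1e-4):
    """Whittle index of each state of a two-state restless arm, via
    binary search on the passive subsidy m: the index is the m at
    which acting and staying passive have equal value in that state."""
    def q_values(m):
        # Value iteration for the subsidy-m single-arm MDP.
        V = np.zeros(2)
        for _ in range(500):
            q_p = reward + m + gamma * P_passive @ V   # stay passive
            q_a = reward + gamma * P_active @ V        # intervene
            V_new = np.maximum(q_p, q_a)
            if np.max(np.abs(V_new - V)) < 1e-8:
                break
            V = V_new
        return q_p, q_a

    idx = np.zeros(2)
    for s in range(2):
        a, b = lo, hi
        while b - a > tol:
            m = (a + b) / 2
            q_p, q_a = q_values(m)
            if q_a[s] > q_p[s]:
                a = m   # subsidy too small: still worth acting
            else:
                b = m
        idx[s] = (a + b) / 2
    return idx

# Engaged (state 1) tends to stay engaged; intervening helps state 0.
P_passive = np.array([[0.9, 0.1], [0.4, 0.6]])
P_active  = np.array([[0.5, 0.5], [0.1, 0.9]])
print(whittle_index(P_passive, P_active, reward=np.array([0.0, 1.0])))
```

A higher index for the disengaged state would indicate that, under these toy dynamics, calls are best spent pulling beneficiaries back rather than reinforcing those already engaged.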