Learn to Intervene: An Adaptive Learning Policy for Restless Bandits in
Application to Preventive Healthcare
- URL: http://arxiv.org/abs/2105.07965v1
- Date: Mon, 17 May 2021 15:44:55 GMT
- Title: Learn to Intervene: An Adaptive Learning Policy for Restless Bandits in
Application to Preventive Healthcare
- Authors: Arpita Biswas, Gaurav Aggarwal, Pradeep Varakantham, Milind Tambe
- Abstract summary: We propose a Whittle index based Q-Learning mechanism for restless multi-armed bandit (RMAB) problems.
Our method improves over existing learning-based methods for RMABs on multiple benchmarks from the literature and also on the maternal healthcare dataset.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In many public health settings, it is important for patients to adhere to
health programs, such as taking medications and periodic health checks.
Unfortunately, beneficiaries may gradually disengage from such programs, which
is detrimental to their health. A concrete example of gradual disengagement has
been observed by an organization that carries out a free automated call-based
program for spreading preventive care information among pregnant women. Many
women stop picking up calls after being enrolled for a few months. To avoid
such disengagements, it is important to provide timely interventions. Such
interventions are often expensive and can be provided to only a small fraction
of the beneficiaries. We model this scenario as a restless multi-armed bandit
(RMAB) problem, where each beneficiary is assumed to transition from one state
to another depending on the intervention. Moreover, since the transition
probabilities are unknown a priori, we propose a Whittle index based Q-Learning
mechanism and show that it converges to the optimal solution. Our method
improves over existing learning-based methods for RMABs on multiple benchmarks
from the literature and also on the maternal healthcare dataset.
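The mechanism the abstract describes can be sketched as a small simulation: each arm keeps per-state Q-values for the passive and intervene actions, the Whittle index of an arm is estimated as Q(s, intervene) - Q(s, passive), and the budget of interventions goes to the top-K arms by that estimate. The two-state transition probabilities, cohort size, discount factor, and exploration schedule below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N, K = 5, 2     # N beneficiaries (arms), budget of K interventions per round
S, A = 2, 2     # states: 0 = disengaged, 1 = engaged; actions: 0 = passive, 1 = intervene
GAMMA = 0.9     # discount factor (illustrative choice)

# Illustrative ground-truth dynamics, unknown to the learner:
# P_ENGAGE[a, s] = Pr(next state is "engaged" | current state s, action a)
P_ENGAGE = np.array([[0.1, 0.7],    # passive
                     [0.4, 0.9]])   # intervene

Q = np.zeros((N, S, A))         # per-arm Q-values
visits = np.ones((N, S, A))     # visit counts for a decaying learning rate
state = rng.integers(0, S, size=N)

for t in range(1, 5001):
    eps = 1.0 / np.sqrt(t)
    # Whittle index estimate: advantage of intervening in each arm's current state
    index = Q[np.arange(N), state, 1] - Q[np.arange(N), state, 0]
    if rng.random() < eps:
        chosen = rng.choice(N, size=K, replace=False)   # explore: random K arms
    else:
        chosen = np.argsort(index)[-K:]                 # exploit: top-K index arms
    action = np.zeros(N, dtype=int)
    action[chosen] = 1

    reward = state.astype(float)                        # reward 1 while engaged
    p = P_ENGAGE[action, state]                         # Pr(next state = engaged)
    next_state = (rng.random(N) < p).astype(int)

    for i in range(N):                                  # standard Q-learning update, per arm
        s, a = state[i], action[i]
        alpha = 1.0 / visits[i, s, a]
        Q[i, s, a] += alpha * (reward[i] + GAMMA * Q[i, next_state[i]].max() - Q[i, s, a])
        visits[i, s, a] += 1
    state = next_state
```

Estimating the index as the gap between the two action values avoids solving for the exact Whittle subsidy at every step, which is what makes a Q-learning formulation attractive when transition probabilities are unknown a priori.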
Related papers
- Improving Health Information Access in the World's Largest Maternal Mobile Health Program via Bandit Algorithms [24.4450506603579]
This paper focuses on Kilkari, the world's largest mHealth program for maternal and child care.
We present a system called CHAHAK that aims to reduce automated dropouts as well as boost engagement with the program.
arXiv Detail & Related papers (2024-05-14T07:21:49Z)
- Deep Reinforcement Learning for Efficient and Fair Allocation of Health Care Resources [49.956569971833105]
Scarcity of health care resources could result in the unavoidable consequence of rationing.
There is no universally accepted standard for health care resource allocation protocols.
We propose a transformer-based deep Q-network to integrate the disease progression of individual patients and the interaction effects among patients.
arXiv Detail & Related papers (2023-09-15T17:28:06Z)
- Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care [68.8204255655161]
We introduce a deep Q-learning approach able to obtain more reliable critical care policies.
We achieve this by first pruning the action set based on all available rewards, and second training a final model based on the sparse main reward but with a restricted action set.
arXiv Detail & Related papers (2023-06-13T18:02:57Z)
- When to Ask for Help: Proactive Interventions in Autonomous Reinforcement Learning [57.53138994155612]
A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world.
A critical challenge is the presence of irreversible states which require external assistance to recover from, such as when a robot arm has pushed an object off of a table.
We propose an algorithm that efficiently learns to detect and avoid states that are irreversible, and proactively asks for help in case the agent does enter them.
arXiv Detail & Related papers (2022-10-19T17:57:24Z)
- Towards Soft Fairness in Restless Multi-Armed Bandits [8.140037969280716]
Restless multi-armed bandits (RMABs) provide a framework for allocating limited resources under uncertainty.
To avoid starvation in the executed interventions across individuals/regions/communities, we first provide a soft fairness constraint.
We then provide an approach to enforce the soft fairness constraint in RMABs.
arXiv Detail & Related papers (2022-07-27T07:56:32Z)
- The Survival Bandit Problem [65.68378556428861]
We introduce and study a new variant of the multi-armed bandit problem (MAB), called the survival bandit problem (S-MAB).
In both problems the objective is to maximize the cumulative reward, but in this new variant the procedure is interrupted if the cumulative reward falls below a preset threshold.
This simple yet unexplored extension of the MAB is motivated by many practical applications.
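The S-MAB interruption rule can be illustrated with a toy simulation. The arm means, Gaussian noise model, and threshold below are made-up values, and the uniform arm choice is a placeholder policy, not the paper's method; the point is only the survival constraint that ends play on ruin.

```python
import random

random.seed(1)

means = [0.6, -0.2, 0.3]   # hypothetical per-arm expected rewards
THRESHOLD = -5.0           # the procedure halts if cumulative reward drops below this

cum_reward, rounds = 0.0, 0
for t in range(1000):
    arm = random.randrange(len(means))            # placeholder policy: pick uniformly
    cum_reward += random.gauss(means[arm], 1.0)   # noisy reward draw
    rounds = t + 1
    if cum_reward < THRESHOLD:                    # survival constraint: ruin stops play
        break
```

Unlike the standard MAB, a policy here must trade off reward maximization against the risk of early termination, which is why the variant is not a trivial extension.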
arXiv Detail & Related papers (2022-06-07T05:23:14Z)
- Field Study in Deploying Restless Multi-Armed Bandits: Assisting Non-Profits in Improving Maternal and Child Health [28.43878945119807]
Cell phones have enabled non-profits to deliver critical health information to their beneficiaries in a timely manner.
A key challenge in such information delivery programs is that a significant fraction of beneficiaries drop out of the program.
We developed a Restless Multi-Armed Bandits system to help non-profits place crucial service calls for live interaction with beneficiaries to prevent such engagement drops.
arXiv Detail & Related papers (2021-09-16T16:04:48Z)
- Selective Intervention Planning using RMABs: Increasing Program Engagement to Improve Maternal and Child Health Outcomes [34.38042786168279]
We work with ARMMAN, a non-profit based in India, to further the use of call-based information programs.
We analyzed anonymized call-records of over 300,000 women registered in an awareness program.
We built machine learning based models to predict the long term engagement pattern from call logs and beneficiaries' demographic information.
arXiv Detail & Related papers (2021-03-07T08:47:24Z)
- Collapsing Bandits and Their Application to Public Health Interventions [45.45852113386041]
Collapsing Bandits is a new restless multi-armed bandit (RMAB) setting in which each arm follows a binary-state Markovian process.
We build on the Whittle index technique for RMABs to derive conditions under which the Collapsing Bandits problem is indexable.
Our algorithm achieves a 3-order-of-magnitude speedup compared to state-of-the-art RMAB techniques.
arXiv Detail & Related papers (2020-07-05T00:33:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.