Efficient Public Health Intervention Planning Using Decomposition-Based Decision-Focused Learning
- URL: http://arxiv.org/abs/2403.05683v1
- Date: Fri, 8 Mar 2024 21:31:00 GMT
- Title: Efficient Public Health Intervention Planning Using Decomposition-Based Decision-Focused Learning
- Authors: Sanket Shah, Arun Suggala, Milind Tambe, Aparna Taneja
- Abstract summary: We show how to exploit the structure of Restless Multi-Armed Bandits (RMABs) to speed up intervention planning.
We use real-world data from an Indian NGO, ARMMAN, to show that our approach is up to two orders of magnitude faster than the state-of-the-art approach.
- Score: 33.14258196945301
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The declining participation of beneficiaries over time is a key concern in
public health programs. A popular strategy for improving retention is to have
health workers 'intervene' on beneficiaries at risk of dropping out. However,
the availability and time of these health workers are limited resources. As a
result, there has been a line of research on optimizing these limited
intervention resources using Restless Multi-Armed Bandits (RMABs). The key
technical barrier to using this framework in practice lies in the need to
estimate the beneficiaries' RMAB parameters from historical data. Recent
research has shown that Decision-Focused Learning (DFL), which focuses on
maximizing the beneficiaries' adherence rather than predictive accuracy,
improves the performance of intervention targeting using RMABs. Unfortunately,
these gains come at a high computational cost because of the need to solve and
evaluate the RMAB in each DFL training step. In this paper, we provide a
principled way to exploit the structure of RMABs to speed up intervention
planning by cleverly decoupling the planning for different beneficiaries. We
use real-world data from an Indian NGO, ARMMAN, to show that our approach is up
to two orders of magnitude faster than the state-of-the-art approach while also
yielding superior model performance. This would enable the NGO to scale up
deployments using DFL to potentially millions of mothers, ultimately advancing
progress toward UNSDG 3.1.
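To make the decoupling idea concrete, the sketch below shows a standard way to decompose RMAB planning from the literature: score every arm (beneficiary) independently with a Whittle-style index and intervene on the top-k. This is a minimal illustration of per-beneficiary decoupling, not the authors' algorithm; the two-state model, function names, discount factor, and search bounds are all assumptions.

```python
import numpy as np

def solve_single_arm(P, subsidy, gamma=0.9, n_iter=300):
    """Value iteration for one two-state arm with a passive-action subsidy.
    P[a, s, s']: transition probabilities, a=0 passive / a=1 active.
    Reward is the state itself (1 = engaging), plus `subsidy` when passive."""
    Q = np.zeros((2, 2))  # Q[state, action]
    for _ in range(n_iter):
        V = Q.max(axis=1)
        for s in range(2):
            for a in range(2):
                r = s + (subsidy if a == 0 else 0.0)
                Q[s, a] = r + gamma * P[a, s] @ V
    return Q

def whittle_index(P, s, lo=-2.0, hi=2.0, tol=1e-4):
    """Binary-search the subsidy at which passive and active actions are
    equally attractive in state s; assumes the arm is indexable."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        Q = solve_single_arm(P, mid)
        if Q[s, 1] > Q[s, 0]:  # active still preferred -> raise subsidy
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def plan_interventions(arm_params, states, k):
    """Decoupled planning: each beneficiary is scored independently,
    then the k highest-index arms receive the intervention."""
    scores = [whittle_index(P, s) for P, s in zip(arm_params, states)]
    return np.argsort(scores)[-k:]

# Toy usage: one hypothetical transition model shared by 3 beneficiaries.
P_example = np.array([[[0.7, 0.3], [0.4, 0.6]],   # passive action
                      [[0.3, 0.7], [0.1, 0.9]]])  # active action
print(plan_interventions([P_example] * 3, states=[0, 1, 0], k=2))
```

Because each index depends only on that beneficiary's own parameters, the per-arm computations are embarrassingly parallel, which is the kind of structure a decomposition-based DFL training loop can exploit instead of solving and evaluating the joint RMAB at every gradient step.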
Related papers
- Self-Regulation and Requesting Interventions [63.5863047447313]
We propose an offline framework that trains a "helper" policy to request interventions.
We score optimal intervention timing with process reward models (PRMs) and train the helper model on these labeled trajectories.
This offline approach significantly reduces costly intervention calls during training.
arXiv Detail & Related papers (2025-02-07T00:06:17Z)
- IRL for Restless Multi-Armed Bandits with Applications in Maternal and Child Health [52.79219652923714]
This paper is the first to present the use of inverse reinforcement learning (IRL) to learn desired rewards for RMABs.
We demonstrate improved outcomes in a maternal and child health telehealth program.
arXiv Detail & Related papers (2024-12-11T15:28:04Z)
- Bayesian Collaborative Bandits with Thompson Sampling for Improved Outreach in Maternal Health Program [36.10003434625494]
Mobile health (mHealth) programs face a critical challenge in optimizing the timing of automated health information calls to beneficiaries.
We propose a principled approach using Thompson Sampling for this collaborative bandit problem.
We demonstrate significant improvements over state-of-the-art baselines on a real-world dataset from the world's largest maternal mHealth program.
arXiv Detail & Related papers (2024-10-28T18:08:18Z)
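As background for the entry above, here is a minimal Beta-Bernoulli Thompson Sampling loop framed around choosing a call time slot per beneficiary. The collaborative model in the paper is more involved; the slot count, pickup rates, and helper names below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_slots = 8                # hypothetical daily time slots for a call
alpha = np.ones(n_slots)   # Beta posterior: 1 + observed pickups per slot
beta = np.ones(n_slots)    # Beta posterior: 1 + observed misses per slot

def choose_slot():
    """Thompson Sampling: draw a pickup rate per slot from the posterior
    and play the argmax, trading off exploration and exploitation."""
    return int(np.argmax(rng.beta(alpha, beta)))

def update(slot, picked_up):
    """Conjugate Beta-Bernoulli posterior update after each call."""
    if picked_up:
        alpha[slot] += 1
    else:
        beta[slot] += 1

# Simulated interaction loop with hypothetical true pickup rates.
true_rates = rng.uniform(0.05, 0.4, size=n_slots)
for _ in range(1000):
    slot = choose_slot()
    update(slot, rng.random() < true_rates[slot])
print("posterior mean pickup rate per slot:", alpha / (alpha + beta))
```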
- Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models [94.39278422567955]
Fine-tuning large language models (LLMs) on human preferences has proven successful in enhancing their capabilities.
However, ensuring the safety of LLMs during the fine-tuning remains a critical concern.
We propose a supervised learning framework called Bi-Factorial Preference Optimization (BFPO) to address this issue.
arXiv Detail & Related papers (2024-08-27T17:31:21Z)
- Reduced-Rank Multi-objective Policy Learning and Optimization [57.978477569678844]
In practice, causal researchers do not have a single outcome in mind a priori.
In government-assisted social benefit programs, policymakers collect many outcomes to understand the multidimensional nature of poverty.
We present a data-driven dimensionality-reduction methodology for multiple outcomes in the context of optimal policy learning.
arXiv Detail & Related papers (2024-04-29T08:16:30Z)
- Limited Resource Allocation in a Non-Markovian World: The Case of Maternal and Child Healthcare [27.812174610119452]
We consider the problem of scheduling interventions in low-resource settings to increase adherence and/or engagement.
Past works have successfully developed several classes of Restless Multi-armed Bandit (RMAB) based solutions for this problem.
We demonstrate significant deviations from the Markov assumption on real-world data on a maternal health awareness program from our partner NGO, ARMMAN.
To tackle the generalised non-Markovian RMAB setting we (i) model each participant's trajectory as a time-series, (ii) leverage the power of time-series forecasting models to predict future states, and (iii) propose the Time
arXiv Detail & Related papers (2023-05-22T02:26:29Z)
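Step (ii) of the entry above, predicting future states from a participant's whole trajectory rather than only the last state, can be sketched with any classifier over lagged histories. The snippet below is a toy illustration on assumed synthetic data, not the paper's forecasting model: it windows binary engagement trajectories and fits a logistic regression to predict next-week engagement.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_lagged(trajectories, window=4):
    """Turn binary engagement trajectories into (history-window, next-state)
    pairs, so the predictor conditions on more than the last state."""
    X, y = [], []
    for traj in trajectories:
        for t in range(window, len(traj)):
            X.append(traj[t - window:t])
            y.append(traj[t])
    return np.array(X), np.array(y)

# Hypothetical synthetic trajectories where next week's engagement
# depends on the whole 4-week history, not just the last state.
trajs = []
for _ in range(200):
    traj = list(rng.integers(0, 2, size=4))
    for _ in range(26):
        p = 0.2 + 0.6 * np.mean(traj[-4:])
        traj.append(int(rng.random() < p))
    trajs.append(traj)

X, y = make_lagged(trajs, window=4)
model = LogisticRegression().fit(X, y)

history = np.array([[1, 1, 0, 0]])  # last four weeks of engagement
print("P(engage next week) =", model.predict_proba(history)[0, 1])
```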
- Efficient Resource Allocation with Fairness Constraints in Restless Multi-Armed Bandits [8.140037969280716]
Restless Multi-Armed Bandits (RMABs) are an apt model for decision-making problems in public health interventions.
In this paper, we are interested in ensuring that RMAB decision making is also fair to different arms while maximizing expected value.
arXiv Detail & Related papers (2022-06-08T13:28:29Z)
- Field Study in Deploying Restless Multi-Armed Bandits: Assisting Non-Profits in Improving Maternal and Child Health [28.43878945119807]
Cell phones have enabled non-profits to deliver critical health information to their beneficiaries in a timely manner.
A key challenge in such information delivery programs is that a significant fraction of beneficiaries drop out of the program.
We developed a Restless Multi-Armed Bandits system to help non-profits place crucial service calls for live interaction with beneficiaries to prevent such engagement drops.
arXiv Detail & Related papers (2021-09-16T16:04:48Z)
- Contingency-Aware Influence Maximization: A Reinforcement Learning Approach [52.109536198330126]
The influence maximization (IM) problem aims to find a subset of seed nodes in a social network that maximizes the spread of influence.
In this study, we focus on a sub-class of IM problems in which it is uncertain whether invited nodes are willing to serve as seeds, called contingency-aware IM.
Despite initial successes, a major practical obstacle to promoting these solutions to more communities is the tremendous runtime of the greedy algorithms.
arXiv Detail & Related papers (2021-06-13T16:42:22Z)
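The runtime obstacle this last entry mentions comes from the structure of the classic greedy algorithm: selecting k seeds requires on the order of k * |V| marginal-gain evaluations, and each evaluation is itself a Monte Carlo spread simulation. The sketch below, under the common independent cascade model (an assumption here, not necessarily the paper's setting), makes that nesting explicit; the graph representation and parameters are invented for illustration.

```python
import random

def simulate_spread(graph, seeds, p=0.1, n_sims=100):
    """Monte Carlo estimate of expected spread under the independent
    cascade model: each newly active node activates each neighbor
    once with probability p."""
    total = 0
    for _ in range(n_sims):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            node = frontier.pop()
            for nbr in graph.get(node, []):
                if nbr not in active and random.random() < p:
                    active.add(nbr)
                    frontier.append(nbr)
        total += len(active)
    return total / n_sims

def greedy_im(graph, k):
    """Greedy seed selection: k * |V| spread estimates, each a full
    Monte Carlo simulation -- the runtime bottleneck noted above."""
    seeds = []
    for _ in range(k):
        best = max((v for v in graph if v not in seeds),
                   key=lambda v: simulate_spread(graph, seeds + [v]))
        seeds.append(best)
    return seeds

# Toy usage: adjacency lists for a hypothetical 6-node network.
toy = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}
print(greedy_im(toy, k=2))
```

Replacing this nested simulation loop with a learned policy is part of what motivates reinforcement-learning approaches like the one in the entry above.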
This list is automatically generated from the titles and abstracts of the papers on this site.