Deep Reinforcement Learning for Efficient and Fair Allocation of Health Care Resources
- URL: http://arxiv.org/abs/2309.08560v2
- Date: Thu, 22 Aug 2024 05:05:13 GMT
- Authors: Yikuan Li, Chengsheng Mao, Kaixuan Huang, Hanyin Wang, Zheng Yu, Mengdi Wang, Yuan Luo
- Abstract summary: Scarcity of health care resources could result in the unavoidable consequence of rationing.
There is no universally accepted standard for health care resource allocation protocols.
We propose a transformer-based deep Q-network to integrate the disease progression of individual patients and the interaction effects among patients.
- Score: 47.57108369791273
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scarcity of health care resources can make rationing unavoidable. For example, ventilators are often in limited supply, especially during public health emergencies such as the COVID-19 pandemic or in resource-constrained health care settings. Currently, there is no universally accepted standard for health care resource allocation protocols, so different governments prioritize patients using various criteria and heuristic-based protocols. In this study, we investigate the use of reinforcement learning to optimize critical care resource allocation policies so that resources are rationed fairly and effectively. We propose a transformer-based deep Q-network that integrates the disease progression of individual patients and the interaction effects among patients during critical care resource allocation. We aim to improve both the fairness of allocation and overall patient outcomes. Our experiments demonstrate that our method significantly reduces excess deaths and achieves a more equitable distribution under different levels of ventilator shortage, compared with the severity-based and comorbidity-based methods currently used by different governments. Our source code is included in the supplement and will be released on GitHub upon publication.
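To illustrate the allocation step the abstract describes, here is a minimal sketch of capacity-constrained action selection from per-patient Q-value estimates. It is not the authors' implementation: the paper's transformer-based deep Q-network is not reproduced, the Q-values are assumed to be supplied by some trained model, and the function name `allocate_ventilators`, its parameters, and the greedy top-k rule are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def allocate_ventilators(
    q_withhold: Sequence[float],
    q_allocate: Sequence[float],
    capacity: int,
) -> List[int]:
    """Pick which patients receive a ventilator (hypothetical helper).

    q_withhold[i] and q_allocate[i] are assumed Q-value estimates for
    patient i under the two actions. Patients are ranked by the estimated
    benefit of allocation, and at most `capacity` of them are selected.
    """
    if len(q_withhold) != len(q_allocate):
        raise ValueError("Q-value sequences must have the same length")
    if capacity < 0:
        raise ValueError("capacity must be non-negative")

    # Estimated benefit of allocating versus withholding, per patient.
    benefits = [
        (q_a - q_w, i)
        for i, (q_w, q_a) in enumerate(zip(q_withhold, q_allocate))
    ]
    # Highest benefit first; ties are broken by the lower patient index.
    ranked = sorted(benefits, key=lambda pair: (-pair[0], pair[1]))
    # Stay within capacity and skip patients with no estimated benefit.
    chosen = [i for benefit, i in ranked[:capacity] if benefit > 0]
    return sorted(chosen)


if __name__ == "__main__":
    # Made-up Q-values for four patients and two available ventilators.
    q_w = [0.2, 0.5, 0.1, 0.4]
    q_a = [0.9, 0.6, 0.05, 0.8]
    print(allocate_ventilators(q_w, q_a, capacity=2))  # [0, 3]
```

The greedy rule treats patients independently once the Q-values are computed. Any interaction effects among patients, which the abstract says the proposed model captures, would have to be reflected in the Q-value estimates themselves.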
Related papers
- Enhancing Performance for Highly Imbalanced Medical Data via Data Regularization in a Federated Learning Setting [6.22153888560487]
The goal of the proposed method is to enhance model performance for cardiovascular disease prediction.
The method is evaluated across four datasets for cardiovascular disease prediction, which are scattered across different clients.
Date: 2024-05-30T19:15:38Z
- Reduced-Rank Multi-objective Policy Learning and Optimization [57.978477569678844]
In practice, causal researchers do not have a single outcome in mind a priori.
In government-assisted social benefit programs, policymakers collect many outcomes to understand the multidimensional nature of poverty.
We present a data-driven dimensionality-reduction methodology for multiple outcomes in the context of optimal policy learning.
Date: 2024-04-29T08:16:30Z
- Learning Optimal and Fair Policies for Online Allocation of Scarce Societal Resources from Data Collected in Deployment [5.0904557821667]
We use administrative data collected in deployment to design an online policy that maximizes expected outcomes while satisfying budget constraints.
We show that using our policies improves rates of exit from homelessness by 1.9% and that policies that are fair in either allocation or outcomes by race come at a very low price of fairness.
Date: 2023-11-23T01:40:41Z
- Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care [46.2482873419289]
We introduce a deep Q-learning approach to obtain more reliable critical care policies.
We evaluate our method in off-policy and offline settings using simulated environments and real health records from intensive care units.
Date: 2023-06-13T18:02:57Z
- Data-pooling Reinforcement Learning for Personalized Healthcare Intervention [20.436521180168455]
We develop a novel data-pooling reinforcement learning (RL) algorithm based on a general perturbed value iteration framework.
Our algorithm adaptively pools historical data, with three main innovations: (i) the weight of pooling ties directly to the performance of decision (measured by regret) as opposed to estimation accuracy in conventional methods.
We substantiate the theoretical development with empirically better performance of our algorithm via a case study in the context of post-discharge intervention to prevent unplanned readmissions.
Date: 2022-11-16T15:52:49Z
- Reconciling Risk Allocation and Prevalence Estimation in Public Health Using Batched Bandits [0.0]
In many public health settings, there is a perceived tension between allocating resources to known vulnerable areas and learning about the overall prevalence of the problem.
Inspired by a door-to-door Covid-19 testing program we helped design, we combine multi-armed bandit strategies and insights from sampling theory to demonstrate how to recover accurate prevalence estimates while continuing to allocate resources to at-risk areas.
Date: 2021-10-25T22:33:46Z
- Towards a fairer reimbursement system for burn patients using cost-sensitive classification [0.0]
The adoption of the Prospective Payment System (PPS) in the UK has led to the creation of Health Resource Groups (HRGs).
HRGs aim to identify groups of clinically similar patients that share similar resource usage for reimbursement purposes.
We propose a data-driven model and the inclusion of patient-level costing to improve homogeneity in resource usage and severity.
Date: 2021-07-01T15:23:21Z
- The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation [81.72197368690031]
We present a new benchmarking suite designed specifically for medical sequential decision making.
The Medkit-Learn(ing) Environment is a publicly available Python package providing simple and easy access to high-fidelity synthetic medical data.
Date: 2021-06-08T10:38:09Z
- Coordinated Online Learning for Multi-Agent Systems with Coupled Constraints and Perturbed Utility Observations [91.02019381927236]
We introduce a novel method to steer the agents toward a stable population state, fulfilling the given resource constraints.
The proposed method is a decentralized resource pricing method based on the resource loads resulting from the augmentation of the game's Lagrangian.
Date: 2020-10-21T10:11:17Z
- Predictive Modeling of ICU Healthcare-Associated Infections from Imbalanced Data. Using Ensembles and a Clustering-Based Undersampling Approach [55.41644538483948]
This work is focused on both the identification of risk factors and the prediction of healthcare-associated infections in intensive-care units.
The aim is to support decision making addressed at reducing the incidence rate of infections.
Date: 2020-05-07T16:13:12Z