Identifiable latent bandits: Combining observational data and exploration for personalized healthcare
- URL: http://arxiv.org/abs/2407.16239v2
- Date: Mon, 29 Jul 2024 14:04:20 GMT
- Title: Identifiable latent bandits: Combining observational data and exploration for personalized healthcare
- Authors: Ahmet Zahid Balcıoğlu, Emil Carlsson, Fredrik D. Johansson,
- Abstract summary: Bandit algorithms hold great promise for improving personalized decision-making but are notoriously sample-hungry.
In most health applications, it is infeasible to fit a new bandit for each patient, and observable variables are often insufficient to determine optimal treatments.
Latent bandits offer both rapid exploration and personalization beyond what context variables can reveal.
We propose bandit algorithms based on nonlinear independent component analysis that can be provably identified from observational data to a degree sufficient to infer the optimal action in a new bandit instance consistently.
- Score: 7.731569068280131
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bandit algorithms hold great promise for improving personalized decision-making but are notoriously sample-hungry. In most health applications, it is infeasible to fit a new bandit for each patient, and observable variables are often insufficient to determine optimal treatments, ruling out applying contextual bandits learned from multiple patients. Latent bandits offer both rapid exploration and personalization beyond what context variables can reveal but require that a latent variable model can be learned consistently. In this work, we propose bandit algorithms based on nonlinear independent component analysis that can be provably identified from observational data to a degree sufficient to infer the optimal action in a new bandit instance consistently. We verify this strategy in simulated data, showing substantial improvement over learning independent multi-armed bandits for every instance.
Related papers
- Thompson sampling for zero-inflated count outcomes with an application to the Drink Less mobile health study [1.5936659933030128]
Mobile health interventions aim to improve distal outcomes, such as clinical conditions, by optimizing proximal outcomes through just-in-time adaptive interventions.
Contextual bandits provide a suitable framework for customizing such interventions according to individual time-varying contexts.
The current work addresses this challenge by leveraging count data models into online decision-making approaches.
arXiv Detail & Related papers (2023-11-24T09:02:24Z) - Robust Best-arm Identification in Linear Bandits [25.91361349646875]
We study the robust best-arm identification problem (RBAI) in the case of linear rewards.
We present an instance-dependent lower bound for the robust best-arm identification problem with linear rewards.
Our algorithms prove to be effective in identifying robust dosage values across various age ranges of patients.
arXiv Detail & Related papers (2023-11-08T14:58:11Z) - Leveraging Unlabelled Data in Multiple-Instance Learning Problems for
Improved Detection of Parkinsonian Tremor in Free-Living Conditions [80.88681952022479]
We introduce a new method for combining semi-supervised with multiple-instance learning.
We show that by leveraging the unlabelled data of 454 subjects we can achieve large performance gains in per-subject tremor detection.
arXiv Detail & Related papers (2023-04-29T12:25:10Z) - ORF-Net: Deep Omni-supervised Rib Fracture Detection from Chest CT Scans [47.7670302148812]
radiologists need to investigate and annotate rib fractures on a slice-by-slice basis.
We propose a novel omni-supervised object detection network, which can exploit multiple different forms of annotated data.
Our proposed method outperforms other state-of-the-art approaches consistently.
arXiv Detail & Related papers (2022-07-05T07:06:57Z) - MURAL: An Unsupervised Random Forest-Based Embedding for Electronic
Health Record Data [59.26381272149325]
We present an unsupervised random forest for representing data with disparate variable types.
MURAL forests consist of a set of decision trees where node-splitting variables are chosen at random.
We show that using our approach, we can visualize and classify data more accurately than competing approaches.
arXiv Detail & Related papers (2021-11-19T22:02:21Z) - From Optimality to Robustness: Dirichlet Sampling Strategies in
Stochastic Bandits [0.0]
We study a generic Dirichlet Sampling (DS) algorithm, based on pairwise comparisons of empirical indices computed with re-sampling of the arms' observations.
We show that different variants of this strategy achieve provably optimal regret guarantees when the distributions are bounded and logarithmic regret for semi-bounded distributions with a mild quantile condition.
arXiv Detail & Related papers (2021-11-18T14:34:21Z) - Reconciling Risk Allocation and Prevalence Estimation in Public Health
Using Batched Bandits [0.0]
In many public health settings, there is a perceived tension between allocating resources to known vulnerable areas and learning about the overall prevalence of the problem.
Inspired by a door-to-door Covid-19 testing program we helped design, we combine multi-armed bandit strategies and insights from sampling theory to demonstrate how to recover accurate prevalence estimates while continuing to allocate resources to at-risk areas.
arXiv Detail & Related papers (2021-10-25T22:33:46Z) - Analysis of Thompson Sampling for Partially Observable Contextual
Multi-Armed Bandits [1.8275108630751844]
We propose a Thompson Sampling algorithm for partially observable contextual multi-armed bandits.
We show that the regret of the presented policy scales logarithmically with time and the number of arms, and linearly with the dimension.
arXiv Detail & Related papers (2021-10-23T08:51:49Z) - Bias-Robust Bayesian Optimization via Dueling Bandit [57.82422045437126]
We consider Bayesian optimization in settings where observations can be adversarially biased.
We propose a novel approach for dueling bandits based on information-directed sampling (IDS)
Thereby, we obtain the first efficient kernelized algorithm for dueling bandits that comes with cumulative regret guarantees.
arXiv Detail & Related papers (2021-05-25T10:08:41Z) - Robustness Guarantees for Mode Estimation with an Application to Bandits [131.21717367564963]
We introduce a theory for multi-armed bandits where the values are the modes of the reward distributions instead of the mean.
We show in simulations that our algorithms are robust to perturbation of the arms by adversarial noise sequences.
arXiv Detail & Related papers (2020-03-05T21:29:27Z) - The Simulator: Understanding Adaptive Sampling in the
Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling called the em Simulator.
We prove the first instance-based lower bounds the top-k problem which incorporate the appropriate log-factors.
Our new analysis inspires a simple and near-optimal for the best-arm and top-k identification, the first em practical of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.