Confounded Budgeted Causal Bandits
- URL: http://arxiv.org/abs/2401.07578v1
- Date: Mon, 15 Jan 2024 10:26:13 GMT
- Title: Confounded Budgeted Causal Bandits
- Authors: Fateme Jamshidi, Jalal Etesami, Negar Kiyavash
- Abstract summary: We study the problem of learning 'good' interventions in an environment modeled by its underlying causal graph.
Good interventions refer to interventions that maximize rewards.
We propose an algorithm to minimize the cumulative regret in general causal graphs.
- Score: 28.199741662190203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning 'good' interventions in a stochastic
environment modeled by its underlying causal graph. Good interventions refer to
interventions that maximize rewards. Specifically, we consider the setting of a
pre-specified budget constraint, where interventions can have non-uniform
costs. We show that this problem can be formulated as maximizing the expected
reward for a stochastic multi-armed bandit with side information. We propose an
algorithm to minimize the cumulative regret in general causal graphs. This
algorithm trades off observations and interventions based on their costs to
achieve the optimal reward. It generalizes state-of-the-art methods by
allowing non-uniform costs and hidden confounders in the causal
graph. Furthermore, we develop an algorithm to minimize the simple regret in
the budgeted setting with non-uniform costs and also general causal graphs. We
provide theoretical guarantees, including both upper and lower bounds, as well
as empirical evaluations of our algorithms. Our empirical results show that
our algorithms outperform the state of the art.
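To make the observe-versus-intervene trade-off concrete, here is a minimal Python sketch of a budget-constrained causal bandit loop. Everything in it (the two-phase budget split, the costs, and the toy confounded environment) is an illustrative assumption, not the algorithm from the paper.

```python
import random

random.seed(0)

# Minimal sketch: cheap (possibly confounded) observations vs. costly
# atomic interventions under a hard budget. Costs, the 50/50 budget
# split, and the toy environment are all hypothetical.
ARMS = ["do(X=0)", "do(X=1)"]
COST = {"observe": 1.0, "do(X=0)": 5.0, "do(X=1)": 5.0}
BUDGET = 200.0

def environment(action):
    """Bernoulli rewards; the observational mean is biased by a hidden confounder."""
    if action == "observe":
        return 1.0 if random.random() < 0.4 else 0.0
    p = 0.3 if action == "do(X=0)" else 0.6
    return 1.0 if random.random() < p else 0.0

counts = {a: 0 for a in ARMS + ["observe"]}
sums = {a: 0.0 for a in ARMS + ["observe"]}
budget = BUDGET

while budget > 0:
    # Spend the first half of the budget on cheap observations, then
    # switch to interventions (a crude stand-in for a cost trade-off).
    if budget > BUDGET / 2:
        action = "observe"
    else:
        action = max(ARMS, key=lambda a: sums[a] / counts[a] if counts[a] else float("inf"))
    if COST[action] > budget:
        break
    reward = environment(action)
    counts[action] += 1
    sums[action] += reward
    budget -= COST[action]

best = max(ARMS, key=lambda a: sums[a] / counts[a] if counts[a] else 0.0)
print("estimated best intervention:", best)
```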
Related papers
- Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in any round based on the observed empirical gaps between the rewards of various arms.
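The core idea, rejecting an arm once its empirical mean falls too far behind the current leader, can be sketched in a few lines of Python; the fixed threshold below is an invented stand-in for the paper's adaptive rejection rule.

```python
import statistics

def reject_arms(rewards_by_arm, gap_threshold=0.2):
    """Drop arms whose empirical mean trails the leader by more than the threshold.
    rewards_by_arm maps each arm to its list of observed rewards."""
    means = {a: statistics.mean(r) for a, r in rewards_by_arm.items()}
    leader = max(means.values())
    return {a for a, m in means.items() if leader - m > gap_threshold}

print(reject_arms({"a": [0.9, 0.8], "b": [0.3, 0.2], "c": [0.7, 0.75]}))
# {'b'} -- only arm 'b' is rejected on its empirical gap to the leader
```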
arXiv Detail & Related papers (2023-12-19T13:17:43Z)
- High-dimensional Contextual Bandit Problem without Sparsity [8.782204980889077]
We propose an explore-then-commit (EtC) algorithm to address this problem and examine its performance.
We derive the optimal rate of the EtC algorithm in terms of $T$ and show that this rate can be achieved by balancing exploration and exploitation.
We introduce an adaptive explore-then-commit (AEtC) algorithm that adaptively finds the optimal balance.
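A bare-bones version of the explore-then-commit template reads as follows; the horizon, per-arm exploration count, and Bernoulli arms are illustrative, and the adaptive balancing done by AEtC is not shown.

```python
import random

random.seed(1)

MEANS = [0.2, 0.5, 0.4]        # true arm means, unknown to the learner
T, PULLS_PER_ARM = 1000, 50    # horizon and fixed exploration length

totals = [0.0] * len(MEANS)
reward = 0.0
for arm, mu in enumerate(MEANS):               # exploration phase
    for _ in range(PULLS_PER_ARM):
        r = 1.0 if random.random() < mu else 0.0
        totals[arm] += r
        reward += r

best = max(range(len(MEANS)), key=lambda a: totals[a])
for _ in range(T - PULLS_PER_ARM * len(MEANS)):  # commit phase
    reward += 1.0 if random.random() < MEANS[best] else 0.0

print(f"committed to arm {best}, cumulative reward {reward:.0f}")
```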
arXiv Detail & Related papers (2023-06-19T15:29:32Z)
- Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation [69.0695698566235]
We study reinforcement learning with linear function approximation and adversarially changing cost functions.
We present a computationally efficient policy optimization algorithm for the challenging general setting of unknown dynamics and bandit feedback.
arXiv Detail & Related papers (2023-01-30T17:26:39Z)
- Causal Bandits for Linear Structural Equation Models [58.2875460517691]
This paper studies the problem of designing an optimal sequence of interventions in a causal graphical model.
It is assumed that the graph's structure is known and has $N$ nodes.
Two algorithms are proposed for the frequentist (UCB-based) and Bayesian settings.
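For readers unfamiliar with the frequentist side, the generic UCB index (shown below in its textbook UCB1 form, not the paper's graph-aware variant) adds an exploration bonus to each arm's empirical mean:

```python
import math

def ucb1_index(mean, pulls, t):
    """Textbook UCB1 index: empirical mean plus an exploration bonus
    that shrinks as an arm is pulled more often."""
    return mean + math.sqrt(2.0 * math.log(t) / pulls)

stats = {"arm0": (0.40, 10), "arm1": (0.55, 5)}  # arm -> (empirical mean, pulls)
t = 15
choice = max(stats, key=lambda a: ucb1_index(*stats[a], t))
print(choice)  # arm1: higher mean and a larger bonus from fewer pulls
```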
arXiv Detail & Related papers (2022-08-26T16:21:31Z)
- Intervention Efficient Algorithm for Two-Stage Causal MDPs [15.838256272508357]
We study Markov Decision Processes (MDP) wherein states correspond to causal graphs that generate rewards.
In this setup, the learner's goal is to identify atomic interventions that lead to high rewards by intervening on variables at each state.
Generalizing the recent causal-bandit framework, the current work develops (simple) regret minimization guarantees for two-stage causal MDPs.
arXiv Detail & Related papers (2021-11-01T12:22:37Z)
- Causal Bandits on General Graphs [1.4502611532302039]
We study the problem of determining the best intervention in a Causal Bayesian Network (CBN) specified only by its causal graph.
We propose a simple regret minimization algorithm that takes as input a semi-Markovian causal graph with atomic interventions and possibly unobservable variables.
Our results indicate that the simple regret guarantee of our proposed algorithm can only be improved by considering more nuanced structural restrictions on the causal graph.
arXiv Detail & Related papers (2021-07-06T17:29:45Z)
- Budgeted and Non-budgeted Causal Bandits [2.9005223064604078]
Learning good interventions in a causal graph can be modelled as a multi-armed bandit problem with side-information.
We propose an algorithm to minimize simple regret that optimally trades off observations and interventions based on the cost of intervention.
Our results are experimentally validated and compared to the best-known bounds in the current literature.
arXiv Detail & Related papers (2020-12-13T13:31:14Z)
- Stochastic Hard Thresholding Algorithms for AUC Maximization [49.00683387735522]
We develop a hard thresholding algorithm for AUC maximization in distributed classification.
We conduct experiments to show the efficiency and effectiveness of the proposed algorithms.
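The hard thresholding step such methods rely on simply keeps the k largest-magnitude coordinates of the iterate and zeroes the rest; a generic sketch (not the paper's AUC-specific update) is:

```python
def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w; zero out the rest."""
    keep = set(sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)[:k])
    return [wi if i in keep else 0.0 for i, wi in enumerate(w)]

print(hard_threshold([0.1, -2.0, 0.5, 1.5, -0.05], k=2))
# [0.0, -2.0, 0.0, 1.5, 0.0] -- only the two largest-magnitude weights survive
```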
arXiv Detail & Related papers (2020-11-04T16:49:29Z)
- An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z)
- Budget-Constrained Bandits over General Cost and Reward Distributions [32.63624728528415]
We consider a budget-constrained bandit problem where each arm pull incurs a random cost, and yields a random reward in return.
We show that if moments of order $(2+\gamma)$ for some $\gamma > 0$ exist for all cost-reward pairs, $O(\log B)$ regret is achievable for a budget $B>0$.
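Written out (with $X$ and $R$ denoting the cost and reward of a single pull, shorthand introduced here for readability), the claim is:

```latex
\[
  \mathbb{E}\left[|X|^{2+\gamma}\right] < \infty
  \ \text{ and }\
  \mathbb{E}\left[|R|^{2+\gamma}\right] < \infty
  \ \text{ for some } \gamma > 0
  \;\Longrightarrow\;
  \operatorname{Regret}(B) = O(\log B)
  \ \text{ for any budget } B > 0.
\]
```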
arXiv Detail & Related papers (2020-02-29T23:50:08Z)
- Fair Correlation Clustering [92.15492066925977]
We obtain approximation algorithms for correlation clustering under several important types of fairness constraints.
We show that fair solutions to correlation clustering can be obtained with limited increase in cost compared to the state-of-the-art (unfair) algorithms.
arXiv Detail & Related papers (2020-02-06T14:28:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.