Causal Bandits on General Graphs
- URL: http://arxiv.org/abs/2107.02772v1
- Date: Tue, 6 Jul 2021 17:29:45 GMT
- Title: Causal Bandits on General Graphs
- Authors: Aurghya Maiti, Vineet Nair, Gaurav Sinha
- Abstract summary: We study the problem of determining the best intervention in a Causal Bayesian Network (CBN) specified only by its causal graph.
We propose a simple regret minimization algorithm that takes as input a semi-Markovian causal graph with atomic interventions and possibly unobservable variables.
Our results indicate that the simple regret guarantee of our proposed algorithm can only be improved by considering more nuanced structural restrictions on the causal graph.
- Score: 1.4502611532302039
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We study the problem of determining the best intervention in a Causal
Bayesian Network (CBN) specified only by its causal graph. We model this as a
stochastic multi-armed bandit (MAB) problem with side-information, where the
interventions correspond to the arms of the bandit instance. First, we propose
a simple regret minimization algorithm that takes as input a semi-Markovian
causal graph with atomic interventions and possibly unobservable variables, and
achieves $\tilde{O}(\sqrt{M/T})$ expected simple regret, where $M$ is dependent
on the input CBN and could be very small compared to the number of arms. We
also show that this is almost optimal for CBNs described by causal graphs
having an $n$-ary tree structure. Our simple regret minimization results, both
upper and lower bound, subsume previous results in the literature, which
assumed additional structural restrictions on the input causal graph. In
particular, our results indicate that the simple regret guarantee of our
proposed algorithm can only be improved by considering more nuanced structural
restrictions on the causal graph. Next, we propose a cumulative regret
minimization algorithm that takes as input a general causal graph with all
observable nodes and atomic interventions and performs better than the optimal
MAB algorithm that does not take causal side-information into account. We also
experimentally compare both our algorithms with the best known algorithms in
the literature. To the best of our knowledge, this work gives the first simple
and cumulative regret minimization algorithms for CBNs with general causal
graphs under atomic interventions and having unobserved confounders.
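As a concrete illustration of the setup above, the following Python sketch builds a toy CBN whose arms are atomic interventions and measures simple regret for a naive explore-then-commit baseline. Everything here (the `ToyCausalBandit` environment, its logistic reward model, and the baseline itself) is an illustrative assumption rather than the paper's algorithm; the paper's method exploits the causal graph to achieve $\tilde{O}(\sqrt{M/T})$ simple regret, whereas a graph-agnostic baseline like this one pays for the full number of arms.

```python
# A minimal sketch of a causal bandit with atomic interventions.
# All names and the reward model below are illustrative assumptions;
# this is NOT the algorithm proposed in the paper.
import numpy as np

rng = np.random.default_rng(0)

class ToyCausalBandit:
    """Toy CBN: k binary causes X_1..X_k feed a Bernoulli reward Y."""

    def __init__(self, k=4):
        self.k = k
        self.p = rng.uniform(0.2, 0.8, size=k)  # P(X_i = 1) absent intervention
        self.w = rng.normal(size=k)             # weights defining E[Y | X]

    def arms(self):
        # (None, None) = purely observational arm; (i, v) = do(X_i = v)
        return [(None, None)] + [(i, v) for i in range(self.k) for v in (0, 1)]

    def pull(self, arm):
        i, v = arm
        x = (rng.random(self.k) < self.p).astype(float)  # sample the causes
        if i is not None:
            x[i] = float(v)                      # apply the atomic intervention
        mean = 1.0 / (1.0 + np.exp(-(x @ self.w)))
        return float(rng.random() < mean)        # Bernoulli reward Y

def explore_then_commit(env, T):
    """Graph-agnostic baseline: spread the budget T evenly over all arms,
    then return the empirical best; its simple regret scales with the
    number of arms rather than the graph-dependent quantity M."""
    arms = env.arms()
    sums, counts = np.zeros(len(arms)), np.zeros(len(arms))
    for t in range(T):
        a = t % len(arms)
        sums[a] += env.pull(arms[a])
        counts[a] += 1
    return arms[int(np.argmax(sums / np.maximum(counts, 1)))]
```

Simple regret is then the gap in expected reward between the best intervention and the one returned after $T$ rounds; a quick Monte Carlo check:

```python
env = ToyCausalBandit()
means = [np.mean([env.pull(a) for _ in range(20000)]) for a in env.arms()]
chosen = explore_then_commit(env, T=2000)
print(f"simple regret ~ {max(means) - means[env.arms().index(chosen)]:.3f}")
```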
Related papers
- Confounded Budgeted Causal Bandits [28.199741662190203]
We study the problem of learning 'good' interventions in an environment modeled by its underlying causal graph.
Good interventions refer to interventions that maximize rewards.
We propose an algorithm to minimize the cumulative regret in general causal graphs.
arXiv Detail & Related papers (2024-01-15T10:26:13Z) - Combinatorial Causal Bandits without Graph Skeleton [12.615590470530227]
We study the combinatorial causal bandit (CCB) problem without the graph structure, on binary general causal models and binary generalized linear models (BGLMs).
We design a regret algorithm for BGLMs even without the graph skeleton and show that it still achieves $O(\sqrt{T}\ln T)$ expected regret.
We propose another algorithm with $O(T^{2/3}\ln T)$ regret to remove the weight gap assumption.
arXiv Detail & Related papers (2023-01-31T03:45:17Z) - Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs [62.52390282012508]
We study high-probability regret bounds for adversarial $K$-armed bandits with time-varying feedback graphs over $T$ rounds.
We develop an algorithm that achieves the optimal regret $\widetilde{\mathcal{O}}\big((\sum_{t=1}^{T}\alpha_t)^{1/2}+\max_{t\in[T]}\alpha_t\big)$ with high probability, where $\alpha_t$ is the independence number of the feedback graph at round $t$.
We also develop the first algorithm that achieves the optimal high-probability regret bound for weakly observable graphs.
arXiv Detail & Related papers (2022-10-04T04:36:15Z) - Causal Bandits for Linear Structural Equation Models [58.2875460517691]
This paper studies the problem of designing an optimal sequence of interventions in a causal graphical model.
It is assumed that the graph's structure is known and has $N$ nodes.
Two algorithms are proposed for the frequentist (UCB-based) and Bayesian settings.
arXiv Detail & Related papers (2022-08-26T16:21:31Z) - Verification and search algorithms for causal DAGs [11.038866111306728]
We give a characterization of a minimal sized set of atomic interventions to check the correctness of a claimed causal graph.
We generalize our results to the settings of bounded size interventions and node-dependent interventional costs.
Ours is the first known algorithm that gives a non-trivial approximation guarantee on the verifying size for general unweighted graphs.
arXiv Detail & Related papers (2022-06-30T15:52:30Z) - Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper-bounds can be extended for two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z) - The Performance of the MLE in the Bradley-Terry-Luce Model in $\ell_{\infty}$-Loss and under General Graph Topologies [76.61051540383494]
We derive novel, general upper bounds on the $\ell_{\infty}$ estimation error of the Bradley-Terry-Luce model.
We demonstrate that the derived bounds perform well and in some cases are sharper compared to known results.
arXiv Detail & Related papers (2021-10-20T23:46:35Z) - Solving correlation clustering with QAOA and a Rydberg qudit system: a full-stack approach [94.37521840642141]
We study the correlation clustering problem using the quantum approximate optimization algorithm (QAOA) and qudits.
Specifically, we consider a neutral atom quantum computer and propose a full stack approach for correlation clustering.
We show the qudit implementation is superior to the qubit encoding as quantified by the gate count.
arXiv Detail & Related papers (2021-06-22T11:07:38Z) - Budgeted and Non-budgeted Causal Bandits [2.9005223064604078]
Learning good interventions in a causal graph can be modelled as a multi-armed bandit problem with side-information.
We propose an algorithm to minimize simple regret that optimally trades off observations and interventions based on the cost of intervention.
Our results are experimentally validated and compared to the best-known bounds in the current literature.
arXiv Detail & Related papers (2020-12-13T13:31:14Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.