Causal Bandits with Unknown Graph Structure
- URL: http://arxiv.org/abs/2106.02988v1
- Date: Sat, 5 Jun 2021 23:41:38 GMT
- Title: Causal Bandits with Unknown Graph Structure
- Authors: Yangyi Lu, Amirhossein Meisami, Ambuj Tewari
- Abstract summary: We develop novel causal bandit algorithms without knowing the causal graph.
Our algorithms work well for causal trees, causal forests and a general class of causal graphs.
- Score: 24.58691421788476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In causal bandit problems, the action set consists of interventions on
variables of a causal graph. Several researchers have recently studied such
bandit problems and pointed out their practical applications. However, all
existing works rely on a restrictive and impractical assumption that the
learner is given full knowledge of the causal graph structure upfront. In this
paper, we develop novel causal bandit algorithms without knowing the causal
graph. Our algorithms work well for causal trees, causal forests and a general
class of causal graphs. The regret guarantees of our algorithms greatly improve
upon those of standard multi-armed bandit (MAB) algorithms under mild
conditions. Lastly, we prove our mild conditions are necessary: without them
one cannot do better than standard MAB algorithms.
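To make the setting concrete, the sketch below simulates a toy causal chain X1 -> X2 -> Y in which each action is an atomic intervention (or observational play) and the reward is the value of Y, then runs a plain UCB1 baseline over that action set. The chain, its parameters, and the helper names (`sample_reward`, `actions`) are illustrative assumptions, not the paper's model or algorithm; the graph-unaware baseline is exactly the kind of standard MAB approach that the paper's graph-free causal algorithms aim to improve upon.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy causal chain X1 -> X2 -> Y (all binary); the reward variable is Y.
# An atomic intervention do(node = value) clamps one variable; action None
# means purely observational play. All probabilities below are made up.
def sample_reward(intervention,
                  p_x1=0.5,
                  p_x2_given_x1=(0.2, 0.8),
                  p_y_given_x2=(0.3, 0.7)):
    x1 = rng.random() < p_x1
    if intervention is not None and intervention[0] == "X1":
        x1 = bool(intervention[1])
    x2 = rng.random() < p_x2_given_x1[int(x1)]
    if intervention is not None and intervention[0] == "X2":
        x2 = bool(intervention[1])
    return float(rng.random() < p_y_given_x2[int(x2)])

# Causal-bandit action set: every atomic intervention plus "do nothing".
actions = [None, ("X1", 0), ("X1", 1), ("X2", 0), ("X2", 1)]

# Graph-unaware baseline: ordinary UCB1 over the |actions| arms.
T = 2000
counts = np.zeros(len(actions))
sums = np.zeros(len(actions))
for t in range(1, T + 1):
    bonus = np.sqrt(2.0 * np.log(t) / np.maximum(counts, 1))
    ucb = np.where(counts > 0, sums / np.maximum(counts, 1) + bonus, np.inf)
    a = int(np.argmax(ucb))
    r = sample_reward(actions[a])
    counts[a] += 1
    sums[a] += r

# Empirical mean reward of each intervention after T rounds.
print({str(act): round(sums[i] / counts[i], 3) for i, act in enumerate(actions)})
```

The baseline treats interventions as unrelated arms; the paper's point is that, under mild conditions, exploiting the (initially unknown) causal structure yields regret guarantees that this kind of structure-blind approach cannot match.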
Related papers
- Bridging Rested and Restless Bandits with Graph-Triggering: Rising and Rotting [67.1631453378926]
Graph-Triggered Bandits is a framework to generalize rested and restless bandits.
In this work, we focus on two specific types of monotonic bandits: rising, where the expected reward of an arm grows as the number of triggers increases, and rotting, where the opposite behavior occurs.
arXiv Detail & Related papers (2024-09-09T18:23:07Z)
- Zero-Inflated Bandits [11.60342504007264]
We study zero-inflated bandits, where the reward is modeled as a classic semi-parametric distribution called zero-inflated distribution.
We derive the regret bounds under both multi-armed bandits with general reward assumptions and contextual generalized linear bandit with sub-Gaussian rewards.
In many settings, the regret rates of our algorithms are either minimax optimal or state-of-the-art.
arXiv Detail & Related papers (2023-12-25T03:13:21Z)
- Bandit Social Learning: Exploration under Myopic Behavior [58.75758600464338]
We study social learning dynamics motivated by reviews on online platforms.
Agents collectively follow a simple multi-armed bandit protocol, but each agent acts myopically, without regard to exploration.
We derive stark learning failures for any such behavior, and provide matching positive results.
arXiv Detail & Related papers (2023-02-15T01:57:57Z)
- Causal Bandits without Graph Learning [28.021500949026766]
We develop an efficient algorithm for finding the parent node of the reward node using atomic interventions.
We extend our algorithm to the case when the reward node has multiple parents.
arXiv Detail & Related papers (2023-01-26T20:27:14Z)
- Pure Exploration of Causal Bandits [9.77519365079468]
The causal bandit problem integrates causal inference with multi-armed bandits.
The online learning task is: given a causal graph with unknown causal inference distributions, in each round the learner either intervenes on one variable or performs no intervention.
We provide the first gap-dependent, fully adaptive pure exploration algorithms for three types of causal models.
arXiv Detail & Related papers (2022-06-16T02:19:37Z)
- Optimal Gradient-based Algorithms for Non-concave Bandit Optimization [76.57464214864756]
This work considers a large family of bandit problems where the unknown underlying reward function is non-concave.
Our algorithms are based on a unified zeroth-order optimization paradigm that applies in great generality.
We show that the standard optimistic algorithms are sub-optimal by dimension factors.
arXiv Detail & Related papers (2021-07-09T16:04:24Z)
- Causal Bandits on General Graphs [1.4502611532302039]
We study the problem of determining the best intervention in a Causal Bayesian Network (CBN) specified only by its causal graph.
We propose a simple regret minimization algorithm that takes as input a semi-Markovian causal graph with atomic interventions and possibly unobservable variables.
Our results indicate that the simple regret guarantee of our proposed algorithm can only be improved by considering more nuanced structural restrictions on the causal graph.
arXiv Detail & Related papers (2021-07-06T17:29:45Z)
- Upper Confidence Bounds for Combining Stochastic Bandits [52.10197476419621]
We provide a simple method to combine bandit algorithms.
Our approach is based on a "meta-UCB" procedure that treats each of $N$ individual bandit algorithms as arms in a higher-level $N$-armed bandit problem (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2020-12-24T05:36:29Z)
- Approximation Theory Based Methods for RKHS Bandits [9.391375268580806]
The RKHS bandit problem is an online optimization problem over non-linear functions with noisy feedback.
Until now, there has been no general algorithm for the adversarial RKHS bandit problem.
We propose efficient algorithms for the RKHS bandit problem and the first general algorithm for the adversarial RKHS bandit problem.
arXiv Detail & Related papers (2020-10-23T05:14:21Z)
- Corralling Stochastic Bandit Algorithms [54.10645564702416]
We show that the regret of the corralling algorithms is no worse than that of the best algorithm containing the arm with the highest reward.
This bound depends on the gap between the highest reward and the rewards of the other arms.
arXiv Detail & Related papers (2020-06-16T15:33:12Z)
- A Novel Confidence-Based Algorithm for Structured Bandits [129.30402124516507]
We study finite-armed bandits where the rewards of each arm might be correlated to those of other arms.
We introduce a novel phased algorithm that exploits the given structure to build confidence sets over the parameters of the true bandit problem.
arXiv Detail & Related papers (2020-05-23T19:52:44Z)
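As referenced in the "Upper Confidence Bounds for Combining Stochastic Bandits" entry above, the sketch below illustrates the meta-UCB idea in its simplest form: $N$ base bandit algorithms are treated as the arms of a higher-level $N$-armed UCB1 bandit, and the base algorithm chosen at each round plays one step and receives the resulting reward. The epsilon-greedy base learners, the Bernoulli environment, and all parameter values are illustrative assumptions; this naive version omits the refinements of the actual corralling/meta-UCB analyses.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 5-armed Bernoulli environment shared by all base algorithms (made-up means).
true_means = np.array([0.2, 0.35, 0.5, 0.65, 0.8])

class EpsGreedy:
    """A simple base bandit algorithm: epsilon-greedy over K arms."""
    def __init__(self, k, eps):
        self.eps = eps
        self.counts = np.zeros(k)
        self.sums = np.zeros(k)

    def select(self):
        if self.counts.min() == 0 or rng.random() < self.eps:
            return int(rng.integers(len(self.counts)))
        return int(np.argmax(self.sums / self.counts))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.sums[arm] += reward

# Meta level: each base algorithm is an "arm" of a higher-level UCB1 bandit.
bases = [EpsGreedy(len(true_means), eps) for eps in (0.01, 0.1, 0.3)]
meta_counts = np.zeros(len(bases))
meta_sums = np.zeros(len(bases))

T = 5000
for t in range(1, T + 1):
    bonus = np.sqrt(2.0 * np.log(t) / np.maximum(meta_counts, 1))
    ucb = np.where(meta_counts > 0,
                   meta_sums / np.maximum(meta_counts, 1) + bonus, np.inf)
    i = int(np.argmax(ucb))            # choose which base algorithm plays
    arm = bases[i].select()            # that algorithm chooses an arm
    reward = float(rng.random() < true_means[arm])
    bases[i].update(arm, reward)       # the base algorithm learns
    meta_counts[i] += 1                # and so does the meta level
    meta_sums[i] += reward

print("rounds given to each base algorithm:", meta_counts.astype(int))
```

Over time the meta level concentrates its plays on whichever base algorithm earns the highest reward, which is the behavior the combining/corralling guarantees formalize.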