Causal Bandits without prior knowledge using separating sets
- URL: http://arxiv.org/abs/2009.07916v2
- Date: Thu, 29 Sep 2022 12:33:53 GMT
- Title: Causal Bandits without prior knowledge using separating sets
- Authors: Arnoud A.W.M. de Kroon, Danielle Belgrave, Joris M. Mooij
- Abstract summary: The Causal Bandit is a variant of the classic Bandit problem where an agent must identify the best action in a sequential decision-making process.
Methods proposed for this problem thus far in the literature rely on exact prior knowledge of the full causal graph.
We formulate new causal bandit algorithms that no longer necessarily rely on prior causal knowledge.
- Score: 3.1000291317725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Causal Bandit is a variant of the classic Bandit problem where an agent
must identify the best action in a sequential decision-making process, where
the reward distribution of the actions displays a non-trivial dependence
structure that is governed by a causal model. Methods proposed for this problem
thus far in the literature rely on exact prior knowledge of the full causal
graph. We formulate new causal bandit algorithms that no longer necessarily
rely on prior causal knowledge. Instead, they utilize an estimator based on
separating sets, which we can find using simple conditional independence tests
or causal discovery methods. We show that, given a true separating set, for
discrete i.i.d. data, this estimator is unbiased, and has variance which is
upper bounded by that of the sample mean. We develop algorithms based on
Thompson Sampling and UCB for discrete and Gaussian models respectively and
show increased performance on simulation data as well as on a bandit drawing
from real-world protein signaling data.
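For discrete i.i.d. data, an estimator based on a separating (adjustment) set can be sketched with the standard back-door adjustment formula. The function below is an illustrative simplification, not the paper's exact estimator: it assumes `Z` is a valid separating set for the effect of `X` on `Y` and computes $\sum_z P(z)\,E[Y \mid x, z]$ from observed samples.

```python
import numpy as np

def separating_set_estimate(x_obs, z_obs, y_obs, x_target):
    """Estimate E[Y | do(X = x_target)] for discrete data via the
    back-door adjustment formula, assuming z_obs records a valid
    separating (adjustment) set Z. Illustrative sketch only."""
    x_obs = np.asarray(x_obs)
    z_obs = np.asarray(z_obs)
    y_obs = np.asarray(y_obs, dtype=float)
    est = 0.0
    for z in np.unique(z_obs):
        p_z = np.mean(z_obs == z)                  # P(Z = z)
        mask = (x_obs == x_target) & (z_obs == z)  # samples with X = x, Z = z
        if mask.any():
            est += p_z * y_obs[mask].mean()        # P(z) * E[Y | x, z]
    return est
```

In the bandit setting, the appeal of such an estimator is that observational samples sharing the same separating-set value contribute to estimating an interventional quantity, which is how prior causal structure can reduce the samples needed per arm.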
Related papers
- Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in any round based on the observed empirical gaps between the rewards of various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z) - Additive Causal Bandits with Unknown Graph [10.575089475850465]
We explore algorithms to select actions in the causal bandit setting where the learner can choose to intervene on a set of random variables related by a causal graph.
The learner's goal is to quickly find the intervention, among all interventions on observable variables, that maximizes the expectation of an outcome variable.
arXiv Detail & Related papers (2023-06-13T15:43:04Z) - Bivariate Causal Discovery using Bayesian Model Selection [11.726586969589]
We show how to incorporate causal assumptions within the Bayesian framework.
This enables us to construct models with realistic assumptions.
We then outperform previous methods on a wide range of benchmark datasets.
arXiv Detail & Related papers (2023-06-05T14:51:05Z) - Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback.
It includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as its subroutines.
We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower bound asymptotically, and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z) - From Optimality to Robustness: Dirichlet Sampling Strategies in
Stochastic Bandits [0.0]
We study a generic Dirichlet Sampling (DS) algorithm, based on pairwise comparisons of empirical indices computed with re-sampling of the arms' observations.
We show that different variants of this strategy achieve provably optimal regret guarantees when the distributions are bounded and logarithmic regret for semi-bounded distributions with a mild quantile condition.
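The pairwise-comparison idea behind Dirichlet Sampling can be sketched as follows. This is a hedged simplification of the DS scheme, not the authors' implementation: each challenger arm's index is a Dirichlet-weighted re-sampling of its observed rewards, compared against the current leader's empirical mean.

```python
import numpy as np

def dirichlet_index(rewards, rng):
    """Re-sampled index for one arm: a Dirichlet-weighted average of
    its observed rewards (a non-parametric bootstrap of the mean)."""
    w = rng.dirichlet(np.ones(len(rewards)))
    return float(w @ np.asarray(rewards, dtype=float))

def ds_choose_arm(history, leader, rng):
    """Simplified DS round: any challenger whose re-sampled index
    beats the leader's empirical mean is played; otherwise play the
    leader. `history` maps arm -> list of observed rewards."""
    leader_mean = float(np.mean(history[leader]))
    for arm, rewards in history.items():
        if arm != leader and dirichlet_index(rewards, rng) > leader_mean:
            return arm
    return leader
```

The randomness of the Dirichlet weights plays the role that posterior sampling plays in Thompson Sampling, but without committing to a parametric reward model.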
arXiv Detail & Related papers (2021-11-18T14:34:21Z) - Mean-based Best Arm Identification in Stochastic Bandits under Reward
Contamination [80.53485617514707]
This paper proposes two algorithms, a gap-based algorithm and one based on the successive elimination, for best arm identification in sub-Gaussian bandits.
Specifically, for the gap-based algorithm, the sample complexity is optimal up to constant factors, while for the successive elimination, it is optimal up to logarithmic factors.
arXiv Detail & Related papers (2021-11-14T21:49:58Z) - Universal and data-adaptive algorithms for model selection in linear
contextual bandits [52.47796554359261]
We consider the simplest non-trivial instance of model-selection: distinguishing a simple multi-armed bandit problem from a linear contextual bandit problem.
We introduce new algorithms that explore in a data-adaptive manner and provide guarantees of the form $\mathcal{O}(d^{\alpha} T^{1-\alpha})$.
Our approach extends to model selection among nested linear contextual bandits under some additional assumptions.
arXiv Detail & Related papers (2021-11-08T18:05:35Z) - Estimation of Bivariate Structural Causal Models by Variational Gaussian
Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z) - Causal Expectation-Maximisation [70.45873402967297]
We show that causal inference is NP-hard even in models characterised by polytree-shaped graphs.
We introduce the causal EM algorithm to reconstruct the uncertainty about the latent variables from data about categorical manifest variables.
We argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.
arXiv Detail & Related papers (2020-11-04T10:25:13Z) - A Causal Direction Test for Heterogeneous Populations [10.653162005300608]
Most causal models assume a single homogeneous population, an assumption that may fail to hold in many applications.
We show that when the homogeneity assumption is violated, causal models developed based on such assumption can fail to identify the correct causal direction.
We propose an adjustment to a commonly used causal direction test statistic by using a $k$-means type clustering algorithm.
arXiv Detail & Related papers (2020-06-08T18:59:14Z) - Inference for Batched Bandits [9.468593929311867]
We develop methods for inference on data collected in batches using a bandit algorithm.
We first prove that the ordinary least squares estimator (OLS) is not asymptotically normal on data collected using standard bandit algorithms when there is no unique optimal arm.
Second, we introduce the Batched OLS estimator (BOLS) that we prove is (1) normal on data collected from both multi-arm and contextual bandits and (2) robust to non-stationarity in the baseline reward.
arXiv Detail & Related papers (2020-02-08T18:59:47Z)
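The batch-wise normalisation behind BOLS can be sketched in a few lines. This is a deliberately simplified, hedged version of the estimator described above: the treatment effect is estimated separately within each batch as a difference of arm means, and the per-batch estimates are then averaged, so that adaptive arm allocation across batches cannot bias the combined estimate.

```python
import numpy as np

def batched_ols_effect(batches):
    """BOLS-flavoured sketch: `batches` is a list of (arms, rewards)
    pairs, where arms is a 0/1 assignment vector for one batch.
    Estimate the effect within each batch, then average across
    batches. A simplification of the estimator described above."""
    per_batch = []
    for arms, rewards in batches:
        arms = np.asarray(arms)
        rewards = np.asarray(rewards, dtype=float)
        effect = rewards[arms == 1].mean() - rewards[arms == 0].mean()
        per_batch.append(effect)
    return float(np.mean(per_batch))
```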
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.