Budgeted and Non-budgeted Causal Bandits
- URL: http://arxiv.org/abs/2012.07058v1
- Date: Sun, 13 Dec 2020 13:31:14 GMT
- Title: Budgeted and Non-budgeted Causal Bandits
- Authors: Vineet Nair, Vishakha Patil, Gaurav Sinha
- Abstract summary: Learning good interventions in a causal graph can be modelled as a multi-armed bandit problem with side-information.
We propose an algorithm to minimize simple regret that optimally trades off observations and interventions based on the cost of intervention.
Our results are experimentally validated and compared to the best-known bounds in the current literature.
- Score: 2.9005223064604078
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Learning good interventions in a causal graph can be modelled as a stochastic
multi-armed bandit problem with side-information. First, we study this problem
when interventions are more expensive than observations and a budget is
specified. When there are no backdoor paths from an intervenable node to the
reward node, we propose an algorithm that minimizes simple regret by
optimally trading off observations and interventions based on the cost of
intervention. We also propose an algorithm that accounts for the cost of
interventions, utilizes causal side-information, and minimizes the expected
cumulative regret without exceeding the budget. Our cumulative-regret
minimization algorithm performs better than standard algorithms that do not
take side-information into account. Finally, we study the problem of learning
best interventions without a budget constraint in general graphs and give an
algorithm that achieves constant expected cumulative regret in terms of the
instance parameters when the parent distribution of the reward variable for
each intervention is known. Our results are experimentally validated and
compared to the best-known bounds in the current literature.
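To make the observation-intervention trade-off concrete, the sketch below simulates a toy instance with a single intervenable parent $X$ of the reward $Y$: cheap observations pin down the reward distribution for parent values that occur often naturally, and the remaining budget goes to interventions on rarely observed values. This is a minimal illustration of the idea under assumed names and a fixed 50/50 budget split, not the paper's algorithm, which chooses the trade-off point optimally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy instance (names are illustrative, not from the paper):
# a single intervenable parent X in {0, ..., K-1} of a Bernoulli reward Y,
# with P(Y = 1 | X = x) = theta[x] and natural law P(X = x) = p_x[x].
K = 5
theta = rng.uniform(0.2, 0.8, size=K)  # unknown to the learner
p_x = rng.dirichlet(np.ones(K))        # observational distribution of X

def observe():
    """One observational sample (x, y); costs 1 budget unit."""
    x = rng.choice(K, p=p_x)
    return x, float(rng.random() < theta[x])

def intervene(x):
    """One interventional sample y ~ do(X = x); costs c budget units."""
    return float(rng.random() < theta[x])

B, c = 600.0, 5.0  # total budget and per-intervention cost (assumed)
counts, wins = np.zeros(K), np.zeros(K)

# Phase 1 (assumed 50/50 split): cheap observations estimate theta[x]
# well for the values of X that occur frequently under p_x.
for _ in range(int(B / 2)):
    x, y = observe()
    counts[x] += 1.0
    wins[x] += y

# Phase 2: spend the rest intervening round-robin on the least-observed
# values of X, whose estimates are still unreliable.
remaining, i = B / 2, 0
rare = list(np.argsort(counts)[: K // 2 + 1])
while remaining >= c:
    x = rare[i % len(rare)]
    wins[x] += intervene(x)
    counts[x] += 1.0
    remaining -= c
    i += 1

# Recommend the intervention with the best empirical mean (simple regret).
best = int(np.argmax(wins / np.maximum(counts, 1.0)))
print(f"recommended intervention: do(X = {best})")
```

Skewing the split toward cheap observations is what lets the budget stretch further when the intervention cost $c$ is large; the paper derives the optimal trade-off point rather than fixing it.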
Related papers
- Confounded Budgeted Causal Bandits [28.199741662190203]
We study the problem of learning 'good' interventions in an environment modeled by its underlying causal graph.
Good interventions refer to interventions that maximize rewards.
We propose an algorithm to minimize the cumulative regret in general causal graphs.
arXiv Detail & Related papers (2024-01-15T10:26:13Z) - Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms at any round based on the observed empirical gaps between the rewards of the arms (for reference, a sketch of the classic successive-rejects baseline for this fixed-budget setting appears after this list).
arXiv Detail & Related papers (2023-12-19T13:17:43Z) - Adaptivity Complexity for Causal Graph Discovery [7.424262881242935]
We study the problem of $r$-adaptivity, where the algorithm designer recovers the causal graph in at most $r$ sequential rounds.
We provide an $r$-adaptive algorithm that achieves $O(\min\{r, \log n\} \cdot n^{1/\min\{r, \log n\}})$ approximation with respect to the verification number.
arXiv Detail & Related papers (2023-06-09T09:49:16Z) - Stochastic Direct Search Method for Blind Resource Allocation [6.574808513848414]
We study direct search (also known as pattern search) methods for linearly constrained and derivative-free optimization.
We show that direct search methods achieve finite regret in the deterministic and unconstrained case.
We propose a simple extension of direct search that achieves a regret upper-bound of the order of $T^{2/3}$.
arXiv Detail & Related papers (2022-10-11T07:40:45Z) - Quantization for decentralized learning under subspace constraints [61.59416703323886]
We consider decentralized optimization problems where agents have individual cost functions to minimize subject to subspace constraints.
We propose and study an adaptive decentralized strategy where the agents employ differential randomized quantizers to compress their estimates.
The analysis shows that, under some general conditions on the quantization noise, the strategy is stable both in terms of mean-square error and average bit rate.
arXiv Detail & Related papers (2022-09-16T09:38:38Z) - Safe Online Bid Optimization with Return-On-Investment and Budget Constraints subject to Uncertainty [87.81197574939355]
We study the nature of both the optimization and learning problems.
We provide an algorithm, namely GCB, guaranteeing sublinear regret at the cost of a potentially linear number of constraints violations.
More interestingly, we provide an algorithm, namely GCB_safe$(\psi, \phi)$, guaranteeing both sublinear pseudo-regret and safety w.h.p. at the cost of accepting tolerances $\psi$ and $\phi$.
arXiv Detail & Related papers (2022-01-18T17:24:20Z) - Causal Bandits on General Graphs [1.4502611532302039]
We study the problem of determining the best intervention in a Causal Bayesian Network (CBN) specified only by its causal graph.
We propose a simple regret minimization algorithm that takes as input a semi-Markovian causal graph with atomic interventions and possibly unobservable variables.
Our results indicate that the simple regret guarantee of our proposed algorithm can only be improved by considering more nuanced structural restrictions on the causal graph.
arXiv Detail & Related papers (2021-07-06T17:29:45Z) - Stochastic Hard Thresholding Algorithms for AUC Maximization [49.00683387735522]
We develop a stochastic hard thresholding algorithm for AUC maximization in distributed classification.
We conduct experiments to show the efficiency and effectiveness of the proposed algorithms.
arXiv Detail & Related papers (2020-11-04T16:49:29Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z) - Excursion Search for Constrained Bayesian Optimization under a Limited Budget of Failures [62.41541049302712]
We propose a novel decision maker grounded in control theory that controls the amount of risk we allow in the search as a function of a given budget of failures.
Our algorithm uses the budget of failures more efficiently in a variety of optimization experiments and generally achieves lower regret than state-of-the-art methods.
arXiv Detail & Related papers (2020-05-15T09:54:09Z) - Budget-Constrained Bandits over General Cost and Reward Distributions [32.63624728528415]
We consider a budget-constrained bandit problem where each arm pull incurs a random cost, and yields a random reward in return.
We show that if moments of order $(2+\gamma)$, for some $\gamma > 0$, exist for all cost-reward pairs, then $O(\log B)$ regret is achievable for a budget $B > 0$ (a generic cost-aware ratio-UCB sketch appears after this list).
arXiv Detail & Related papers (2020-02-29T23:50:08Z)
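For the fixed-budget best-arm identification entry above, here is a minimal sketch of the classic successive-rejects baseline (Audibert, Bubeck, and Munos, 2010) for that setting: $K-1$ phases, each pulling every surviving arm equally often and then rejecting the empirically worst one. The Bernoulli instance and all names are assumed for illustration; the cited sred is adaptive and can reject arms at any round, which this baseline does not do.

```python
import numpy as np

rng = np.random.default_rng(1)

def successive_rejects(means, budget):
    """Fixed-budget best-arm identification via successive rejects:
    K - 1 phases, each pulling every surviving arm equally often, then
    rejecting the empirically worst one.  `means` parameterize assumed
    Bernoulli arms; the learner only sees samples."""
    K = len(means)
    log_bar = 0.5 + sum(1.0 / i for i in range(2, K + 1))
    alive = list(range(K))
    counts, sums = np.zeros(K), np.zeros(K)
    n_prev = 0
    for k in range(1, K):
        # Phase lengths follow the standard successive-rejects schedule.
        n_k = int(np.ceil((budget - K) / (log_bar * (K + 1 - k))))
        for arm in alive:
            for _ in range(n_k - n_prev):
                sums[arm] += float(rng.random() < means[arm])
                counts[arm] += 1.0
        n_prev = n_k
        worst = min(alive, key=lambda a: sums[a] / counts[a])
        alive.remove(worst)  # reject the empirically worst surviving arm
    return alive[0]

means = [0.30, 0.40, 0.45, 0.50, 0.60]  # assumed instance
print("identified best arm:", successive_rejects(means, budget=5000))
```

An adaptive method like sred need not wait for a phase boundary to reject an arm, which is exactly where it departs from this schedule-driven baseline.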
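For the budget-constrained bandits entry, a common device when each pull has a random cost is an optimistic index on the reward-per-cost ratio. The sketch below is a generic ratio-UCB in that spirit, with an assumed instance, bonus form, and names; it is not the cited paper's algorithm, which requires only $(2+\gamma)$ moments on the cost-reward pairs.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed instance: arm i pays reward ~ Bernoulli(mu[i]) and charges a
# random cost around kappa[i] per pull; all names are illustrative.
mu = np.array([0.3, 0.5, 0.7])
kappa = np.array([1.0, 1.2, 0.9])

def pull(i):
    reward = float(rng.random() < mu[i])
    cost = kappa[i] * rng.uniform(0.5, 1.5)
    return reward, cost

B = 2000.0          # total budget to spend
K = len(mu)
n, r, c = np.zeros(K), np.zeros(K), np.zeros(K)
spent, t = 0.0, 0

for i in range(K):  # initialize with one pull per arm
    reward, cost = pull(i)
    n[i] += 1.0; r[i] += reward; c[i] += cost
    spent += cost; t += 1

# Ratio-UCB: optimistic estimate of reward-per-unit-cost for each arm.
# The final pull may slightly overshoot B; a sketch-level simplification.
while spent < B:
    t += 1
    bonus = np.sqrt(2.0 * np.log(t) / n)
    index = (r / n + bonus) / np.maximum(c / n - bonus, 1e-6)
    i = int(np.argmax(index))
    reward, cost = pull(i)
    n[i] += 1.0; r[i] += reward; c[i] += cost
    spent += cost

print("pulls per arm:", n.astype(int))
```

The index inflates rewards and deflates costs for under-sampled arms, so exploration is driven by uncertainty on both sides of the ratio.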