Provably Efficient Learning in Partially Observable Contextual Bandit
- URL: http://arxiv.org/abs/2308.03572v2
- Date: Mon, 4 Sep 2023 11:41:54 GMT
- Title: Provably Efficient Learning in Partially Observable Contextual Bandit
- Authors: Xueping Gong and Jiheng Zhang
- Abstract summary: We show how causal bounds can be applied to improve classical bandit algorithms.
This research has the potential to enhance the performance of contextual bandit agents in real-world applications.
- Score: 4.910658441596583
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: In this paper, we investigate transfer learning in partially observable
contextual bandits, where agents have limited knowledge from other agents and
partial information about hidden confounders. We first convert the problem to
identifying or partially identifying causal effects between actions and rewards
through optimization problems. To solve these optimization problems, we
discretize the original functional constraints of unknown distributions into
linear constraints, and sample compatible causal models by sequentially
solving linear programs to obtain causal bounds that account for
estimation error. Our sampling algorithms provide desirable convergence results
for suitable sampling distributions. We then show how causal bounds can be
applied to improve classical bandit algorithms and how they affect the regret with
respect to the size of action sets and function spaces. Notably, in the task
with function approximation, which allows us to handle general context
distributions, our method improves the order dependence on function space size
compared with the previous literature. We formally prove that our causally
enhanced algorithms outperform classical bandit algorithms and achieve orders
of magnitude faster convergence rates. Finally, we perform simulations that
demonstrate the efficiency of our strategy compared to the current
state-of-the-art methods. This research has the potential to enhance the
performance of contextual bandit agents in real-world applications where data
is scarce and costly to obtain.
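To make the idea of applying causal bounds to a classical bandit algorithm concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes a plain multi-armed (context-free) Bernoulli bandit, assumes that lower and upper causal bounds on each arm's mean reward have already been computed (for example, by the linear programs described in the abstract), and uses standard UCB1 as the base algorithm. The function name `causal_ucb` and its parameters are hypothetical. Arms whose upper bound falls below the best lower bound are discarded, and each remaining UCB index is truncated at the arm's upper bound.

```python
import math
import random


def causal_ucb(means, bounds, horizon, seed=0):
    """UCB1 on Bernoulli arms, pruned and clipped with causal bounds.

    means:   true success probabilities, used only to simulate rewards.
    bounds:  list of (lower, upper) bounds on each arm's mean reward;
             assumed to be given and valid (hypothetical input).
    horizon: number of rounds to play.
    Returns the number of pulls per arm.
    """
    rng = random.Random(seed)

    # An arm whose upper bound is below the best lower bound cannot be optimal.
    best_lower = max(lo for lo, _ in bounds)
    active = [a for a, (_, hi) in enumerate(bounds) if hi >= best_lower]

    pulls = [0] * len(means)
    totals = [0.0] * len(means)

    for t in range(1, horizon + 1):
        untried = [a for a in active if pulls[a] == 0]
        if untried:
            # Play every surviving arm once before using the index.
            arm = untried[0]
        else:
            def index(a):
                bonus = math.sqrt(2.0 * math.log(t) / pulls[a])
                # Truncate the optimistic estimate at the causal upper bound.
                return min(totals[a] / pulls[a] + bonus, bounds[a][1])

            arm = max(active, key=index)

        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward

    return pulls


if __name__ == "__main__":
    true_means = [0.2, 0.5, 0.8]
    causal_bounds = [(0.0, 0.3), (0.4, 0.9), (0.6, 1.0)]
    print(causal_ucb(true_means, causal_bounds, horizon=2000))
```

In this toy setup the first arm is never played, because its upper bound (0.3) is below the best lower bound (0.6). Intuitively, pruning shrinks the effective action set, which is one way the regret's dependence on the number of actions could improve.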
Related papers
- Efficient Differentiable Discovery of Causal Order [14.980926991441342]
Intersort is a score-based method to discover causal order of variables.
We reformulate Intersort using differentiable sorting and ranking techniques.
Our work opens the door to efficiently incorporating regularization for causal order into the training of differentiable models.
arXiv Detail & Related papers (2024-10-11T13:11:55Z)
- Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z)
- Interactive Graph Convolutional Filtering [79.34979767405979]
Interactive Recommender Systems (IRS) have been increasingly used in various domains, including personalized article recommendation, social media, and online advertising.
These problems are exacerbated by the cold start problem and data sparsity problem.
Existing Multi-Armed Bandit methods, despite their carefully designed exploration strategies, often struggle to provide satisfactory results in the early stages.
Our proposed method extends interactive collaborative filtering into the graph model to enhance the performance of collaborative filtering between users and items.
arXiv Detail & Related papers (2023-09-04T09:02:31Z)
- Learning Prompt-Enhanced Context Features for Weakly-Supervised Video Anomaly Detection [37.99031842449251]
Video anomaly detection under weak supervision presents significant challenges.
We present a weakly supervised anomaly detection framework that focuses on efficient context modeling and enhanced semantic discriminability.
Our approach significantly improves the detection accuracy of certain anomaly sub-classes, underscoring its practical value and efficacy.
arXiv Detail & Related papers (2023-06-26T06:45:16Z)
- dugMatting: Decomposed-Uncertainty-Guided Matting [83.71273621169404]
We propose a decomposed-uncertainty-guided matting algorithm, which explores the explicitly decomposed uncertainties to efficiently and effectively improve the results.
The proposed matting framework relieves the requirement for users to determine the interaction areas by using simple and efficient labeling.
arXiv Detail & Related papers (2023-06-02T11:19:50Z)
- On data-driven chance constraint learning for mixed-integer optimization problems [0.0]
We develop a Chance Constraint Learning (CCL) methodology with a focus on mixed-integer linear optimization problems.
CCL makes use of linearizable machine learning models to estimate conditional quantiles of the learned variables.
Open-access software has been developed for use by practitioners.
arXiv Detail & Related papers (2022-07-08T11:54:39Z)
- Fusion and Orthogonal Projection for Improved Face-Voice Association [15.938463726577128]
We study the problem of learning association between face and voice.
We propose a light-weight, plug-and-play mechanism that exploits the complementary cues in both modalities to form enriched fused embeddings.
arXiv Detail & Related papers (2021-12-20T12:33:33Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
- Differentiable Causal Discovery from Interventional Data [141.41931444927184]
We propose a theoretically-grounded method based on neural networks that can leverage interventional data.
We show that our approach compares favorably to the state of the art in a variety of settings.
arXiv Detail & Related papers (2020-07-03T15:19:17Z)
- Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.