Transfer Learning with Partially Observable Offline Data via Causal Bounds
- URL: http://arxiv.org/abs/2308.03572v4
- Date: Fri, 03 Jan 2025 18:43:00 GMT
- Title: Transfer Learning with Partially Observable Offline Data via Causal Bounds
- Authors: Xueping Gong, Wei You, Jiheng Zhang,
- Abstract summary: In this paper, we investigate transfer learning in partially observable contextual bandits.
Agents operate with incomplete information and limited access to hidden confounders.
We propose an efficient method that discretizes the functional constraints of unknown distributions into linear constraints.
This method takes into account estimation errors and exhibits strong convergence properties, ensuring robust and reliable causal bounds.
- Score: 8.981637739384674
- License:
- Abstract: Transfer learning has emerged as an effective approach to accelerate learning by integrating knowledge from related source agents. However, challenges arise due to data heterogeneity-such as differences in feature sets or incomplete datasets-which often results in the nonidentifiability of causal effects. In this paper, we investigate transfer learning in partially observable contextual bandits, where agents operate with incomplete information and limited access to hidden confounders. To address the challenges posed by unobserved confounders, we formulate optimization problems to derive tight bounds on the nonidentifiable causal effects. We then propose an efficient method that discretizes the functional constraints of unknown distributions into linear constraints, allowing us to sample compatible causal models through a sequential process of solving linear programs. This method takes into account estimation errors and exhibits strong convergence properties, ensuring robust and reliable causal bounds. Leveraging these causal bounds, we improve classical bandit algorithms, achieving tighter regret upper and lower bounds relative to the sizes of action sets and function spaces. In tasks involving function approximation, which are crucial for handling complex context spaces, our method significantly improves the dependence on function space size compared to previous work. We formally prove that our causally enhanced algorithms outperform classical bandit algorithms, achieving notably faster convergence rates. The applicability of our approach is further illustrated through an example of offline pricing policy learning with censored demand. Simulations confirm the superiority of our approach over state-of-the-art methods, demonstrating its potential to enhance contextual bandit agents in real-world applications, especially when data is scarce, costly, or restricted due to privacy concerns.
Related papers
- Efficient Differentiable Discovery of Causal Order [14.980926991441342]
Intersort is a score-based method to discover causal order of variables.
We reformulate Intersort using differentiable sorting and ranking techniques.
Our work opens the door to efficiently incorporating regularization for causal order into the training of differentiable models.
arXiv Detail & Related papers (2024-10-11T13:11:55Z) - Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z) - Interactive Graph Convolutional Filtering [79.34979767405979]
Interactive Recommender Systems (IRS) have been increasingly used in various domains, including personalized article recommendation, social media, and online advertising.
These problems are exacerbated by the cold start problem and data sparsity problem.
Existing Multi-Armed Bandit methods, despite their carefully designed exploration strategies, often struggle to provide satisfactory results in the early stages.
Our proposed method extends interactive collaborative filtering into the graph model to enhance the performance of collaborative filtering between users and items.
arXiv Detail & Related papers (2023-09-04T09:02:31Z) - Learning Prompt-Enhanced Context Features for Weakly-Supervised Video
Anomaly Detection [37.99031842449251]
Video anomaly detection under weak supervision presents significant challenges.
We present a weakly supervised anomaly detection framework that focuses on efficient context modeling and enhanced semantic discriminability.
Our approach significantly improves the detection accuracy of certain anomaly sub-classes, underscoring its practical value and efficacy.
arXiv Detail & Related papers (2023-06-26T06:45:16Z) - dugMatting: Decomposed-Uncertainty-Guided Matting [83.71273621169404]
We propose a decomposed-uncertainty-guided matting algorithm, which explores the explicitly decomposed uncertainties to efficiently and effectively improve the results.
The proposed matting framework relieves the requirement for users to determine the interaction areas by using simple and efficient labeling.
arXiv Detail & Related papers (2023-06-02T11:19:50Z) - On data-driven chance constraint learning for mixed-integer optimization
problems [0.0]
We develop a Chance Constraint Learning (CCL) methodology with a focus on mixed-integer linear optimization problems.
CCL makes use of linearizable machine learning models to estimate conditional quantiles of the learned variables.
An open-access software has been developed to be used by practitioners.
arXiv Detail & Related papers (2022-07-08T11:54:39Z) - Fusion and Orthogonal Projection for Improved Face-Voice Association [15.938463726577128]
We study the problem of learning association between face and voice.
We propose a light-weight, plug-and-play mechanism that exploits the complementary cues in both modalities to form enriched fused embeddings.
arXiv Detail & Related papers (2021-12-20T12:33:33Z) - Accurate and Robust Feature Importance Estimation under Distribution
Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z) - Differentiable Causal Discovery from Interventional Data [141.41931444927184]
We propose a theoretically-grounded method based on neural networks that can leverage interventional data.
We show that our approach compares favorably to the state of the art in a variety of settings.
arXiv Detail & Related papers (2020-07-03T15:19:17Z) - Task-Feature Collaborative Learning with Application to Personalized
Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL)
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.