Practical Contextual Bandits with Feedback Graphs
- URL: http://arxiv.org/abs/2302.08631v3
- Date: Fri, 27 Oct 2023 00:04:33 GMT
- Title: Practical Contextual Bandits with Feedback Graphs
- Authors: Mengxiao Zhang, Yuheng Zhang, Olga Vrousgou, Haipeng Luo, Paul Mineiro
- Abstract summary: We propose and analyze an approach to contextual bandits with feedback graphs based upon reduction to regression.
The resulting algorithms are computationally practical and achieve established minimax rates.
- Score: 44.76976254893256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While contextual bandit has a mature theory, effectively leveraging different
feedback patterns to enhance the pace of learning remains unclear. Bandits with
feedback graphs, which interpolates between the full information and bandit
regimes, provides a promising framework to mitigate the statistical complexity
of learning. In this paper, we propose and analyze an approach to contextual
bandits with feedback graphs based upon reduction to regression. The resulting
algorithms are computationally practical and achieve established minimax rates,
thereby reducing the statistical complexity in real-world applications.
Related papers
- Mode Estimation with Partial Feedback [20.426429576184145]
We formalize core aspects of weakly supervised and active learning with a simple problem.
We show how entropy coding allows for optimal information acquisition from partial feedback.
arXiv Detail & Related papers (2024-02-20T15:24:21Z) - Efficient Contextual Bandits with Uninformed Feedback Graphs [48.77120088347271]
Bandits with feedback graphs are powerful online learning models that interpolate between the full information and classic bandit problems.
We show that it is critical to learn the graphs using log loss instead of squared loss to obtain favorable regret guarantees.
arXiv Detail & Related papers (2024-02-12T23:50:47Z) - Improved Algorithms for Bandit with Graph Feedback via Regret
Decomposition [2.3034251503343466]
The problem of bandit with graph feedback generalizes both the multi-armed bandit (MAB) problem and the learning with expert advice problem.
We propose a new algorithmic framework for the problem based on a partition of the feedback graph.
arXiv Detail & Related papers (2022-05-30T13:07:42Z) - Robust Contrastive Learning against Noisy Views [79.71880076439297]
We propose a new contrastive loss function that is robust against noisy views.
We show that our approach provides consistent improvements over the state-of-the-art image, video, and graph contrastive learning benchmarks.
arXiv Detail & Related papers (2022-01-12T05:24:29Z) - One-shot Learning for Temporal Knowledge Graphs [49.41854171118697]
We propose a one-shot learning framework for link prediction in temporal knowledge graphs.
Our proposed method employs a self-attention mechanism to effectively encode temporal interactions between entities.
Our experiments show that the proposed algorithm outperforms the state of the art baselines for two well-studied benchmarks.
arXiv Detail & Related papers (2020-10-23T03:24:44Z) - Addressing Class Imbalance in Scene Graph Parsing by Learning to
Contrast and Score [65.18522219013786]
Scene graph parsing aims to detect objects in an image scene and recognize their relations.
Recent approaches have achieved high average scores on some popular benchmarks, but fail in detecting rare relations.
This paper introduces a novel integrated framework of classification and ranking to resolve the class imbalance problem.
arXiv Detail & Related papers (2020-09-28T13:57:59Z) - Unbiased Deep Reinforcement Learning: A General Training Framework for
Existing and Future Algorithms [3.7050607140679026]
We propose a novel training framework that is conceptually comprehensible and potentially easy to be generalized to all feasible algorithms for reinforcement learning.
We employ Monte-carlo sampling to achieve raw data inputs, and train them in batch to achieve Markov decision process sequences.
We propose several algorithms embedded with our new framework to deal with typical discrete and continuous scenarios.
arXiv Detail & Related papers (2020-05-12T01:51:08Z) - Reinforcement Learning with Feedback Graphs [69.1524391595912]
We study episodic reinforcement learning in decision processes when the agent receives additional feedback per step.
We formalize this setting using a feedback graph over state-action pairs and show that model-based algorithms can leverage the additional feedback for more sample-efficient learning.
arXiv Detail & Related papers (2020-05-07T22:35:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.