Practical Contextual Bandits with Feedback Graphs
- URL: http://arxiv.org/abs/2302.08631v3
- Date: Fri, 27 Oct 2023 00:04:33 GMT
- Title: Practical Contextual Bandits with Feedback Graphs
- Authors: Mengxiao Zhang, Yuheng Zhang, Olga Vrousgou, Haipeng Luo, Paul Mineiro
- Abstract summary: We propose and analyze an approach to contextual bandits with feedback graphs based upon reduction to regression.
The resulting algorithms are computationally practical and achieve established minimax rates.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While contextual bandits have a mature theory, how to effectively leverage different feedback patterns to accelerate learning remains unclear. Bandits with feedback graphs, which interpolate between the full-information and bandit regimes, provide a promising framework for mitigating the statistical complexity of learning. In this paper, we propose and analyze an approach to contextual bandits with feedback graphs based upon reduction to regression. The resulting algorithms are computationally practical and achieve established minimax rates, thereby reducing the statistical complexity in real-world applications.
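The reduction-to-regression idea can be illustrated with a minimal sketch, an illustrative simplification rather than the paper's exact algorithm: an online regressor estimates per-arm losses, actions are drawn from a SquareCB-style inverse-gap-weighting distribution, and the feedback graph determines which arms' losses are revealed and fed back to the regressor. The arm count, loss means, and graph below are hypothetical.

```python
import random

def inverse_gap_weighting(preds, gamma=10.0):
    """SquareCB-style action distribution from predicted losses."""
    k = len(preds)
    best = min(range(k), key=lambda a: preds[a])
    p = [0.0] * k
    for a in range(k):
        if a != best:
            p[a] = 1.0 / (k + gamma * (preds[a] - preds[best]))
    p[best] = 1.0 - sum(p)  # remaining mass goes to the predicted best arm
    return p

random.seed(0)
K = 3
true_loss = [0.6, 0.3, 0.5]                    # hypothetical Bernoulli loss means
graph = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}   # pulling arm a reveals losses of graph[a]
est, cnt = [0.5] * K, [1] * K                  # per-arm running-mean "regressors"

for t in range(2000):
    p = inverse_gap_weighting(est)
    a = random.choices(range(K), weights=p)[0]
    for b in graph[a]:                         # side observations from the feedback graph
        loss = 1.0 if random.random() < true_loss[b] else 0.0
        cnt[b] += 1
        est[b] += (loss - est[b]) / cnt[b]     # online least-squares update

best_arm = min(range(K), key=lambda a: est[a])
```

Because the graph supplies side observations, every round informs several arms at once, which is the mechanism by which feedback graphs reduce statistical complexity relative to pure bandit feedback.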
Related papers
- Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
We show that the gradient bandit algorithm converges almost surely to a globally optimal policy using any constant learning rate.
This result demonstrates that the gradient bandit algorithm continues to balance exploration and exploitation appropriately even in scenarios where standard smoothness and noise-control assumptions break down.
arXiv Detail & Related papers (2025-02-11T00:12:04Z)
- Mode Estimation with Partial Feedback
We formalize core aspects of weakly supervised and active learning with a simple problem.
We show how entropy coding allows for optimal information acquisition from partial feedback.
arXiv Detail & Related papers (2024-02-20T15:24:21Z)
- Efficient Contextual Bandits with Uninformed Feedback Graphs
Bandits with feedback graphs are powerful online learning models that interpolate between the full information and classic bandit problems.
We show that it is critical to learn the graphs using log loss instead of squared loss to obtain favorable regret guarantees.
arXiv Detail & Related papers (2024-02-12T23:50:47Z)
- Improved Algorithms for Bandit with Graph Feedback via Regret Decomposition
The problem of bandits with graph feedback generalizes both the multi-armed bandit (MAB) problem and the learning-with-expert-advice problem.
We propose a new algorithmic framework for the problem based on a partition of the feedback graph.
arXiv Detail & Related papers (2022-05-30T13:07:42Z)
- Robust Contrastive Learning against Noisy Views
We propose a new contrastive loss function that is robust against noisy views.
We show that our approach provides consistent improvements over the state of the art on image, video, and graph contrastive learning benchmarks.
arXiv Detail & Related papers (2022-01-12T05:24:29Z)
- Addressing Class Imbalance in Scene Graph Parsing by Learning to Contrast and Score
Scene graph parsing aims to detect objects in an image scene and recognize their relations.
Recent approaches have achieved high average scores on some popular benchmarks, but fail in detecting rare relations.
This paper introduces a novel integrated framework of classification and ranking to resolve the class imbalance problem.
arXiv Detail & Related papers (2020-09-28T13:57:59Z)
- Reinforcement Learning with Feedback Graphs
We study episodic reinforcement learning in decision processes when the agent receives additional feedback per step.
We formalize this setting using a feedback graph over state-action pairs and show that model-based algorithms can leverage the additional feedback for more sample-efficient learning.
arXiv Detail & Related papers (2020-05-07T22:35:37Z)
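The log-loss versus squared-loss point from the uninformed-feedback-graphs entry above can be made concrete with a toy calculation (hypothetical probabilities, not taken from that paper): when the quantity being estimated is a small observation probability, log loss separates a good estimate from a badly underestimating one far more sharply than squared loss does, which is the intuition behind the more favorable regret guarantees.

```python
import math

def sq_loss(p_hat, y):
    return (p_hat - y) ** 2

def log_loss(p_hat, y):
    return -(y * math.log(p_hat) + (1 - y) * math.log(1 - p_hat))

def expected(loss, p_hat, p_true):
    # Expectation over y ~ Bernoulli(p_true).
    return p_true * loss(p_hat, 1) + (1 - p_true) * loss(p_hat, 0)

# True edge-observation probability is small; compare a correct estimate
# with one that underestimates the rare event by a factor of 10.
p_true = 0.01
good, bad = 0.01, 0.001

sq_gap = expected(sq_loss, bad, p_true) - expected(sq_loss, good, p_true)
log_gap = expected(log_loss, bad, p_true) - expected(log_loss, good, p_true)
# Squared loss barely distinguishes the two estimates; log loss does.
```

Under squared loss the excess risk of the 10x underestimate is on the order of 1e-4, while under log loss it is more than a hundred times larger, so a log-loss-trained graph estimator is pushed much harder to get small observation probabilities right.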
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.