Stochastic Graph Bandit Learning with Side-Observations
- URL: http://arxiv.org/abs/2308.15107v2
- Date: Sat, 6 Jan 2024 16:17:24 GMT
- Title: Stochastic Graph Bandit Learning with Side-Observations
- Authors: Xueping Gong and Jiheng Zhang
- Abstract summary: We propose an algorithm that adapts to both the underlying graph structures and reward gaps.
To the best of our knowledge, our algorithm is the first to provide a gap-dependent upper bound in this setting.
- Score: 4.910658441596583
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: In this paper, we investigate the stochastic contextual bandit with general
function space and graph feedback. We propose an algorithm that addresses this
problem by adapting to both the underlying graph structures and reward gaps. To
the best of our knowledge, our algorithm is the first to provide a
gap-dependent upper bound in this stochastic setting, bridging the research gap
left by the work in [35]. In comparison to [31,33,35], our method offers
improved regret upper bounds and does not require knowledge of graphical
quantities. We conduct numerical experiments to demonstrate the computational
efficiency and effectiveness of our approach in terms of regret upper bounds.
These findings highlight the significance of our algorithm in advancing the
field of stochastic contextual bandits with graph feedback, opening up avenues
for practical applications in various domains.
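To make the feedback model in the abstract concrete, here is a minimal, hypothetical sketch of a stochastic bandit with graph side-observations: playing an arm also reveals noisy rewards for its neighbors in the feedback graph, and a simple UCB-style index is built from all observations each arm has received. This is only an illustration of the setting; it is not the gap-adaptive algorithm proposed in the paper, and the function name, noise level, and example graph are assumptions made for the sketch.

```python
import math
import random

# Minimal sketch (illustrative only): a stochastic bandit with graph
# side-observations.  Playing arm i reveals noisy rewards for i and for every
# neighbor of i in the feedback graph, and a UCB-style index is built from all
# observations an arm has received, so well-connected arms need fewer pulls.
# This is NOT the gap-adaptive algorithm of the paper; names, the noise level,
# and the example graph are assumptions made for the sketch.

def run_graph_ucb(means, neighbors, horizon, seed=0):
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # observations (direct or side) per arm
    sums = [0.0] * k      # sum of observed rewards per arm
    best = max(means)
    regret = 0.0

    for t in range(1, horizon + 1):
        def index(i):
            # Unobserved arms are played first; otherwise use a UCB index.
            if counts[i] == 0:
                return float("inf")
            return sums[i] / counts[i] + math.sqrt(2.0 * math.log(t) / counts[i])

        arm = max(range(k), key=index)
        regret += best - means[arm]

        # Graph feedback: the played arm and all of its neighbors are observed.
        for j in {arm, *neighbors[arm]}:
            reward = means[j] + rng.gauss(0.0, 0.1)
            counts[j] += 1
            sums[j] += reward

    return regret


if __name__ == "__main__":
    means = [0.9, 0.8, 0.5, 0.3]                        # true mean rewards
    neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # a small chain graph
    print("cumulative regret:", run_graph_ucb(means, neighbors, horizon=5000))
```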
Related papers
- Improved Graph-based semi-supervised learning Schemes [0.0]
In this work, we improve the accuracy of several known algorithms to address the classification of large datasets when few labels are available.
Our framework lies in the realm of graph-based semi-supervised learning.
arXiv Detail & Related papers (2024-06-30T16:50:08Z)
- The graph alignment problem: fundamental limits and efficient algorithms [0.9246334723892301]
The noisy version of the graph isomorphism problem aims to find a matching between the nodes of two graphs which preserves most of the edges.
This thesis focuses on understanding the fundamental information-theoretical limits for this problem, as well as designing and analyzing algorithms that are able to recover the underlying alignment in the data.
arXiv Detail & Related papers (2024-04-18T15:31:13Z)
- Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z)
- Efficient Contextual Bandits with Uninformed Feedback Graphs [48.77120088347271]
Bandits with feedback graphs are powerful online learning models that interpolate between the full information and classic bandit problems.
We show that it is critical to learn the graphs using log loss instead of squared loss to obtain favorable regret guarantees (a toy comparison of the two losses is sketched after this list).
arXiv Detail & Related papers (2024-02-12T23:50:47Z)
- Addressing Noise and Efficiency Issues in Graph-Based Machine Learning Models From the Perspective of Adversarial Attack [2.1937382384136637]
We propose treating noisy edges as adversarial attacks and use a spectral adversarial robustness evaluation method to diminish the impact of noisy edges on the performance of graph algorithms.
Our method identifies those points that are less vulnerable to noisy edges and leverages only these robust points to perform graph-based algorithms.
arXiv Detail & Related papers (2024-01-28T10:03:37Z)
- Provably Efficient Learning in Partially Observable Contextual Bandit [4.910658441596583]
We show how causal bounds can be applied to improving classical bandit algorithms.
This research has the potential to enhance the performance of contextual bandit agents in real-world applications.
arXiv Detail & Related papers (2023-08-07T13:24:50Z)
- Lexicographic Multi-Objective Reinforcement Learning [65.90380946224869]
We present a family of both action-value and policy gradient algorithms that can be used to solve such problems.
We show how our algorithms can be used to impose safety constraints on the behaviour of an agent, and compare their performance in this context with that of other constrained reinforcement learning algorithms.
arXiv Detail & Related papers (2022-12-28T10:22:36Z)
- Joint graph learning from Gaussian observations in the presence of hidden nodes [26.133725549667734]
We propose a joint graph learning method that takes into account the presence of hidden (latent) variables.
We exploit the structure resulting from the previous considerations to propose a convex optimization problem.
We compare the proposed algorithm with different baselines and evaluate its performance over synthetic and real-world graphs.
arXiv Detail & Related papers (2022-12-04T13:03:41Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
- An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z)
- Online Dense Subgraph Discovery via Blurred-Graph Feedback [87.9850024070244]
We introduce a novel learning problem for dense subgraph discovery.
We first propose an edge-time algorithm that obtains a nearly-optimal solution with high probability.
We then design a more scalable algorithm with a theoretical guarantee.
arXiv Detail & Related papers (2020-06-24T11:37:33Z)
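As a follow-up to the entry above on uninformed feedback graphs, the following toy sketch (an illustrative assumption, not code from that paper) shows why log loss is more informative than squared loss when the quantity being learned is a small observation probability: the excess expected log loss of an estimate q for a Bernoulli(p) observation is the KL divergence between Bernoulli(p) and Bernoulli(q), whereas the excess squared loss is only (p - q)^2, which barely registers errors on tiny probabilities.

```python
import math

# Toy comparison (illustrative assumption, not taken from the cited paper):
# for a true edge-observation probability p and an estimate q,
#   excess expected log loss     = KL(Bernoulli(p) || Bernoulli(q))
#   excess expected squared loss = (p - q)**2
# Underestimating a tiny p by a factor of 10 is almost invisible to squared
# loss but is penalized heavily by log loss.

def excess_log_loss(p, q):
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def excess_squared_loss(p, q):
    return (p - q) ** 2

for p in (0.5, 0.05, 0.005):
    q = p / 10  # an estimate that is ten times too small
    print(f"p={p:<6} log-loss excess={excess_log_loss(p, q):.4f}  "
          f"squared-loss excess={excess_squared_loss(p, q):.6f}")
```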