Related papers: Stochastic Contextual Bandits with Graph-based Contexts

Stochastic Contextual Bandits with Graph-based Contexts

URL: http://arxiv.org/abs/2305.01470v1
Date: Tue, 2 May 2023 14:51:35 GMT
Title: Stochastic Contextual Bandits with Graph-based Contexts
Authors: Jittat Fakcharoenphol and Chayutpong Prompak
Abstract summary: We generalize the on-line graph prediction problem to a version of contextual bandit problems. Our algorithm relies on the optimal bandit algorithm by Zimmert and Seldin[AISTAT'19, JMLR'21].
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We naturally generalize the on-line graph prediction problem to a version of stochastic contextual bandit problems where contexts are vertices in a graph and the structure of the graph provides information on the similarity of contexts. More specifically, we are given a graph $G=(V,E)$, whose vertex set $V$ represents contexts with {\em unknown} vertex label $y$. In our stochastic contextual bandit setting, vertices with the same label share the same reward distribution. The standard notion of instance difficulties in graph label prediction is the cutsize $f$ defined to be the number of edges whose end points having different labels. For line graphs and trees we present an algorithm with regret bound of $\tilde{O}(T^{2/3}K^{1/3}f^{1/3})$ where $K$ is the number of arms. Our algorithm relies on the optimal stochastic bandit algorithm by Zimmert and Seldin~[AISTAT'19, JMLR'21]. When the best arm outperforms the other arms, the regret improves to $\tilde{O}(\sqrt{KT\cdot f})$. The regret bound in the later case is comparable to other optimal contextual bandit results in more general cases, but our algorithm is easy to analyze, runs very efficiently, and does not require an i.i.d. assumption on the input context sequence. The algorithm also works with general graphs using a standard random spanning tree reduction.

Related papers

Graph Unfolding and Sampling for Transitory Video Summarization via Gershgorin Disc Alignment [48.137527345353625]
User-generated videos (UGVs) uploaded from mobile phones to social media sites like YouTube and TikTok are short and non-repetitive. We summarize a transitory UGV into several discs in linear time via fast graph sampling based on Gershgorin disc alignment (GDA) We show that our algorithm achieves comparable or better video summarization performance compared to state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2024-08-03T20:08:02Z)
Online Learning with Feedback Graphs: The True Shape of Regret [82.00098840619847]
We prove that the minimax regret is proportional to $R*$ for any graph and time horizon $T$. Introducing an intricate exploration strategy, we define the mainAlgorithm algorithm that achieves the minimax optimal regret bound.
arXiv Detail & Related papers (2023-06-05T15:35:00Z)
On the Unlikelihood of D-Separation [69.62839677485087]
We provide analytic evidence that on large graphs, d-separation is a rare phenomenon, even when guaranteed to exist. For the PC Algorithm, while it is known that its worst-case guarantees fail on non-sparse graphs, we show that the same is true for the average case. For UniformSGS, while it is known that the running time is exponential for existing edges, we show that in the average case, that is the expected running time for most non-existing edges as well.
arXiv Detail & Related papers (2023-03-10T00:11:18Z)
Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs [62.52390282012508]
We study high-probability regret bounds for adversarial $K$-armed bandits with time-varying feedback graphs over $T$ rounds. We develop an algorithm that achieves the optimal regret $widetildemathcalO((sum_t=1Talpha_t)1/2+max_tin[T]alpha_t]$ with high probability. We also develop the first algorithm that achieves the optimal high-probability regret bound for weakly observable graphs.
arXiv Detail & Related papers (2022-10-04T04:36:15Z)
Random graph matching at Otter's threshold via counting chandeliers [16.512416293014493]
We propose an efficient algorithm for graph matching based on similarity scores constructed from counting a certain family of weighted trees rooted at each. This is the first graph matching algorithm that succeeds at an explicit constant correlation and applies to both sparse and dense graphs.
arXiv Detail & Related papers (2022-09-25T20:00:28Z)
A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs [21.563733343861713]
We consider online learning with feedback graphs, a sequential decision-making framework where the learner's feedback is determined by a directed graph over the action set. We present a computationally efficient algorithm for learning in this framework that simultaneously achieves near-optimal regret bounds in both and adversarial environments.
arXiv Detail & Related papers (2022-06-01T15:14:32Z)
Fast Graph Sampling for Short Video Summarization using Gershgorin Disc Alignment [52.577757919003844]
We study the problem of efficiently summarizing a short video into several paragraphs, leveraging recent progress in fast graph sampling. Experimental results show that our algorithm achieves comparable video summarization as state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2021-10-21T18:43:00Z)
Random Graph Matching with Improved Noise Robustness [2.294014185517203]
We propose a new algorithm for graph matching under probabilistic models. Our algorithm recovers the underlying matching with high probability when $alpha le 1 / (log log n)C$. This improves the condition $alpha le 1 / (log n)C$ achieved in previous work.
arXiv Detail & Related papers (2021-01-28T02:39:27Z)
Adversarial Linear Contextual Bandits with Graph-Structured Side Observations [80.95090605985042]
A learning agent repeatedly chooses from a set of $K$ actions after being presented with a $d$-dimensional context vector. The agent incurs and observes the loss of the chosen action, but also observes the losses of its neighboring actions in the observation structures. Two efficient algorithms are developed based on textttEXP3.
arXiv Detail & Related papers (2020-12-10T15:40:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.