Stochastic Contextual Bandits with Graph-based Contexts
- URL: http://arxiv.org/abs/2305.01470v1
- Date: Tue, 2 May 2023 14:51:35 GMT
- Title: Stochastic Contextual Bandits with Graph-based Contexts
- Authors: Jittat Fakcharoenphol and Chayutpong Prompak
- Abstract summary: We generalize the on-line graph prediction problem to a version of contextual bandit problems.
Our algorithm relies on the optimal bandit algorithm by Zimmert and Seldin[AISTAT'19, JMLR'21].
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We naturally generalize the on-line graph prediction problem to a version of
stochastic contextual bandit problems where contexts are vertices in a graph
and the structure of the graph provides information on the similarity of
contexts. More specifically, we are given a graph $G=(V,E)$, whose vertex set
$V$ represents contexts with {\em unknown} vertex label $y$. In our stochastic
contextual bandit setting, vertices with the same label share the same reward
distribution. The standard notion of instance difficulties in graph label
prediction is the cutsize $f$ defined to be the number of edges whose end
points having different labels. For line graphs and trees we present an
algorithm with regret bound of $\tilde{O}(T^{2/3}K^{1/3}f^{1/3})$ where $K$ is
the number of arms. Our algorithm relies on the optimal stochastic bandit
algorithm by Zimmert and Seldin~[AISTAT'19, JMLR'21]. When the best arm
outperforms the other arms, the regret improves to $\tilde{O}(\sqrt{KT\cdot
f})$. The regret bound in the later case is comparable to other optimal
contextual bandit results in more general cases, but our algorithm is easy to
analyze, runs very efficiently, and does not require an i.i.d. assumption on
the input context sequence. The algorithm also works with general graphs using
a standard random spanning tree reduction.
Related papers
- Graph Unfolding and Sampling for Transitory Video Summarization via Gershgorin Disc Alignment [48.137527345353625]
User-generated videos (UGVs) uploaded from mobile phones to social media sites like YouTube and TikTok are short and non-repetitive.
We summarize a transitory UGV into several discs in linear time via fast graph sampling based on Gershgorin disc alignment (GDA)
We show that our algorithm achieves comparable or better video summarization performance compared to state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2024-08-03T20:08:02Z) - Online Learning with Feedback Graphs: The True Shape of Regret [82.00098840619847]
We prove that the minimax regret is proportional to $R*$ for any graph and time horizon $T$.
Introducing an intricate exploration strategy, we define the mainAlgorithm algorithm that achieves the minimax optimal regret bound.
arXiv Detail & Related papers (2023-06-05T15:35:00Z) - On the Unlikelihood of D-Separation [69.62839677485087]
We provide analytic evidence that on large graphs, d-separation is a rare phenomenon, even when guaranteed to exist.
For the PC Algorithm, while it is known that its worst-case guarantees fail on non-sparse graphs, we show that the same is true for the average case.
For UniformSGS, while it is known that the running time is exponential for existing edges, we show that in the average case, that is the expected running time for most non-existing edges as well.
arXiv Detail & Related papers (2023-03-10T00:11:18Z) - Improved High-Probability Regret for Adversarial Bandits with
Time-Varying Feedback Graphs [62.52390282012508]
We study high-probability regret bounds for adversarial $K$-armed bandits with time-varying feedback graphs over $T$ rounds.
We develop an algorithm that achieves the optimal regret $widetildemathcalO((sum_t=1Talpha_t)1/2+max_tin[T]alpha_t]$ with high probability.
We also develop the first algorithm that achieves the optimal high-probability regret bound for weakly observable graphs.
arXiv Detail & Related papers (2022-10-04T04:36:15Z) - Random graph matching at Otter's threshold via counting chandeliers [16.512416293014493]
We propose an efficient algorithm for graph matching based on similarity scores constructed from counting a certain family of weighted trees rooted at each.
This is the first graph matching algorithm that succeeds at an explicit constant correlation and applies to both sparse and dense graphs.
arXiv Detail & Related papers (2022-09-25T20:00:28Z) - A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with
Feedback Graphs [21.563733343861713]
We consider online learning with feedback graphs, a sequential decision-making framework where the learner's feedback is determined by a directed graph over the action set.
We present a computationally efficient algorithm for learning in this framework that simultaneously achieves near-optimal regret bounds in both and adversarial environments.
arXiv Detail & Related papers (2022-06-01T15:14:32Z) - Fast Graph Sampling for Short Video Summarization using Gershgorin Disc
Alignment [52.577757919003844]
We study the problem of efficiently summarizing a short video into several paragraphs, leveraging recent progress in fast graph sampling.
Experimental results show that our algorithm achieves comparable video summarization as state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2021-10-21T18:43:00Z) - Random Graph Matching with Improved Noise Robustness [2.294014185517203]
We propose a new algorithm for graph matching under probabilistic models.
Our algorithm recovers the underlying matching with high probability when $alpha le 1 / (log log n)C$.
This improves the condition $alpha le 1 / (log n)C$ achieved in previous work.
arXiv Detail & Related papers (2021-01-28T02:39:27Z) - Adversarial Linear Contextual Bandits with Graph-Structured Side
Observations [80.95090605985042]
A learning agent repeatedly chooses from a set of $K$ actions after being presented with a $d$-dimensional context vector.
The agent incurs and observes the loss of the chosen action, but also observes the losses of its neighboring actions in the observation structures.
Two efficient algorithms are developed based on textttEXP3.
arXiv Detail & Related papers (2020-12-10T15:40:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.