Stochastic Linear Contextual Bandits with Diverse Contexts
- URL: http://arxiv.org/abs/2003.02681v1
- Date: Thu, 5 Mar 2020 14:51:17 GMT
- Title: Stochastic Linear Contextual Bandits with Diverse Contexts
- Authors: Weiqiang Wu, Jing Yang, and Cong Shen
- Abstract summary: We show that when contexts are sufficiently diverse, the learner is able to utilize the information obtained during exploitation to shorten the exploration process.
We design the LinUCB-d algorithm, and propose a novel approach to analyze its regret performance.
- Score: 17.35270010828849
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the impact of context diversity on stochastic
linear contextual bandits. As opposed to the previous view that contexts lead
to more difficult bandit learning, we show that when the contexts are
sufficiently diverse, the learner is able to utilize the information obtained
during exploitation to shorten the exploration process, thus achieving reduced
regret. We design the LinUCB-d algorithm, and propose a novel approach to
analyze its regret performance. The main theoretical result is that under the
diverse context assumption, the cumulative expected regret of LinUCB-d is
bounded by a constant. As a by-product, our results improve the previous
understanding of LinUCB and strengthen its performance guarantee.
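The paper's LinUCB-d algorithm and its analysis are not reproduced here, but the setting it builds on can be illustrated with a minimal NumPy sketch of standard LinUCB on a stochastic linear contextual bandit with diverse (i.i.d. Gaussian) contexts. The dimension, number of arms, exploration width `alpha`, and the synthetic reward model below are illustrative assumptions, not parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d, K, T = 5, 10, 2000            # dimension, arms per round, horizon (illustrative)
alpha = 1.0                      # UCB exploration width (illustrative)
theta_star = rng.normal(size=d)  # unknown reward parameter
theta_star /= np.linalg.norm(theta_star)

A = np.eye(d)                    # ridge Gram matrix A_t = I + sum_s x_s x_s^T
b = np.zeros(d)                  # b_t = sum_s r_s x_s
regret = 0.0

for t in range(T):
    X = rng.normal(size=(K, d))             # diverse contexts, drawn fresh each round
    theta_hat = np.linalg.solve(A, b)       # ridge regression estimate of theta_star
    A_inv = np.linalg.inv(A)
    # UCB index: estimated reward plus confidence width sqrt(x^T A^{-1} x)
    width = np.sqrt(np.einsum('ij,jk,ik->i', X, A_inv, X))
    arm = int(np.argmax(X @ theta_hat + alpha * width))
    x = X[arm]
    r = x @ theta_star + 0.1 * rng.normal() # noisy linear reward
    regret += np.max(X @ theta_star) - x @ theta_star
    A += np.outer(x, x)                     # rank-one update of the Gram matrix
    b += r * x

print(regret)
```

With diverse contexts, the pulled arms' feature vectors keep exciting all directions of the parameter space, so the estimate of `theta_star` sharpens even during exploitation; this is the mechanism the paper exploits to prove that the cumulative expected regret of LinUCB-d stays bounded by a constant.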
Related papers
- Rethinking Visual Dependency in Long-Context Reasoning for Large Vision-Language Models [62.698520962933195]
Large Vision-Language Models (LVLMs) excel in cross-modal tasks but experience performance declines in long-context reasoning.
We propose a novel training-free context pruning method that selectively removes less critical textual information.
arXiv Detail & Related papers (2024-10-25T17:59:09Z) - Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits [15.342585350280535]
We study how representation learning can improve the learning efficiency of contextual bandit problems.
We present a new algorithm based on alternating projected gradient descent (GD) and a minimization estimator.
arXiv Detail & Related papers (2024-10-02T22:30:29Z) - Contextrast: Contextual Contrastive Learning for Semantic Segmentation [9.051352746190448]
We propose Contextrast, a contrastive learning-based semantic segmentation method.
Our proposed method comprises two parts: a) contextual contrastive learning (CCL) and b) boundary-aware negative sampling.
We demonstrate that our Contextrast substantially enhances the performance of semantic segmentation networks.
arXiv Detail & Related papers (2024-04-16T15:04:55Z) - Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic [51.967603572656266]
We introduce a consistent and theoretically grounded approach to annotating decompositional entailment.
We find that our new dataset, RDTE, has a substantially higher internal consistency (+9%) than prior decompositional entailment datasets.
We also find that training an RDTE-oriented entailment classifier via knowledge distillation and employing it in an entailment tree reasoning engine significantly improves both accuracy and proof quality.
arXiv Detail & Related papers (2024-02-22T18:55:17Z) - C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z) - Efficient Algorithms for Learning to Control Bandits with Unobserved Contexts [1.370633147306388]
We present an implementable posterior sampling algorithm for bandits with imperfect context observations.
The proposed algorithm demonstrates efficiency in learning from noisy, imperfect observations and taking actions accordingly.
arXiv Detail & Related papers (2022-02-02T04:03:19Z) - Revisiting Contrastive Learning through the Lens of Neighborhood Component Analysis: an Integrated Framework [70.84906094606072]
We show a new methodology to design integrated contrastive losses that could simultaneously achieve good accuracy and robustness on downstream tasks.
With the integrated framework, we achieve up to 6% improvement on the standard accuracy and 17% improvement on the adversarial accuracy.
arXiv Detail & Related papers (2021-12-08T18:54:11Z) - Obtaining Better Static Word Embeddings Using Contextual Embedding Models [53.86080627007695]
Our proposed distillation method is a simple extension of CBOW-based training.
As a side-effect, our approach also allows a fair comparison of both contextual and static embeddings.
arXiv Detail & Related papers (2021-06-08T12:59:32Z) - Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective [104.67295710363679]
In the classical multi-armed bandit problem, instance-dependent algorithms attain improved performance on "easy" problems with a gap between the best and second-best arm.
We introduce a family of complexity measures that are both sufficient and necessary to obtain instance-dependent regret bounds.
We then introduce new oracle-efficient algorithms which adapt to the gap whenever possible, while also attaining the minimax rate in the worst case.
arXiv Detail & Related papers (2020-10-07T01:33:06Z) - Dynamic Batch Learning in High-Dimensional Sparse Linear Contextual Bandits [18.64677502651614]
We study the problem of dynamic batch learning in high-dimensional sparse linear contextual bandits.
Our work provides the first inroad into a theoretical understanding of dynamic batch learning in high-dimensional sparse linear contextual bandits.
arXiv Detail & Related papers (2020-08-27T05:34:34Z) - Greedy Bandits with Sampled Context [0.0]
Greedy Bandits with Sampled Context (GB-SC) is a method for contextual multi-armed bandits to develop the prior from context information.
Our results show competitive performance on the Mushroom environment in terms of expected regret and expected cumulative regret.
arXiv Detail & Related papers (2020-07-27T17:17:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.