Related papers: Pure Exploration and Regret Minimization in Matching Bandits

Pure Exploration and Regret Minimization in Matching Bandits

URL: http://arxiv.org/abs/2108.00230v1
Date: Sat, 31 Jul 2021 12:37:51 GMT
Title: Pure Exploration and Regret Minimization in Matching Bandits
Authors: Flore Sentenac, Jialin Yi, Cl\'ement Calauz\`enes, Vianney Perchet, Milan Vojnovic
Abstract summary: We prove that it is possible to leverage a rank-1 assumption on the adjacency matrix to reduce the sample complexity. Finding an optimal matching in a weighted graph is a standard problem.
Score: 19.205538784019936
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Finding an optimal matching in a weighted graph is a standard combinatorial problem. We consider its semi-bandit version where either a pair or a full matching is sampled sequentially. We prove that it is possible to leverage a rank-1 assumption on the adjacency matrix to reduce the sample complexity and the regret of off-the-shelf algorithms up to reaching a linear dependency in the number of vertices (up to poly log terms).

Related papers

Nearly Optimal Sample Complexity of Offline KL-Regularized Contextual Bandits under Single-Policy Concentrability [49.96531901205305]
We propose the emphfirst algorithm with $tildeO(epsilon-1)$ sample complexity under single-policy concentrability for offline contextual bandits. Our proof leverages the strong convexity of the KL regularization, and the conditional non-negativity of the gap between the true reward and its pessimistic estimator. We extend our algorithm to contextual dueling bandits and achieve a similar nearly optimal sample complexity.
arXiv Detail & Related papers (2025-02-09T22:14:45Z)
An SDP-based Branch-and-Cut Algorithm for Biclustering [0.0]
We present a tailored branch-and-cut algorithm for biclustering problems. We show that the proposed algorithm can solve instances 20 times larger than those handled by general-purpose solvers.
arXiv Detail & Related papers (2024-03-17T21:43:19Z)
Entry-Specific Bounds for Low-Rank Matrix Completion under Highly Non-Uniform Sampling [10.824999179337558]
We show that it is often better and sometimes optimal to run estimation algorithms on a smaller submatrix rather than the entire matrix. Our bounds characterize the hardness of estimating each entry as a function of the localized sampling probabilities.
arXiv Detail & Related papers (2024-02-29T23:24:43Z)
An Optimal Algorithm for the Real-Valued Combinatorial Pure Exploration of Multi-Armed Bandit [65.268245109828]
We study the real-valued pure exploration problem in the multi-armed bandit (R-CPE-MAB) Existing methods in the R-CPE-MAB can be seen as a special case of the so-called transductive linear bandits. We propose an algorithm named the gap-based exploration (CombGapE) algorithm, whose sample complexity matches the lower bound.
arXiv Detail & Related papers (2023-06-15T15:37:31Z)
Optimal PAC Bounds Without Uniform Convergence [11.125968799758436]
We provide optimal high probability risk bounds through a framework that surpasses the limitations of uniform convergence arguments. Our framework converts the leave-one-out error of permutation invariant predictors into high probability risk bounds. Specifically, our result shows that certain aggregations of one-inclusion graph algorithms are optimal.
arXiv Detail & Related papers (2023-04-18T17:57:31Z)
Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization. We provide novel excess risk bounds, both in expectation and with high probability. Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z)
Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback. It includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as its subroutines. We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower boundally, and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z)
On the Complexity of a Practical Primal-Dual Coordinate Method [63.899427212054995]
We prove complexity bounds for primal-dual algorithm with random and coordinate descent (PURE-CD) It has been shown to obtain good extrapolation for solving bi-max performance problems.
arXiv Detail & Related papers (2022-01-19T16:14:27Z)
Deep Probabilistic Graph Matching [72.6690550634166]
We propose a deep learning-based graph matching framework that works for the original QAP without compromising on the matching constraints. The proposed method is evaluated on three popularly tested benchmarks (Pascal VOC, Willow Object and SPair-71k) and it outperforms all previous state-of-the-arts on all benchmarks.
arXiv Detail & Related papers (2022-01-05T13:37:27Z)
Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination [80.53485617514707]
This paper proposes two algorithms, a gap-based algorithm and one based on the successive elimination, for best arm identification in sub-Gaussian bandits. Specifically, for the gap-based algorithm, the sample complexity is optimal up to constant factors, while for the successive elimination, it is optimal up to logarithmic factors.
arXiv Detail & Related papers (2021-11-14T21:49:58Z)
An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits [34.06611065493047]
This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in the fixed confidence and fixed budget settings. We provide an algorithm whose sample complexity scales with the geometry of the instance and avoids an explicit union bound over the number of arms. We also propose the first algorithm for linear bandits in the the fixed budget setting.
arXiv Detail & Related papers (2020-06-21T00:56:33Z)
Non-Adaptive Adaptive Sampling on Turnstile Streams [57.619901304728366]
We give the first relative-error algorithms for column subset selection, subspace approximation, projective clustering, and volume on turnstile streams that use space sublinear in $n$. Our adaptive sampling procedure has a number of applications to various data summarization problems that either improve state-of-the-art or have only been previously studied in the more relaxed row-arrival model.
arXiv Detail & Related papers (2020-04-23T05:00:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.