Adaptive Estimation of Random Vectors with Bandit Feedback: A
mean-squared error viewpoint
- URL: http://arxiv.org/abs/2203.16810v3
- Date: Thu, 11 Jan 2024 05:44:18 GMT
- Title: Adaptive Estimation of Random Vectors with Bandit Feedback: A
mean-squared error viewpoint
- Authors: Dipayan Sen, L.A. Prashanth and Aditya Gopalan
- Abstract summary: We first establish a concentration bound for MSE estimation.
We then frame the estimation problem with bandit feedback, and propose a variant of the successive elimination algorithm.
- Score: 13.089182408360225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of sequentially learning to estimate, in the mean
squared error (MSE) sense, a Gaussian $K$-vector of unknown covariance by
observing only $m < K$ of its entries in each round. We first establish a
concentration bound for MSE estimation. We then frame the estimation problem
with bandit feedback, and propose a variant of the successive elimination
algorithm. We also derive a minimax lower bound to understand the fundamental
limit on the sample complexity of this problem.
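The elimination scheme mentioned in the abstract can be sketched generically. The following is a minimal illustration, not the paper's algorithm: the Hoeffding-style confidence radius is a placeholder for the widths the paper derives from its MSE concentration bound, and `sample_loss` is a hypothetical oracle returning one noisy MSE observation for an arm.

```python
import math
import random

def successive_elimination(sample_loss, n_arms, horizon, delta=0.05):
    """Generic successive elimination over arms with noisy loss feedback.

    Sketch only: the Hoeffding-style radius below is a placeholder; the
    paper derives its confidence widths from an MSE concentration bound.
    """
    active = list(range(n_arms))
    counts = [0] * n_arms
    means = [0.0] * n_arms
    t = 0
    while len(active) > 1 and t < horizon:
        for a in active:                      # sample every surviving arm once
            loss = sample_loss(a)
            counts[a] += 1
            means[a] += (loss - means[a]) / counts[a]
            t += 1

        def rad(a):                           # placeholder confidence radius
            return math.sqrt(math.log(4 * n_arms * counts[a] ** 2 / delta)
                             / (2 * counts[a]))

        best_ucb = min(means[a] + rad(a) for a in active)
        # drop any arm whose optimistic (lower) loss estimate is already
        # worse than the best arm's pessimistic (upper) estimate
        active = [a for a in active if means[a] - rad(a) <= best_ucb]
    return min(active, key=lambda a: means[a])
```

In the paper's setting an "arm" would correspond to a choice of which m < K entries to observe, with the loss being the resulting estimation MSE.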
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z)
- GROS: A General Robust Aggregation Strategy [49.1574468325115]
A new, very general, robust procedure for combining estimators in metric spaces is introduced.
We show that the same (up to a constant) sub-Gaussianity is obtained if the minimization is taken over the sample.
The performance of GROS is evaluated through five simulation studies.
arXiv Detail & Related papers (2024-02-23T17:00:32Z)
- Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
In particular, we present sred, a truly adaptive algorithm that can reject arms in any round based on the observed empirical gaps between the rewards of the various arms.
arXiv Detail & Related papers (2023-12-19T13:17:43Z)
- Direct Measure Matching for Crowd Counting [59.66286603624411]
We propose a new measure-based counting approach to regress the predicted density maps to the scattered point-annotated ground truth directly.
In this paper, we derive a semi-balanced form of Sinkhorn divergence, based on which a Sinkhorn counting loss is designed for measure matching.
arXiv Detail & Related papers (2021-07-04T06:37:33Z)
- SNIPS: Solving Noisy Inverse Problems Stochastically [25.567566997688044]
We introduce a novel algorithm dubbed SNIPS, which draws samples from the posterior distribution of any linear inverse problem.
Our solution incorporates ideas from Langevin dynamics and Newton's method, and exploits a pre-trained minimum mean squared error (MMSE) Gaussian denoiser.
We show that the samples produced are sharp, detailed and consistent with the given measurements, and their diversity exposes the inherent uncertainty in the inverse problem being solved.
arXiv Detail & Related papers (2021-05-31T13:33:21Z)
- An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits [34.06611065493047]
This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in the fixed confidence and fixed budget settings.
We provide an algorithm whose sample complexity scales with the geometry of the instance and avoids an explicit union bound over the number of arms.
We also propose the first algorithm for linear bandits in the fixed-budget setting.
arXiv Detail & Related papers (2020-06-21T00:56:33Z)
- Learning Minimax Estimators via Online Learning [55.92459567732491]
We consider the problem of designing minimax estimators for estimating parameters of a probability distribution.
We construct an algorithm for finding a mixed-strategy Nash equilibrium.
arXiv Detail & Related papers (2020-06-19T22:49:42Z)
- Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation [4.817429789586127]
It is not possible to obtain a Hoeffding bound on the error sequence, even when the underlying Markov chain is reversible and geometrically ergodic.
It is shown that mean square error achieves the optimal rate of $O(1/n)$, subject to conditions on the step-size sequence.
arXiv Detail & Related papers (2020-02-07T01:52:21Z)
- Thompson Sampling Algorithms for Mean-Variance Bandits [97.43678751629189]
We develop Thompson Sampling-style algorithms for mean-variance MAB.
We also provide comprehensive regret analyses for Gaussian and Bernoulli bandits.
Our algorithms significantly outperform existing LCB-based algorithms for all risk tolerances.
arXiv Detail & Related papers (2020-02-01T15:33:50Z)
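The Thompson-sampling-style approach in the last entry can be illustrated with a toy sketch. This is an assumption-laden illustration, not the paper's algorithm: a unit-variance Gaussian posterior on each arm's mean and a simple "variance minus rho times mean" risk criterion stand in for the paper's actual posteriors and mean-variance objective.

```python
import math
import random

def mv_thompson(pull, n_arms, horizon, rho=1.0):
    """Toy Thompson-sampling-style rule for a mean-variance objective,
    here: minimise empirical variance minus rho times the mean.

    Sketch only: a unit-variance Gaussian posterior on each arm's mean
    stands in for the paper's actual posterior updates and risk measure.
    """
    counts = [0] * n_arms
    means = [0.0] * n_arms
    m2 = [0.0] * n_arms                      # running sum of squared deviations
    for a in range(n_arms):                  # pull each arm once to initialise
        counts[a], means[a] = 1, pull(a)
    for _ in range(horizon - n_arms):
        # draw a plausible mean for each arm, then act greedily on the
        # sampled mean-variance criterion
        theta = [random.gauss(means[a], 1.0 / math.sqrt(counts[a]))
                 for a in range(n_arms)]
        a = min(range(n_arms),
                key=lambda i: m2[i] / counts[i] - rho * theta[i])
        r = pull(a)
        counts[a] += 1
        d = r - means[a]
        means[a] += d / counts[a]
        m2[a] += d * (r - means[a])          # Welford update for the variance
    return min(range(n_arms),
               key=lambda i: m2[i] / counts[i] - rho * means[i])
```

Randomising only over the posterior of the mean while plugging in the empirical variance is one design choice; risk-aware bandit algorithms differ in exactly how the risk component enters the sampled index.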
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.