Learning with Comparison Feedback: Online Estimation of Sample
Statistics
- URL: http://arxiv.org/abs/2101.04176v1
- Date: Mon, 11 Jan 2021 20:28:32 GMT
- Title: Learning with Comparison Feedback: Online Estimation of Sample
Statistics
- Authors: Michela Meister and Sloan Nietert
- Abstract summary: We study an online version of the noisy binary search problem where feedback is generated by a non-stochastic adversary rather than perturbed by random noise.
We maintain an accurate estimate for the median of an adversarial sequence of integers.
- Score: 2.7158841992922875
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study an online version of the noisy binary search problem where feedback
is generated by a non-stochastic adversary rather than perturbed by random
noise. We reframe this as maintaining an accurate estimate for the median of an
adversarial sequence of integers, $x_1, x_2, \dots$, in a model where each
number $x_t$ can only be accessed through a single threshold query of the form
$\mathbf{1}(x_t \leq q_t)$. In this online comparison feedback model, we explore
estimation of general sample statistics, providing robust algorithms for
median, CDF, and mean estimation with nearly matching lower bounds. We conclude
with several high-dimensional generalizations.
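The threshold-query access model in the abstract is concrete enough to sketch. Below is a minimal Python illustration of tracking a median when each $x_t$ is seen only through the bit $\mathbf{1}(x_t \leq q_t)$: online subgradient descent on $\sum_t |x_t - q|$, whose subgradient is exactly the comparison bit. This illustrates the query model, not the paper's algorithm; the step size `eta` is a hypothetical constant.

```python
import random

def online_median_tracker(stream, q0=0.0, eta=1.0):
    """Track the median of a stream using one threshold query per item.

    Each item x_t is accessed only through the bit 1(x_t <= q_t), where
    q_t is the current estimate. The update is online subgradient descent
    on sum_t |x_t - q|, whose subgradient is recovered exactly from the
    comparison bit. (Illustrative sketch of the query model, not the
    paper's algorithm; `eta` is a hypothetical step size.)
    """
    q = q0
    estimates = []
    for x in stream:
        bit = 1 if x <= q else 0     # the only access to x
        q += eta * (1 - 2 * bit)     # move toward the unseen value
        estimates.append(q)
    return estimates

# Usage: the estimate drifts toward the stream's median.
stream = [random.randint(0, 100) for _ in range(5000)]
print(online_median_tracker(stream)[-1])  # near 50 for uniform data
```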
Related papers
- GROS: A General Robust Aggregation Strategy [49.1574468325115]
A new, very general, robust procedure for combining estimators in metric spaces is introduced.
We show that the same sub-Gaussianity (up to a constant) is obtained when the minimization is restricted to the sample itself.
The performance of GROS is evaluated through five simulation studies.
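The "minimization over the sample" idea can be illustrated with a median-of-means-style aggregation: split the data into groups, compute one estimator per group, and return the candidate estimator minimizing its summed distance to the others. This is a hedged sketch of the general pattern, not the exact GROS procedure.

```python
import numpy as np

def robust_aggregate(estimates, dist):
    """Combine candidate estimators by picking the one that is centrally
    located: the candidate minimizing its summed distance to all others.

    Restricting the minimization to the candidates themselves (rather than
    the whole metric space) mirrors the "minimization over the sample"
    variant. Hedged sketch of the general idea, not GROS itself.
    """
    n = len(estimates)
    costs = [sum(dist(estimates[i], estimates[j]) for j in range(n))
             for i in range(n)]
    return estimates[int(np.argmin(costs))]

# Usage: median-of-means for a heavy-tailed sample.
rng = np.random.default_rng(0)
data = rng.standard_t(df=1.5, size=3000)          # heavy tails
groups = np.array_split(data, 30)
means = [g.mean() for g in groups]                # one estimator per group
print(robust_aggregate(means, lambda a, b: abs(a - b)))
```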
arXiv Detail & Related papers (2024-02-23T17:00:32Z)
- Online stochastic Newton methods for estimating the geometric median and applications [0.0]
In the context of large samples, a small number of individuals might spoil basic statistical indicators like the mean.
This paper focuses on estimating the geometric median of a random variable, which is a robust indicator of central tendency.
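The geometric median admits a simple online estimator; the sketch below uses averaged stochastic gradient descent, a first-order stand-in for the paper's stochastic Newton method. The step-size schedule `lr0 / sqrt(t)` is a hypothetical choice.

```python
import numpy as np

def geometric_median_sgd(samples, lr0=1.0):
    """Online estimate of the geometric median via averaged stochastic
    gradient descent (a simpler first-order stand-in for the paper's
    stochastic Newton method; `lr0` is a hypothetical step-size scale).

    The geometric median minimizes E||X - m||, whose stochastic gradient
    at sample x is -(x - m)/||x - m||.
    """
    m = samples[0].astype(float)
    m_bar = m.copy()
    for t, x in enumerate(samples[1:], start=1):
        g = m - x
        norm = np.linalg.norm(g)
        if norm > 1e-12:
            m -= (lr0 / np.sqrt(t)) * g / norm   # unit-length robust step
        m_bar += (m - m_bar) / (t + 1)           # Polyak-Ruppert averaging
    return m_bar

rng = np.random.default_rng(1)
X = rng.normal(size=(10000, 3))
X[:100] += 50.0                                  # gross outliers
print(geometric_median_sgd(X))                   # stays near the origin
```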
arXiv Detail & Related papers (2023-04-03T07:47:20Z)
- Replicable Clustering [57.19013971737493]
We propose algorithms for the statistical $k$-medians, statistical $k$-means, and statistical $k$-centers problems by utilizing approximation routines for their counterparts in a black-box manner.
We also provide experiments on synthetic distributions in 2D using the $k$-means++ implementation from sklearn as a black-box that validate our theoretical results.
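The experimental setup in the summary (sklearn's k-means++ on 2D synthetic data, used as a black box) is easy to reproduce at the call level; the paper's replicability wrapper around this black box is not shown.

```python
import numpy as np
from sklearn.cluster import KMeans

# The summary reports sklearn's k-means++ used as a black box on 2D
# synthetic data; this reproduces only that black-box call, not the
# paper's replicability machinery (shared randomness across runs).
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(200, 2))
               for c in [(0, 0), (3, 0), (0, 3)]])

km = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0)
labels = km.fit_predict(X)
print(km.cluster_centers_)
```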
arXiv Detail & Related papers (2023-02-20T23:29:43Z)
- Stochastic Approximation Approaches to Group Distributionally Robust Optimization [96.26317627118912]
We study group distributionally robust optimization (GDRO), which minimizes the worst-case loss over $m$ distributions.
We apply online learning techniques to reduce the number of samples required in each round from $m$ to $1$, while keeping the same sample complexity.
We also give a novel formulation of weighted GDRO, which allows us to derive distribution-dependent convergence rates.
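One-sample-per-round GDRO can be sketched as a stochastic saddle-point method: draw a group from a smoothed copy of the weights, draw a single sample from it, then take a gradient step on the model and an exponentiated-gradient step on the group weights. The toy task, step sizes, and smoothing constant below are hypothetical; this is not the paper's exact method.

```python
import numpy as np

# Toy task: pick a scalar theta against m = 3 groups with different
# means under squared loss, touching one sample per round.
rng = np.random.default_rng(3)
group_means = np.array([0.0, 1.0, 4.0])
m = len(group_means)

theta, w = 0.0, np.ones(m) / m
eta_theta, eta_w = 0.05, 1e-3
for t in range(20000):
    p = 0.9 * w + 0.1 / m                     # smoothing bounds the variance
    i = rng.choice(m, p=p)                    # one group per round
    x = rng.normal(group_means[i])            # one sample per round
    loss = (theta - x) ** 2
    theta -= eta_theta * 2 * (theta - x) * (w[i] / p[i])  # unbiased for i ~ p
    g = np.zeros(m)
    g[i] = loss / p[i]                        # importance-weighted loss estimate
    w = w * np.exp(eta_w * g)                 # exponentiated-gradient ascent
    w /= w.sum()
print(theta, w)  # theta settles between the extreme groups (near 2)
```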
arXiv Detail & Related papers (2023-02-18T09:24:15Z)
- Non-Stochastic CDF Estimation Using Threshold Queries [3.6576781735746513]
We tackle the problem of estimating an empirical distribution in a setting with two challenges.
First, the algorithm does not directly observe the data; instead, it only asks a limited number of threshold queries about each sample.
Second, the data are not assumed to be independent and identically distributed; instead, we allow for an arbitrary process generating the samples.
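The query model can be illustrated directly: spend each sample's single threshold query at a uniformly random grid point and average the resulting bits per threshold. This sketches the access model only; the paper's algorithm and its adversarial guarantees are more involved.

```python
import numpy as np

# Each sample admits exactly one threshold query: query x_t at a
# threshold q_t drawn uniformly from a fixed grid, then estimate F(q) at
# each grid point from the bits collected at that threshold.
# (Illustrative; an i.i.d. uniform stream stands in for the
# adversarially generated samples.)
rng = np.random.default_rng(4)
grid = np.arange(0, 101)                    # candidate thresholds
samples = rng.integers(0, 101, size=50000)  # stand-in for the stream

hits = np.zeros(len(grid))
counts = np.zeros(len(grid))
for x in samples:
    j = rng.integers(len(grid))             # one random threshold per item
    counts[j] += 1
    hits[j] += (x <= grid[j])               # the single comparison bit

F_hat = hits / np.maximum(counts, 1)        # estimated CDF on the grid
print(F_hat[[10, 50, 90]])                  # roughly 0.11, 0.50, 0.90
```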
arXiv Detail & Related papers (2023-01-13T18:00:57Z)
- Convergence for score-based generative modeling with polynomial complexity [9.953088581242845]
We prove the first convergence guarantees for the core mechanic behind score-based generative modeling.
Compared to previous works, we do not incur error that grows exponentially in time or that suffers from a curse of dimensionality.
We show that a predictor-corrector gives better convergence than using either portion alone.
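The predictor-corrector mechanic can be sketched end-to-end in one dimension, where the score of an Ornstein-Uhlenbeck forward process started from N(mu, 1) is available in closed form; the paper's setting uses a learned, inexact score. The target, step counts, and corrector step size below are illustrative assumptions.

```python
import numpy as np

# Forward process dx = -x dt + sqrt(2) dW from data N(mu, 1) has
# time-t marginal N(mu * exp(-t), 1), so the exact score is known.
rng = np.random.default_rng(5)
mu, T, n_steps, n_particles, eps = 3.0, 5.0, 200, 20000, 0.05
dt = T / n_steps

def score(x, t):
    """Exact score of the time-t marginal N(mu * exp(-t), 1)."""
    return -(x - mu * np.exp(-t))

x = rng.normal(size=n_particles)            # start from the prior N(0, 1)
for k in range(n_steps):
    t = T - k * dt
    # Predictor: Euler-Maruyama step of the reverse-time SDE
    # dx = [-x - 2 * score] dt + sqrt(2) dW, integrated backward in time.
    x = x + (x + 2.0 * score(x, t)) * dt + np.sqrt(2 * dt) * rng.normal(size=n_particles)
    # Corrector: one Langevin MCMC step targeting the marginal at t - dt.
    x = x + eps * score(x, t - dt) + np.sqrt(2 * eps) * rng.normal(size=n_particles)

print(x.mean(), x.std())                    # approximately mu = 3 and 1
```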
arXiv Detail & Related papers (2022-06-13T14:57:35Z)
- Distributed Sparse Regression via Penalization [5.990069843501885]
We study linear regression over a network of agents, modeled as an undirected graph with no centralized node.
The estimation problem is formulated as the minimization of the sum of the local LASSO loss functions plus a quadratic penalty of the consensus constraint.
We show that the proximal-gradient algorithm applied to the penalized problem converges linearly up to a tolerance of the order of the centralized statistical error.
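A minimal sketch of the penalized formulation: each agent runs a proximal-gradient step on its local least-squares loss plus a quadratic penalty on disagreements with its neighbors, with soft-thresholding as the l1 proximal operator. The graph and the constants rho, lam, and lr below are hypothetical choices.

```python
import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

# Synthetic problem: 4 agents on a ring, each with local data (A_i, b_i)
# generated from a common sparse parameter. (Illustrative constants.)
rng = np.random.default_rng(6)
n_agents, n, d = 4, 50, 20
theta_star = np.zeros(d)
theta_star[:3] = [2.0, -1.0, 0.5]
A = [rng.normal(size=(n, d)) for _ in range(n_agents)]
b = [A[i] @ theta_star + 0.1 * rng.normal(size=n) for i in range(n_agents)]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]          # ring graph
rho, lam, lr = 5.0, 0.5, 1e-3

theta = [np.zeros(d) for _ in range(n_agents)]
for _ in range(2000):
    new = []
    for i in range(n_agents):
        grad = A[i].T @ (A[i] @ theta[i] - b[i])  # local smooth gradient
        for (u, v) in edges:                      # consensus-penalty gradient
            if u == i:
                grad += rho * (theta[i] - theta[v])
            if v == i:
                grad += rho * (theta[i] - theta[u])
        new.append(soft_threshold(theta[i] - lr * grad, lr * lam))
    theta = new
print(np.round(theta[0][:5], 2))                  # near [2, -1, 0.5, 0, 0]
```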
arXiv Detail & Related papers (2021-11-12T01:51:50Z)
- Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
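A heavily hedged sketch of the two-step pipeline: fit a nonnegative surrogate (here, the square of a Gaussian kernel expansion, a rank-one special case of a PSD model), then sample from the surrogate; rejection sampling stands in for the paper's more efficient sampler. The bandwidth, grid, and proposal bound are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

def k(x, c, bw=0.3):
    return np.exp(-(x - c) ** 2 / (2 * bw ** 2))

target = lambda x: np.abs(np.sin(3 * x)) * (0 <= x) * (x <= np.pi)

# Step 1: least-squares fit of g(x) = sum_i w_i k(x, c_i) to sqrt(target),
# so g(x)^2 is a nonnegative approximation of the target everywhere.
centers = np.linspace(0, np.pi, 25)
xs = np.linspace(0, np.pi, 400)
K = k(xs[:, None], centers[None, :])
w, *_ = np.linalg.lstsq(K, np.sqrt(target(xs)), rcond=None)
model = lambda x: (k(x[:, None], centers[None, :]) @ w) ** 2

# Step 2: rejection sampling from the fitted nonnegative model.
M = model(xs).max() * 1.1
cand = rng.uniform(0, np.pi, size=200000)
accepted = cand[rng.uniform(0, M, size=cand.size) < model(cand)]
print(accepted.mean(), accepted.size)   # mean near pi/2 by symmetry
```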
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
- Multi-label Contrastive Predictive Coding [125.03510235962095]
Variational mutual information (MI) estimators are widely used in unsupervised representation learning methods such as contrastive predictive coding (CPC).
We introduce a novel estimator based on a multi-label classification problem, where the critic needs to jointly identify multiple positive samples at the same time.
We show that using the same number of negative samples, multi-label CPC is able to exceed the $\log m$ bound, while still being a valid lower bound on mutual information.
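The $\log m$ ceiling that the multi-label estimator breaks is easy to demonstrate numerically. The sketch below evaluates the standard single-positive CPC (InfoNCE) bound with an essentially perfect critic and shows it saturating at $\log m$; the paper's multi-label estimator, which exceeds this ceiling, is not reproduced here.

```python
import numpy as np

def infonce(scores):
    """scores[i, j] = critic value f(x_i, y_j); positives on the diagonal.
    Returns the average single-positive CPC lower bound over a batch of
    size m. Even a perfect critic cannot push this above log(m), because
    each positive is just one of m candidates.
    """
    log_norm = np.log(np.mean(np.exp(scores), axis=1))  # per-row partition
    return np.mean(np.diag(scores) - log_norm)

m = 16
perfect = np.full((m, m), -50.0) + np.eye(m) * 100.0    # near-perfect critic
print(infonce(perfect), np.log(m))                      # saturates at log m
```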
arXiv Detail & Related papers (2020-07-20T02:46:21Z)
- Computationally efficient sparse clustering [67.95910835079825]
We provide a finite sample analysis of a new clustering algorithm based on PCA.
We show that it achieves the minimax optimal misclustering rate in the regime $\|\theta\| \to \infty$.
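The PCA core of such an algorithm is short: center the data, project onto the top principal component, and split by sign. The paper adds sparsity handling and a finite-sample analysis; the sketch below shows only the PCA step on a synthetic two-component mixture.

```python
import numpy as np

# Two-component mixture with a sparse mean-separation vector theta.
rng = np.random.default_rng(9)
d, n = 50, 400
theta = np.zeros(d)
theta[:5] = 3.0                                       # sparse separation
labels = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d)) + np.where(labels[:, None] == 1, theta, -theta)

Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pred = (Xc @ Vt[0] > 0).astype(int)                   # sign of top PC score

err = min(np.mean(pred != labels), np.mean(pred == labels))  # up to relabeling
print(err)                                            # small misclustering rate
```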
arXiv Detail & Related papers (2020-05-21T17:51:30Z)
- Learning Entangled Single-Sample Distributions via Iterative Trimming [28.839136703139225]
We analyze a simple and computationally efficient method based on iteratively trimming samples and re-estimating the parameter on the trimmed sample set.
We show that the method, after logarithmically many iterations, outputs an estimate whose error depends only on the noise level of the $\lceil \alpha n \rceil$-th noisiest data point.
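The trimming loop itself is compact, as the sketch below shows for mean estimation: repeatedly keep the fraction of points closest to the current estimate and re-estimate on that subset, for logarithmically many rounds. The mean-estimation setting and constants are illustrative assumptions.

```python
import numpy as np

def iterative_trimming_mean(x, alpha=0.5, n_iters=None):
    """Estimate a common mean from samples with heterogeneous noise by
    iteratively re-estimating on the fraction of points closest to the
    current estimate. Sketch of the iterative-trimming idea for mean
    estimation; the paper's guarantees cover a more general setting.
    """
    n = len(x)
    k = int(np.ceil(alpha * n))                 # keep the ceil(alpha*n) closest
    if n_iters is None:
        n_iters = int(np.ceil(np.log2(n))) + 1  # logarithmic iterations
    est = np.mean(x)
    for _ in range(n_iters):
        keep = np.argsort(np.abs(x - est))[:k]  # trim the farthest points
        est = np.mean(x[keep])                  # re-estimate on trimmed set
    return est

rng = np.random.default_rng(10)
sigmas = np.concatenate([np.full(800, 1.0), np.full(200, 100.0)])  # entangled noise
x = 5.0 + sigmas * rng.normal(size=1000)
print(iterative_trimming_mean(x))               # close to 5 despite noisy points
```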
arXiv Detail & Related papers (2020-04-20T18:37:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.