Online stochastic Newton methods for estimating the geometric median and
applications
- URL: http://arxiv.org/abs/2304.00770v1
- Date: Mon, 3 Apr 2023 07:47:20 GMT
- Title: Online stochastic Newton methods for estimating the geometric median and
applications
- Authors: Antoine Godichon-Baggioni (LPSM (UMR 8001)), Wei Lu (LMI)
- Abstract summary: In the context of large samples, a small number of individuals might spoil basic statistical indicators like the mean.
This paper focuses on estimating the geometric median of a random variable, which is a robust indicator of central tendency.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the context of large samples, a small number of individuals might spoil
basic statistical indicators like the mean. It is difficult to automatically
detect these atypical individuals, and an alternative strategy is to use
robust approaches. This paper focuses on estimating the geometric median of a
random variable, which is a robust indicator of central tendency. In order to
deal with large samples of data arriving sequentially, online stochastic Newton
algorithms for estimating the geometric median are introduced and we give their
rates of convergence. Since estimates of the median and those of the Hessian
matrix can be recursively updated, we also determine confidence intervals of
the median in any designated direction and perform online statistical tests.
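As a rough illustration of the recursion described in the abstract, the sketch below updates a median estimate and a running Hessian estimate from a stream of observations using a Newton-type step. The function name, step-size constant, and numerical safeguards are illustrative assumptions; this is a simplified sketch, not the authors' exact algorithm.
```python
import numpy as np

def online_newton_geometric_median(stream, dim, c_gamma=1.0, clip=1e6):
    """Simplified sketch of an online stochastic Newton-type estimator of the
    geometric median; names, constants and safeguards are illustrative, not
    the paper's exact algorithm."""
    m = np.zeros(dim)        # current median estimate
    H_sum = np.eye(dim)      # identity term keeps the Hessian estimate invertible
    n = 0
    for x in stream:
        n += 1
        diff = np.asarray(x, dtype=float) - m
        r = np.linalg.norm(diff)
        if r < 1e-12:        # ignore observations sitting exactly on the estimate
            continue
        u = diff / r
        w = min(1.0 / r, clip)                      # capped curvature weight (assumption)
        # per-sample Hessian of m -> ||X - m|| is (I - u u^T) / ||X - m||
        H_sum += w * (np.eye(dim) - np.outer(u, u))
        H = H_sum / (n + 1)                         # recursive Hessian estimate
        # Newton-type step along the stochastic gradient -(X - m) / ||X - m||
        m = m + (c_gamma / n) * np.linalg.solve(H, u)
    return m, H_sum / (n + 1)

# toy usage: heavy-tailed data for which the sample mean is unreliable
rng = np.random.default_rng(0)
data = rng.standard_t(df=1.5, size=(20_000, 3)) + np.array([1.0, -2.0, 0.5])
m_hat, H_hat = online_newton_geometric_median(iter(data), dim=3)
print("estimated geometric median:", m_hat)
```
Because both the median estimate and the Hessian estimate are available recursively, one can also maintain a recursive estimate of Sigma = E[(X - m)(X - m)^T / ||X - m||^2]; a sandwich-type central limit theorem, with sqrt(n) (m_n - m) converging in distribution to N(0, H^{-1} Sigma H^{-1}), is the usual route to the directional confidence intervals and online tests mentioned in the abstract.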
Related papers
- Robust Estimation for Kernel Exponential Families with Smoothed Total Variation Distances [2.317910166616341]
In statistical inference, we commonly assume that samples are drawn independently and identically from a probability distribution.
In this paper, we explore the application of GAN-like estimators to a general class of statistical models.
arXiv Detail & Related papers (2024-10-28T05:50:47Z)
- SGD with Clipping is Secretly Estimating the Median Gradient [19.69067856415625]
We consider settings such as distributed learning with corrupted nodes, large outliers in the training data, learning under privacy constraints, and heavy-tailed noise arising from the dynamics of the algorithm itself.
We first consider computing the median gradient across samples, and show that the resulting method can converge even under heavy-tailed state-dependent noise.
We propose an algorithm estimating the median gradient across iterations, and find that several well-known methods, in particular different forms of clipping, are special cases of this framework (see the sketch after this list).
arXiv Detail & Related papers (2024-02-20T08:54:07Z)
- Multi-Fidelity Covariance Estimation in the Log-Euclidean Geometry [0.0]
We introduce a multi-fidelity estimator of covariance matrices that employs the log-Euclidean geometry of the symmetric positive-definite manifold.
We develop an optimal sample allocation scheme that minimizes the mean-squared error of the estimator given a fixed budget.
Evaluations of our approach using data from physical applications demonstrate more accurate metric learning and speedups of more than one order of magnitude compared to benchmarks.
arXiv Detail & Related papers (2023-01-31T16:33:46Z)
- Wasserstein Iterative Networks for Barycenter Estimation [80.23810439485078]
We present an algorithm to approximate the Wasserstein-2 barycenters of continuous measures via a generative model.
Based on the celebrity faces dataset, we construct the Ave, celeba! dataset, which can be used for quantitative evaluation of barycenter algorithms.
arXiv Detail & Related papers (2022-01-28T16:59:47Z)
- Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
- Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic
Approach to Manifold Dimension Estimation [92.81218653234669]
We present a new approach to manifold hypothesis checking and underlying manifold dimension estimation.
Our geometrical method is a modification, for sparse data, of the well-known box-counting algorithm for Minkowski dimension calculation.
Experiments on real datasets show that the suggested approach, combining the two methods, is powerful and effective.
arXiv Detail & Related papers (2021-07-08T15:35:54Z)
- Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
- Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
- SLOE: A Faster Method for Statistical Inference in High-Dimensional
Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z)
- Learning with Comparison Feedback: Online Estimation of Sample
Statistics [2.7158841992922875]
We study an online version of the noisy binary search problem where feedback is generated by a non-stochastic adversary rather than perturbed by random noise.
We maintain an accurate estimate of the median of an adversarially chosen sequence of integers.
arXiv Detail & Related papers (2021-01-11T20:28:32Z)
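As referenced in the "SGD with Clipping is Secretly Estimating the Median Gradient" entry above, the following is a minimal sketch of an SGD update that uses a median of per-sample gradients instead of their mean. The coordinate-wise median, the toy regression task, and all names below are illustrative assumptions, not the estimator analysed in that paper; the sketch only shows why such an update resists outlying samples.
```python
import numpy as np

def sgd_with_median_gradient(grad_fn, w0, data, batch_size=32, lr=0.01,
                             n_steps=500, seed=0):
    """SGD whose update direction is the coordinate-wise median of per-sample
    gradients in a mini-batch; an illustrative robust variant, not the exact
    estimator analysed in the cited paper."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float)
    n = len(data)
    for _ in range(n_steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        grads = np.stack([grad_fn(w, data[i]) for i in idx])  # shape (batch, dim)
        w = w - lr * np.median(grads, axis=0)  # median is robust to outlying gradients
    return w

# toy usage: linear regression with a few grossly corrupted responses
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 2))
y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=1000)
y[:20] += 100.0                        # outliers that would spoil the mean gradient
pairs = list(zip(X, y))
grad = lambda w, d: 2.0 * (d[0] @ w - d[1]) * d[0]   # per-sample squared-error gradient
print(sgd_with_median_gradient(grad, w0=[0.0, 0.0], data=pairs))
```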
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.