A Parameter-Free Two-Bit Covariance Estimator with Improved Operator
Norm Error Rate
- URL: http://arxiv.org/abs/2308.16059v1
- Date: Wed, 30 Aug 2023 14:31:24 GMT
- Title: A Parameter-Free Two-Bit Covariance Estimator with Improved Operator
Norm Error Rate
- Authors: Junren Chen, Michael K. Ng
- Abstract summary: We propose a new 2-bit covariance matrix estimator that simultaneously addresses both issues.
By employing dithering scales varying across entries, our estimator enjoys an improved operator norm error rate.
Our proposed method eliminates the need for any tuning parameter, as the dithering scales are entirely determined by the data.
- Score: 27.308933056578212
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A covariance matrix estimator using two bits per entry was recently developed
by Dirksen, Maly and Rauhut [Annals of Statistics, 50(6), pp. 3538-3562]. The
estimator achieves near minimax rate for general sub-Gaussian distributions,
but also suffers from two downsides: theoretically, there is an essential gap
on operator norm error between their estimator and sample covariance when the
diagonal of the covariance matrix is dominated by only a few entries;
practically, its performance heavily relies on the dithering scale, which needs
to be tuned according to some unknown parameters. In this work, we propose a
new 2-bit covariance matrix estimator that simultaneously addresses both
issues. Unlike the sign quantizer associated with uniform dither in Dirksen et
al., we adopt a triangular dither prior to a 2-bit quantizer inspired by the
multi-bit uniform quantizer. By employing dithering scales varying across
entries, our estimator enjoys an improved operator norm error rate that depends
on the effective rank of the underlying covariance matrix rather than the
ambient dimension, thus closing the theoretical gap. Moreover, our proposed
method eliminates the need for any tuning parameter, as the dithering scales are
entirely determined by the data. Experimental results under Gaussian samples
are provided to showcase the impressive numerical performance of our estimator.
Remarkably, by halving the dithering scales, our estimator often achieves
operator norm errors less than twice those of the sample covariance.
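To make the pipeline concrete, here is a minimal NumPy sketch in the spirit of the abstract: per-entry dithering scales taken from the data, a triangular dither added before a 4-level (2-bit) quantizer, and a symmetrized cross-product of two independently dithered copies. The scale rule `2.0 * np.std(X, axis=0)` and the quantizer layout are illustrative assumptions, not the authors' exact construction.

```python
import numpy as np

def two_bit_quantize(X, scales, rng):
    # Triangular dither: sum of two independent U[-s/2, s/2] variables per entry.
    tau = (rng.uniform(-0.5, 0.5, size=X.shape)
           + rng.uniform(-0.5, 0.5, size=X.shape)) * scales
    # Midrise uniform quantizer with step `scales`, clipped to 4 levels (2 bits).
    q = np.clip(np.floor((X + tau) / scales), -2, 1)
    return (q + 0.5) * scales  # reproduction levels: +-0.5*s, +-1.5*s

rng = np.random.default_rng(0)
n, d = 2000, 5
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(d), np.arange(d)))
X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)

# Data-driven per-entry dithering scales (a placeholder choice, not the paper's).
scales = 2.0 * np.std(X, axis=0)

# Two independently dithered copies, so the quantization noises of the two
# factors in the outer product are uncorrelated.
Q1 = two_bit_quantize(X, scales, rng)
Q2 = two_bit_quantize(X, scales, rng)
Sigma_hat = (Q1.T @ Q2 + Q2.T @ Q1) / (2 * n)

print(np.linalg.norm(Sigma_hat - Sigma, 2))  # operator norm error
```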
Related papers
- Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate.
We stress that our estimators do not involve nonparametric function estimators and, in particular, do not rely on sample-size-dependent smoothing parameters.
arXiv Detail & Related papers (2024-07-11T13:28:34Z) - A Geometric Unification of Distributionally Robust Covariance Estimators: Shrinking the Spectrum by Inflating the Ambiguity Set [20.166217494056916]
We propose a principled approach to construct covariance estimators without imposing restrictive assumptions.
We show that our robust estimators are efficiently computable and consistent.
Numerical experiments based on synthetic and real data show that our robust estimators are competitive with state-of-the-art estimators.
arXiv Detail & Related papers (2024-05-30T15:01:18Z) - Nearest Neighbor Sampling for Covariate Shift Adaptation [7.940293148084844]
We propose a new covariate shift adaptation method which avoids estimating the weights.
The basic idea is to work directly on the unlabeled target data, labeling each point according to its $k$ nearest neighbors in the source dataset, as sketched below.
Our experiments show that it drastically reduces running time while retaining remarkable accuracy.
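A minimal scikit-learn sketch of that idea; the synthetic data, `n_neighbors=5`, and the downstream logistic model are arbitrary illustrative choices:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
# Labeled source sample and an unlabeled target sample under covariate shift.
Xs = rng.normal(0.0, 1.0, size=(500, 2))
ys = (Xs[:, 0] + Xs[:, 1] > 0).astype(int)
Xt = rng.normal(0.7, 1.0, size=(500, 2))  # shifted covariates, no labels

# Pseudo-label each target point by its k nearest source neighbors, then fit
# the downstream model on the pseudo-labeled target sample -- no importance
# weights are ever estimated.
knn = KNeighborsClassifier(n_neighbors=5).fit(Xs, ys)
model = LogisticRegression().fit(Xt, knn.predict(Xt))
```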
arXiv Detail & Related papers (2023-12-15T17:28:09Z) - Batches Stabilize the Minimum Norm Risk in High Dimensional Overparameterized Linear Regression [12.443289202402761]
We show the benefits of batch-partitioning through the lens of a minimum-norm overparametrized linear regression model.
We characterize the optimal batch size and show it is inversely proportional to the noise level.
We also show that shrinking the batch minimum-norm estimator by a factor equal to the Wiener coefficient further stabilizes it and results in lower quadratic risk in all settings.
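A short NumPy sketch of batch-partitioned minimum-norm regression as described; the batch count and noise level here are illustrative, not the paper's prescribed values:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 200, 400                      # overparameterized regime: d > n
beta = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = X @ beta + 0.5 * rng.normal(size=n)

def batched_min_norm(X, y, n_batches):
    """Average the per-batch minimum-norm least-squares solutions."""
    sols = [np.linalg.pinv(Xb) @ yb
            for Xb, yb in zip(np.array_split(X, n_batches),
                              np.array_split(y, n_batches))]
    return np.mean(sols, axis=0)

beta_full = np.linalg.pinv(X) @ y       # single-batch minimum-norm estimator
beta_batch = batched_min_norm(X, y, 5)  # 5 batches; the paper characterizes
                                        # the optimal count via the noise level
print(np.sum((beta_full - beta) ** 2), np.sum((beta_batch - beta) ** 2))
# A Wiener-style shrinkage factor in (0, 1) applied to beta_batch can lower
# the quadratic risk further.
```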
arXiv Detail & Related papers (2023-06-14T11:02:08Z) - Quantizing Heavy-tailed Data in Statistical Estimation: (Near) Minimax
Rates, Covariate Quantization, and Uniform Recovery [22.267482745621997]
This paper studies the quantization of heavy-tailed data in some fundamental statistical estimation problems.
We propose to truncate and properly dither the data prior to uniform quantization.
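A minimal NumPy sketch of the truncate-then-dither-then-quantize pipeline; the truncation level `tau` and step `delta` are placeholder choices, whereas the paper derives principled ones:

```python
import numpy as np

def truncate_dither_quantize(x, tau, delta, rng):
    """Truncate to [-tau, tau], add uniform dither, then uniformly quantize."""
    xt = np.clip(x, -tau, tau)  # truncation tames the heavy tails
    w = rng.uniform(-delta / 2, delta / 2, size=x.shape)  # uniform dither
    return delta * (np.floor((xt + w) / delta) + 0.5)     # midrise quantizer

rng = np.random.default_rng(3)
x = rng.standard_t(df=3, size=100_000)  # heavy-tailed samples
q = truncate_dither_quantize(x, tau=4.0, delta=0.5, rng=rng)
print(x.mean(), q.mean())  # dithering keeps the quantized mean nearly unbiased
```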
arXiv Detail & Related papers (2022-12-30T06:28:30Z) - Noise Covariance Estimation in Multi-Task High-dimensional Linear Models [8.807375890824977]
This paper studies the multi-task high-dimensional linear regression models where the noise among different tasks is correlated.
Treating the regression coefficients as a nuisance parameter, we leverage the multi-task elastic-net and multi-task lasso estimators to estimate the nuisance.
Under suitable conditions, the proposed estimator attains the same rate of convergence as the "oracle" estimator.
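A rough scikit-learn sketch of the two-step recipe: estimate the coefficient nuisance with the multi-task lasso, then read the noise covariance off the residuals. The plug-in residual covariance below stands in for the paper's bias-corrected estimator, and `alpha=0.1` is an arbitrary regularization level:

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(4)
n, p, T = 300, 50, 4                                    # samples, features, tasks
B = np.zeros((p, T))
B[:5] = rng.normal(size=(5, T))                         # row-sparse coefficients
X = rng.normal(size=(n, p))
L = np.linalg.cholesky(0.5 * np.eye(T) + 0.5)           # correlated noise across tasks
Y = X @ B + rng.normal(size=(n, T)) @ L.T

# Treat the regression coefficients as a nuisance parameter, estimated by the
# multi-task lasso, then estimate the noise covariance from the residuals.
fit = MultiTaskLasso(alpha=0.1).fit(X, Y)
R = Y - fit.predict(X)
Sigma_noise = R.T @ R / n  # plug-in residual covariance (uncorrected sketch)
```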
arXiv Detail & Related papers (2022-06-15T02:37:37Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process (MRP).
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
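One common regularized kernel LSTD formulation matching this description, sketched in NumPy; the Gaussian kernel, bandwidth, and ridge parameter are illustrative assumptions, and the paper's estimator may differ in its details:

```python
import numpy as np

def kernel_td_value(S, S_next, r, gamma, bandwidth, reg):
    """Regularized kernel LSTD sketch: V(s) = sum_i alpha_i k(s_i, s)."""
    def gram(A, B):
        d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return np.exp(-d2 / (2 * bandwidth ** 2))
    K = gram(S, S)            # k(s_i, s_j)
    K_next = gram(S_next, S)  # k(s'_i, s_j)
    # TD fixed point: K alpha = r + gamma * K_next alpha (ridge-regularized).
    alpha = np.linalg.solve(K - gamma * K_next + reg * np.eye(len(r)), r)
    return lambda s: gram(np.atleast_2d(s), S) @ alpha

rng = np.random.default_rng(5)
S = rng.uniform(-1, 1, size=(200, 1))                        # toy MRP states
S_next = np.clip(S + 0.1 * rng.normal(size=S.shape), -1, 1)  # transitions
r = S[:, 0] ** 2 + 0.05 * rng.normal(size=200)               # noisy rewards
V = kernel_td_value(S, S_next, r, gamma=0.9, bandwidth=0.3, reg=1.0)
print(V(np.array([0.5])))
```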
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - Rao-Blackwellizing the Straight-Through Gumbel-Softmax Gradient
Estimator [93.05919133288161]
We show that the variance of the straight-through variant of the popular Gumbel-Softmax estimator can be reduced through Rao-Blackwellization.
This provably reduces the mean squared error.
We empirically demonstrate that this leads to variance reduction, faster convergence, and generally improved performance in two unsupervised latent variable models.
arXiv Detail & Related papers (2020-10-09T22:54:38Z) - Nonparametric Estimation of the Fisher Information and Its Applications [82.00720226775964]
This paper considers the problem of estimation of the Fisher information for location from a random sample of size $n$.
An estimator proposed by Bhattacharya is revisited and improved convergence rates are derived.
A new estimator, termed a clipped estimator, is proposed.
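An illustrative NumPy sketch of a clipped plug-in estimator of Fisher information for location, built from a Gaussian kernel density estimate and its derivative. The bandwidth and clipping level are placeholder choices, and this is not Bhattacharya's exact construction:

```python
import numpy as np

def clipped_fisher_location(x, bandwidth, clip):
    """Clipped plug-in estimate of Fisher information for location (a sketch)."""
    diffs = (x[:, None] - x[None, :]) / bandwidth       # pairwise scaled differences
    kern = np.exp(-0.5 * diffs ** 2) / np.sqrt(2 * np.pi)
    f = kern.mean(axis=1) / bandwidth                   # Gaussian KDE f_hat(x_i)
    f_prime = (-diffs * kern).mean(axis=1) / bandwidth ** 2  # its derivative
    score = np.clip(f_prime / f, -clip, clip)           # clipped score f'/f
    return np.mean(score ** 2)                          # I(f) = E[(f'/f)^2]

rng = np.random.default_rng(6)
x = rng.normal(size=2000)
print(clipped_fisher_location(x, bandwidth=0.4, clip=10.0))  # truth: 1 for N(0,1)
```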
arXiv Detail & Related papers (2020-05-07T17:21:56Z) - SUMO: Unbiased Estimation of Log Marginal Probability for Latent
Variable Models [80.22609163316459]
We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series.
We show that models trained using our estimator give better test-set likelihoods than a standard importance-sampling based approach for the same average computational cost.
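The randomized-truncation device can be sketched generically: sample a geometric truncation level and reweight each series increment by its survival probability, which keeps the estimator unbiased for the series limit. This toy NumPy example targets log 2 rather than a log marginal likelihood:

```python
import numpy as np

def russian_roulette(partial_sum, p_stop, rng):
    """Unbiased estimate of lim_k partial_sum(k) via randomized truncation."""
    K = rng.geometric(p_stop)  # truncation level, support {1, 2, ...}
    est, prev = 0.0, 0.0
    for k in range(1, K + 1):
        cur = partial_sum(k)
        # Reweight the k-th increment by 1 / P(K >= k) = 1 / (1 - p_stop)^(k-1).
        est += (cur - prev) / (1.0 - p_stop) ** (k - 1)
        prev = cur
    return est

# Toy target: the alternating harmonic series, whose limit is log(2).
partial = lambda k: sum((-1) ** (j + 1) / j for j in range(1, k + 1))
rng = np.random.default_rng(7)
draws = [russian_roulette(partial, p_stop=0.2, rng=rng) for _ in range(20_000)]
print(np.mean(draws), np.log(2))  # unbiased; variance depends on increment decay
```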
arXiv Detail & Related papers (2020-04-01T11:49:30Z) - Estimating Gradients for Discrete Random Variables by Sampling without
Replacement [93.09326095997336]
We derive an unbiased estimator for expectations over discrete random variables based on sampling without replacement.
We show that our estimator can be derived as the Rao-Blackwellization of three different estimators.
arXiv Detail & Related papers (2020-02-14T14:15:18Z)