Quantizing Heavy-tailed Data in Statistical Estimation: (Near) Minimax
Rates, Covariate Quantization, and Uniform Recovery
- URL: http://arxiv.org/abs/2212.14562v2
- Date: Wed, 26 Jul 2023 09:10:09 GMT
- Title: Quantizing Heavy-tailed Data in Statistical Estimation: (Near) Minimax
Rates, Covariate Quantization, and Uniform Recovery
- Authors: Junren Chen, Michael K. Ng, Di Wang
- Abstract summary: This paper studies the quantization of heavy-tailed data in some fundamental statistical estimation problems.
We propose to truncate and properly dither the data prior to a uniform quantization.
- Score: 22.267482745621997
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the quantization of heavy-tailed data in some fundamental
statistical estimation problems, where the underlying distributions have
bounded moments of some order. We propose to truncate and properly dither the
data prior to a uniform quantization. Our major standpoint is that (near)
minimax rates of estimation error are achievable merely from the quantized data
produced by the proposed scheme. In particular, concrete results are worked out
for covariance estimation, compressed sensing, and matrix completion, all
agreeing that the quantization only slightly worsens the multiplicative factor.
Besides, we study compressed sensing where both covariate (i.e., sensing
vector) and response are quantized. Under covariate quantization, although our
recovery program is non-convex because the covariance matrix estimator lacks
positive semi-definiteness, all local minimizers are proved to enjoy
near-optimal error bounds. Moreover, via a concentration inequality for product
processes and a covering argument, we establish a near-minimax uniform recovery
guarantee for quantized compressed sensing with heavy-tailed noise.
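As a rough illustration, the truncate-dither-quantize scheme described in the abstract can be sketched as follows. This is a minimal sketch, not the paper's method: the truncation threshold `tau` and quantization step `delta` are hypothetical placeholder parameters (the paper prescribes specific scalings).

```python
import numpy as np

def truncate_dither_quantize(x, tau, delta, rng=None):
    """Sketch of the scheme in the abstract: truncate heavy-tailed data at
    threshold tau, add uniform dither on [-delta/2, delta/2], then apply a
    uniform quantizer with step delta. tau and delta are illustrative
    tuning parameters, not the paper's prescribed choices."""
    rng = np.random.default_rng() if rng is None else rng
    x_trunc = np.clip(x, -tau, tau)                      # truncation step
    dither = rng.uniform(-delta / 2, delta / 2, size=np.shape(x))
    return delta * np.round((x_trunc + dither) / delta)  # uniform quantization

# With uniform dither, the quantizer is conditionally unbiased:
# E[q | x] = clip(x, -tau, tau), so sample averages of the quantized
# data track averages of the truncated data.
rng = np.random.default_rng(0)
x = rng.standard_t(df=3, size=100_000)   # heavy-tailed samples
q = truncate_dither_quantize(x, tau=5.0, delta=0.5, rng=rng)
```

The dither is what makes the quantization error mean-zero; without it, a deterministic rounder would bias downstream estimators.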
Related papers
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z)
- A Geometric Unification of Distributionally Robust Covariance Estimators: Shrinking the Spectrum by Inflating the Ambiguity Set [20.166217494056916]
We propose a principled approach to construct covariance estimators without imposing restrictive assumptions.
We show that our robust estimators are efficiently computable and consistent.
Numerical experiments based on synthetic and real data show that our robust estimators are competitive with state-of-the-art estimators.
arXiv Detail & Related papers (2024-05-30T15:01:18Z)
- A Parameter-Free Two-Bit Covariance Estimator with Improved Operator Norm Error Rate [27.308933056578212]
We propose a new 2-bit covariance matrix estimator that simultaneously addresses both issues.
By employing dithering scales varying across entries, our estimator enjoys an improved operator norm error rate.
Our proposed method eliminates the need of any tuning parameter, as the dithering scales are entirely determined by the data.
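For intuition, the dithered sign-quantization idea behind such 2-bit covariance estimators can be sketched as below. This is a simplified, fixed-scale version and not the summarized paper's parameter-free method: there the dithering scales vary across entries and are determined by the data, whereas here `lam` is a hypothetical user-chosen dither scale.

```python
import numpy as np

def two_bit_covariance(X, lam, rng=None):
    """Simplified sketch of a dithered 2-bit covariance estimator: each
    sample is sign-quantized twice with independent uniform dithers on
    [-lam, lam]. If the entries of X stay within [-lam, lam], then
    E[lam^2 * s1 s2^T] equals E[x x^T], so averaging the quantized outer
    products recovers the second-moment matrix. lam is an illustrative
    fixed scale, not the paper's data-driven, entry-wise choice."""
    rng = np.random.default_rng() if rng is None else rng
    n, _ = X.shape
    t1 = rng.uniform(-lam, lam, size=X.shape)  # first dither
    t2 = rng.uniform(-lam, lam, size=X.shape)  # second, independent dither
    s1 = np.sign(X + t1)                       # first 1-bit observation
    s2 = np.sign(X + t2)                       # second 1-bit observation
    # Symmetrize so the estimate is a symmetric matrix.
    return (lam ** 2) * (s1.T @ s2 + s2.T @ s1) / (2 * n)

# Illustration: estimate a 2x2 covariance from 2-bit observations.
rng = np.random.default_rng(1)
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
X = rng.multivariate_normal(np.zeros(2), Sigma, size=100_000)
S_hat = two_bit_covariance(X, lam=4.0, rng=rng)
```

Because the two dithers are independent, the product of signs is conditionally unbiased even on the diagonal, at the cost of extra variance proportional to `lam**4 / n`.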
arXiv Detail & Related papers (2023-08-30T14:31:24Z)
- Quantized Low-Rank Multivariate Regression with Random Dithering [23.81618208119832]
Low-rank multivariate regression (LRMR) is an important statistical learning model.
We focus on the estimation of the underlying coefficient matrix.
We employ uniform quantization with random dithering, i.e., we add appropriate random noise to the data before quantization.
We propose the constrained Lasso and regularized Lasso estimators, and derive the non-asymptotic error bounds.
arXiv Detail & Related papers (2023-02-22T08:14:24Z)
- Importance sampling for stochastic quantum simulations [68.8204255655161]
We build on the qDrift protocol, which constructs random product formulas by sampling terms from the Hamiltonian according to their coefficients.
We show that the simulation cost can be reduced while achieving the same accuracy by accounting for the individual simulation cost during the sampling stage.
Results are confirmed by numerical simulations performed on a lattice nuclear effective field theory.
arXiv Detail & Related papers (2022-12-12T15:06:32Z)
- High Dimensional Statistical Estimation under One-bit Quantization [27.718986773043643]
One-bit (binary) data are preferable in many applications because of the efficiency in signal storage, processing, transmission, and enhancement of privacy.
In this paper, we study three fundamental statistical estimation problems.
Under both sub-Gaussian and heavy-tailed regimes, new estimators that handle high-dimensional scaling are proposed.
arXiv Detail & Related papers (2022-02-26T15:13:04Z)
- Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process (MRP).
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z)
- Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
- Mean-squared-error-based adaptive estimation of pure quantum states and unitary transformations [0.0]
We propose a method to estimate pure quantum states of a single qudit with high accuracy.
Our method is based on the minimization of the squared error between the complex probability amplitudes of the unknown state and its estimate.
We show that our estimation procedure can be easily extended to estimate unknown unitary transformations acting on a single qudit.
arXiv Detail & Related papers (2020-08-23T00:32:10Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
- Minimax Optimal Estimation of KL Divergence for Continuous Distributions [56.29748742084386]
Estimating Kullback-Leibler divergence from independent and identically distributed samples is an important problem in various domains.
One simple and effective estimator is based on the $k$ nearest neighbor distances between these samples.
arXiv Detail & Related papers (2020-02-26T16:37:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.