High Dimensional Statistical Estimation under One-bit Quantization
- URL: http://arxiv.org/abs/2202.13157v1
- Date: Sat, 26 Feb 2022 15:13:04 GMT
- Title: High Dimensional Statistical Estimation under One-bit Quantization
- Authors: Junren Chen, Cheng-Long Wang, Michael K. Ng, Di Wang
- Abstract summary: One-bit (binary) data are preferable in many applications because of their efficiency in signal storage, processing, and transmission, and the enhanced privacy they afford.
In this paper, we study three fundamental statistical estimation problems.
Under both sub-Gaussian and heavy-tailed regimes, new estimators that handle high-dimensional scaling are proposed.
- Score: 27.718986773043643
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compared with high-precision data, one-bit (binary) data are
preferable in many applications because of their efficiency in signal storage,
processing, and transmission, and the enhanced privacy they afford. In this
paper, we study three fundamental statistical estimation problems, namely
sparse covariance matrix estimation, sparse linear regression, and low-rank
matrix completion, via binary data arising from an easy-to-implement one-bit
quantization process whose typical steps are truncation, dithering, and
quantization. Under both sub-Gaussian and heavy-tailed regimes, we propose new
estimators that handle high-dimensional scaling. In the sub-Gaussian case, we
show that our estimators achieve minimax rates up to logarithmic factors, so
quantization costs nearly nothing from the perspective of the statistical
learning rate. In the heavy-tailed case, we truncate the data before dithering
to achieve a bias-variance trade-off, which yields estimators whose
convergence rates are the square roots of the corresponding minimax rates.
Experimental results on synthetic data are reported to corroborate the
statistical properties of our estimators under one-bit quantization.
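To make the pipeline concrete, below is a minimal NumPy sketch of the truncate, dither, and quantize steps, specialized to covariance estimation from one-bit data. It relies only on the standard dithering identity $\mathbb{E}[\lambda\,\mathrm{sign}(x+\tau)] = x$ for $\tau \sim \mathrm{Unif}[-\lambda,\lambda]$ and $|x| \le \lambda$; the element-wise truncation, the thresholds, and the function names are illustrative assumptions, not the paper's exact construction or tuned parameters.

```python
import numpy as np

def one_bit_quantize(x, lam, rng):
    """Dithered one-bit quantization: b = sign(x + tau), tau ~ Uniform[-lam, lam].
    When |x| <= lam, E[lam * b] = x, so the scaled bit is unbiased for x."""
    tau = rng.uniform(-lam, lam, size=x.shape)
    return np.sign(x + tau)

def one_bit_covariance(X, zeta, lam, rng):
    """Estimate the second-moment matrix E[x x^T] of the zero-mean rows of X
    from one-bit data, following a truncate -> dither -> quantize pipeline:
    zeta is the truncation threshold (the bias-variance trade-off for heavy
    tails) and lam >= zeta is the dither scale that keeps the bits unbiased."""
    n = X.shape[0]
    Xt = np.clip(X, -zeta, zeta)          # truncation step (element-wise)
    B1 = one_bit_quantize(Xt, lam, rng)   # two independent dithers are used
    B2 = one_bit_quantize(Xt, lam, rng)   # so cross-products stay unbiased
    S = (lam ** 2) * (B1.T @ B2) / n
    return (S + S.T) / 2                  # symmetrize the estimate

rng = np.random.default_rng(0)
n, d = 20000, 5
# Heavy-tailed samples: Student-t with 3 degrees of freedom, scaled so the
# true covariance is the identity (the variance of t_3 is 3).
X = rng.standard_t(df=3, size=(n, d)) / np.sqrt(3.0)
print(np.round(one_bit_covariance(X, zeta=4.0, lam=4.0, rng=rng), 2))
```

On this toy input the printed matrix should be close to the identity, with the residual error reflecting both the one-bit noise and the truncation bias.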
Related papers
- Minimally Supervised Learning using Topological Projections in Self-Organizing Maps [55.31182147885694]
We introduce a semi-supervised learning approach based on topological projections in self-organizing maps (SOMs).
Our method first trains SOMs on unlabeled data and then assigns a minimal number of available labeled data points to key best matching units (BMUs).
Our results indicate that the proposed minimally supervised model significantly outperforms traditional regression techniques.
arXiv Detail & Related papers (2024-01-12T22:51:48Z)
- Multi-Fidelity Covariance Estimation in the Log-Euclidean Geometry [0.0]
We introduce a multi-fidelity estimator of covariance matrices that employs the log-Euclidean geometry of the symmetric positive-definite manifold.
We develop an optimal sample allocation scheme that minimizes the mean-squared error of the estimator given a fixed budget.
Evaluations of our approach using data from physical applications demonstrate more accurate metric learning and speedups of more than one order of magnitude compared to benchmarks.
arXiv Detail & Related papers (2023-01-31T16:33:46Z)
- Quantizing Heavy-tailed Data in Statistical Estimation: (Near) Minimax Rates, Covariate Quantization, and Uniform Recovery [22.267482745621997]
This paper studies the quantization of heavy-tailed data in some fundamental statistical estimation problems.
We propose to truncate and properly dither the data prior to a uniform quantization (a minimal sketch of this step appears after this list).
arXiv Detail & Related papers (2022-12-30T06:28:30Z)
- Optimized numerical gradient and Hessian estimation for variational quantum algorithms [0.0]
We show that tunable numerical estimators offer estimation errors that drop exponentially with the number of circuit qubits.
We demonstrate that the scaled parameter-shift estimators beat the standard unscaled ones in estimation accuracy in all situations.
arXiv Detail & Related papers (2022-06-25T12:58:44Z)
- Semi-Supervised Quantile Estimation: Robust and Efficient Inference in High Dimensional Settings [0.5735035463793009]
We consider quantile estimation in a semi-supervised setting, characterized by two available data sets.
We propose a family of semi-supervised estimators for the response quantile(s) based on the two data sets.
arXiv Detail & Related papers (2022-01-25T10:02:23Z)
- Bias-Variance Tradeoffs in Single-Sample Binary Gradient Estimators [100.58924375509659]
The straight-through (ST) estimator has gained popularity due to its simplicity and efficiency.
Several techniques have been proposed to improve on ST while keeping the same low computational complexity.
We conduct a theoretical analysis of the bias and variance of these methods in order to understand the tradeoffs and verify the originally claimed properties.
arXiv Detail & Related papers (2021-10-07T15:16:07Z)
- Diverse Sample Generation: Pushing the Limit of Data-free Quantization [85.95032037447454]
This paper presents a generic Diverse Sample Generation (DSG) scheme for generative data-free post-training quantization and quantization-aware training.
For large-scale image classification tasks, our DSG can consistently outperform existing data-free quantization methods.
arXiv Detail & Related papers (2021-09-01T07:06:44Z)
- Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of the gradients (a minimal sketch of gradient clipping appears after this list).
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
- CoinPress: Practical Private Mean and Covariance Estimation [18.6419638570742]
We present simple differentially private estimators for the mean and covariance of multivariate sub-Gaussian data.
We show that their error rates match the state-of-the-art theoretical bounds, and that they concretely outperform all previous methods.
arXiv Detail & Related papers (2020-06-11T17:17:28Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
- Compressing Large Sample Data for Discriminant Analysis [78.12073412066698]
We consider the computational issues due to large sample size within the discriminant analysis framework.
We propose a new compression approach for reducing the number of training samples for linear and quadratic discriminant analysis.
arXiv Detail & Related papers (2020-05-08T05:09:08Z)
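The truncate-then-dither recipe in the Quantizing Heavy-tailed Data entry above extends to multi-bit outputs via a uniform quantizer with uniform dither. Below is a minimal sketch of that classical construction, assuming a floor-based quantizer $Q_\delta(z) = \delta \lfloor z/\delta \rfloor$ with dither $\tau \sim \mathrm{Unif}[0, \delta]$; the truncation threshold, the resolution, and the function name are illustrative, not the cited paper's exact design.

```python
import numpy as np

def dithered_uniform_quantize(x, delta, rng):
    """Uniform quantizer with resolution delta and dither tau ~ Uniform[0, delta].
    Writing x = delta * (k + r) with integer k and r in [0, 1),
    floor((x + tau) / delta) equals k with probability 1 - r and k + 1 with
    probability r, so E[delta * floor((x + tau) / delta)] = x exactly."""
    tau = rng.uniform(0.0, delta, size=x.shape)
    return delta * np.floor((x + tau) / delta)

rng = np.random.default_rng(1)
x = rng.standard_t(df=3, size=100_000)   # heavy-tailed samples
x_trunc = np.clip(x, -5.0, 5.0)          # truncation: trades bias for variance
q = dithered_uniform_quantize(x_trunc, delta=1.0, rng=rng)
print(q.mean(), x.mean())                # the two sample means should be close
```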
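The clipped gradient descent mentioned in the Heavy-tailed Streaming Statistical Estimation entry can be illustrated on the simplest streaming problem, mean estimation. The sketch below is a toy under stated assumptions, not the cited paper's algorithm: the squared loss, the $1/t$ step-size schedule, the clipping level, and the function name are all illustrative.

```python
import numpy as np

def clipped_sgd_mean(stream, clip):
    """Streaming mean estimation via clipped gradient descent. For the squared
    loss 0.5 * (theta - x)**2 the per-sample gradient is theta - x; clipping
    it at `clip` tames heavy-tailed noise at the price of a small bias, the
    same bias-variance trade-off as truncating data before dithering."""
    theta = 0.0
    for t, x in enumerate(stream, start=1):
        g = theta - x                    # gradient on the current sample
        g = max(-clip, min(clip, g))     # clip to control the variance
        theta -= g / t                   # Robbins-Monro step size 1/t
    return theta

rng = np.random.default_rng(2)
stream = 1.0 + rng.standard_t(df=2.5, size=50_000)  # true mean is 1.0
print(clipped_sgd_mean(stream, clip=10.0))           # should be close to 1.0
```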
This list is automatically generated from the titles and abstracts of the papers on this site.