On the Efficient Implementation of High Accuracy Optimality of Profile
Maximum Likelihood
- URL: http://arxiv.org/abs/2210.06728v1
- Date: Thu, 13 Oct 2022 04:52:15 GMT
- Title: On the Efficient Implementation of High Accuracy Optimality of Profile
Maximum Likelihood
- Authors: Moses Charikar, Zhihao Jiang, Kirankumar Shiragur, Aaron Sidford
- Abstract summary: Our estimator is based on profile-maximum-likelihood (PML) and is sample optimal for estimating various symmetric properties.
This result improves upon the previous best accuracy threshold of $\epsilon \gg n^{-1/4}$ achievable in polynomial time.
- Score: 33.051282474765955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We provide an efficient unified plug-in approach for estimating symmetric
properties of distributions given $n$ independent samples. Our estimator is
based on profile-maximum-likelihood (PML) and is sample optimal for estimating
various symmetric properties when the estimation error $\epsilon \gg n^{-1/3}$.
This result improves upon the previous best accuracy threshold of $\epsilon \gg
n^{-1/4}$ achievable by polynomial time computable PML-based universal
estimators [ACSS21, ACSS20]. Our estimator reaches a theoretical limit for
universal symmetric property estimation as [Han21] shows that a broad class of
universal estimators (containing many well known approaches including ours)
cannot be sample optimal for every $1$-Lipschitz property when $\epsilon \ll
n^{-1/3}$.
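As background for the abstract above, the following minimal Python sketch (an illustration only, not the paper's algorithm) shows what the profile of a sample is and what a naive plug-in property estimate looks like. The paper's estimator plugs an approximately computed PML distribution into the property of interest; the sketch below plugs in the empirical distribution instead, since computing the PML distribution requires the approximation machinery developed in the paper.

```python
import math
from collections import Counter

def profile(sample):
    # The profile forgets symbol labels and keeps only the multiset of
    # multiplicities, e.g. "abacab" -> [3, 2, 1].
    return sorted(Counter(sample).values(), reverse=True)

def plugin_entropy(sample):
    # Naive plug-in estimate of Shannon entropy (in nats) using the
    # empirical distribution. A PML-based estimator would instead plug in
    # an (approximate) PML distribution fitted to profile(sample).
    n = len(sample)
    return -sum((c / n) * math.log(c / n) for c in Counter(sample).values())

sample = list("abacab")
print(profile(sample))                     # [3, 2, 1]
print(round(plugin_entropy(sample), 3))    # 1.011
```

Because entropy is a symmetric property, its value under any label-invariant estimator depends on the sample only through `profile(sample)`; this is why a single PML-based plug-in estimator can serve many properties at once.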
Related papers
- One Sample Fits All: Approximating All Probabilistic Values Simultaneously and Efficiently [19.265709097637643]
Probabilistic values have gained recent attention in applications like feature attribution and data valuation.
We propose a one-sample-fits-all framework parameterized by a sampling vector to approximate intermediate terms.
We show that our one-for-all estimator achieves the fastest convergence rate on Beta Shapley values.
arXiv Detail & Related papers (2024-10-31T10:47:46Z) - Sparse PCA with Oracle Property [115.72363972222622]
We propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations.
We prove that another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA.
arXiv Detail & Related papers (2023-12-28T02:52:54Z) - General Gaussian Noise Mechanisms and Their Optimality for Unbiased Mean
Estimation [58.03500081540042]
A classical approach to private mean estimation is to compute the true mean and add unbiased, but possibly correlated, Gaussian noise to it.
We show that for every input dataset, any unbiased mean estimator satisfying concentrated differential privacy introduces approximately at least as much error as this Gaussian-noise mechanism.
arXiv Detail & Related papers (2023-01-31T18:47:42Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Optimal Scaling for Locally Balanced Proposals in Discrete Spaces [65.14092237705476]
We show that efficiency of Metropolis-Hastings (M-H) algorithms in discrete spaces can be characterized by an acceptance rate that is independent of the target distribution.
Knowledge of the optimal acceptance rate allows one to automatically tune the neighborhood size of a proposal distribution in a discrete space, directly analogous to step-size control in continuous spaces.
arXiv Detail & Related papers (2022-09-16T22:09:53Z) - Instance Based Approximations to Profile Maximum Likelihood [33.51964370430905]
We provide a new efficient algorithm for approximately computing the profile maximum likelihood (PML) distribution.
We obtain the first provable computationally efficient implementation of PseudoPML, a framework for estimating a broad class of symmetric properties.
arXiv Detail & Related papers (2020-11-05T11:17:19Z) - $\gamma$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a
Robust Divergence Estimator [95.71091446753414]
We propose to use a nearest-neighbor-based $\gamma$-divergence estimator as a data discrepancy measure.
Our method achieves significantly higher robustness than existing discrepancy measures.
arXiv Detail & Related papers (2020-06-13T06:09:27Z) - Extrapolating the profile of a finite population [35.69057741775438]
We study a prototypical problem in empirical Bayes. Namely, consider a population consisting of $k$ individuals each belonging to one of $k$ types.
We show that in the sublinear regime of $m = \omega(k/\log k)$, it is possible to consistently estimate in total variation the profile of the population.
arXiv Detail & Related papers (2020-05-21T10:39:41Z) - A General Framework for Symmetric Property Estimation [35.14819168275207]
We identify the easy region where empirical estimation works and the difficult region where more complex estimators are required.
We show that by approximately computing the profile maximum likelihood (PML) distribution [ADOS16] in this difficult region we obtain a symmetric property estimation framework that is sample complexity optimal for many properties.
arXiv Detail & Related papers (2020-03-02T13:00:04Z) - Profile Entropy: A Fundamental Measure for the Learnability and
Compressibility of Discrete Distributions [63.60499266361255]
We show that for samples of discrete distributions, profile entropy is a fundamental measure unifying the concepts of estimation, inference, and compression.
Specifically, profile entropy a) determines the speed of estimating the distribution relative to the best natural estimator; b) characterizes the rate of inferring all symmetric properties compared with the best estimator over any label-invariant distribution collection; c) serves as the limit of profile compression.
arXiv Detail & Related papers (2020-02-26T17:49:04Z) - Communication-Efficient Distributed Estimator for Generalized Linear
Models with a Diverging Number of Covariates [7.427903819459701]
A novel method is proposed to obtain an asymptotically efficient estimator for large-scale distributed data in two rounds of communication.
In this novel method, the assumption on the number of servers is relaxed, making it practical for real-world applications.
arXiv Detail & Related papers (2020-01-17T08:51:11Z)
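To illustrate the classical private mean estimation approach mentioned in the "General Gaussian Noise Mechanisms" entry above, here is a minimal sketch. The function name `noisy_mean` and the fixed noise scale `sigma` are illustrative assumptions; a real private mechanism calibrates `sigma` to the data's sensitivity and the privacy budget, which is omitted here.

```python
import random

def noisy_mean(data, sigma):
    # Classical Gaussian mechanism for mean estimation: compute the true
    # mean, then add zero-mean Gaussian noise. The estimator is unbiased
    # because the noise has zero mean; the privacy guarantee depends on
    # calibrating sigma to sensitivity (not done in this sketch).
    true_mean = sum(data) / len(data)
    return true_mean + random.gauss(0.0, sigma)

print(noisy_mean([1.0, 2.0, 3.0], sigma=0.0))  # 2.0 with no noise added
```

The entry's result says this simple recipe is essentially unimprovable: any unbiased mean estimator satisfying concentrated differential privacy must incur approximately at least as much error.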
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.