Finite-Sample Maximum Likelihood Estimation of Location
- URL: http://arxiv.org/abs/2206.02348v1
- Date: Mon, 6 Jun 2022 04:33:41 GMT
- Title: Finite-Sample Maximum Likelihood Estimation of Location
- Authors: Shivam Gupta, Jasper C.H. Lee, Eric Price, Paul Valiant
- Abstract summary: For fixed $f$, the maximum-likelihood estimate (MLE) is well known to be optimal in the limit as $n \to \infty$.
We show for arbitrary $f$ and $n$ that one can recover a similar theory based on the Fisher information of a smoothed version of $f$, where the smoothing radius decays with $n$.
- Score: 16.44999338864628
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider 1-dimensional location estimation, where we estimate a parameter
$\lambda$ from $n$ samples $\lambda + \eta_i$, with each $\eta_i$ drawn i.i.d.
from a known distribution $f$. For fixed $f$ the maximum-likelihood estimate
(MLE) is well-known to be optimal in the limit as $n \to \infty$: it is
asymptotically normal with variance matching the Cram\'er-Rao lower bound of
$\frac{1}{n\mathcal{I}}$, where $\mathcal{I}$ is the Fisher information of $f$.
However, this bound does not hold for finite $n$, or when $f$ varies with $n$.
We show for arbitrary $f$ and $n$ that one can recover a similar theory based
on the Fisher information of a smoothed version of $f$, where the smoothing
radius decays with $n$.
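A minimal numeric sketch of the setting (illustrative only, not the paper's finite-$n$ analysis): estimate a location $\lambda$ from samples $\lambda + \eta_i$ with known noise density $f$, here taken to be Laplace$(0,1)$. The MLE maximizes $\sum_i \log f(x_i - \lambda)$; under Laplace noise this maximizer is the sample median, the Fisher information is $\mathcal{I} = 1$, and the asymptotic Cramér-Rao variance is $\frac{1}{n\mathcal{I}} = \frac{1}{n}$.

```python
# Illustrative sketch, assuming Laplace(0, 1) noise (not the paper's estimator).
import math
import random

def sample_laplace():
    # Inverse-CDF sampling from Laplace(0, 1)
    u = random.random() - 0.5
    return -math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def log_likelihood(samples, lam):
    # sum_i log f(x_i - lam), with log f(e) = -|e| - log 2 for Laplace(0, 1)
    return sum(-abs(x - lam) - math.log(2.0) for x in samples)

def mle_location(samples, half_width=2.0, steps=801):
    # Grid search for the log-likelihood maximizer, centered at the sample
    # median (which is the exact MLE under Laplace noise)
    center = sorted(samples)[len(samples) // 2]
    grid = [center - half_width + 2.0 * half_width * i / (steps - 1)
            for i in range(steps)]
    return max(grid, key=lambda lam: log_likelihood(samples, lam))

random.seed(0)
true_lam, n = 3.0, 400
samples = [true_lam + sample_laplace() for _ in range(n)]
estimate = mle_location(samples)
```

With $n = 400$ the estimate should land within a few standard deviations of $\lambda$, i.e. within roughly $\sqrt{1/n} = 0.05$ of the truth.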
Related papers
- Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch.
We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work.
We also show an $\Omega(k^{2/p} n^{1-2/p})$ lower bound for the sparse recovery problem, which is tight up to a $\mathrm{poly}(\log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z) - Revisiting Step-Size Assumptions in Stochastic Approximation [1.3654846342364308]
The paper revisits step-size selection in a general Markovian setting.
A major conclusion is that the choice of $\rho = 0$, or even $\rho < 1/2$, is justified only in select settings.
arXiv Detail & Related papers (2024-05-28T05:11:05Z) - On the $O(\frac{\sqrt{d}}{T^{1/4}})$ Convergence Rate of RMSProp and Its Momentum Extension Measured by $\ell_1$ Norm [59.65871549878937]
This paper considers RMSProp and its momentum extension and establishes a convergence rate of $\frac{1}{T}\sum_{k=1}^{T} \mathbb{E}\big[\|\nabla f(x_k)\|_1\big] \le O\big(\frac{\sqrt{d}}{T^{1/4}}\big)$.
Our convergence rate matches the lower bound with respect to all the coefficients except the dimension $d$.
Our convergence rate can be considered analogous to the $\frac{1}{T}\sum_{k=1}^{T} \mathbb{E}\big[\|\nabla f(x_k)\|_2^2\big] \le O\big(\frac{1}{\sqrt{T}}\big)$ rate of SGD.
arXiv Detail & Related papers (2024-02-01T07:21:32Z) - $L^1$ Estimation: On the Optimality of Linear Estimators [64.76492306585168]
This work shows that the only prior distribution on $X$ that induces linearity in the conditional median is Gaussian.
In particular, it is demonstrated that if the conditional distribution $P_{X|Y=y}$ is symmetric for all $y$, then $X$ must follow a Gaussian distribution.
arXiv Detail & Related papers (2023-09-17T01:45:13Z) - Finite-Sample Symmetric Mean Estimation with Fisher Information Rate [15.802475232604667]
The mean of an unknown distribution with variance $\sigma^2$ can be estimated from $n$ samples with variance $\frac{\sigma^2}{n}$ and a nearly corresponding subgaussian rate.
When $f$ is known up to translation, this can be improved to $\frac{1}{n\mathcal{I}}$, where $\mathcal{I}$ is the Fisher information of the distribution.
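A Monte Carlo sketch of this gap, under an assumed Laplace$(0,1)$ noise model (illustrative, not taken from the paper): here $\sigma^2 = 2$ and $\mathcal{I} = 1$, so the location MLE (the sample median) should have variance near $\frac{1}{n\mathcal{I}} = \frac{1}{n}$, improving on the sample mean's $\frac{\sigma^2}{n} = \frac{2}{n}$.

```python
# Monte Carlo comparison of mean vs. median variance under Laplace(0, 1) noise.
import math
import random

def sample_laplace():
    # Inverse-CDF sampling from Laplace(0, 1)
    u = random.random() - 0.5
    return -math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

random.seed(1)
n, trials = 200, 2000
mean_errs, median_errs = [], []
for _ in range(trials):
    xs = sorted(sample_laplace() for _ in range(n))
    mean_errs.append(sum(xs) / n)                            # sample-mean error
    median_errs.append(0.5 * (xs[n // 2 - 1] + xs[n // 2]))  # sample-median error

var_mean = sum(e * e for e in mean_errs) / trials      # ~ sigma^2 / n = 0.010
var_median = sum(e * e for e in median_errs) / trials  # ~ 1 / (n * I) = 0.005
```

The empirical median variance should come out close to half the mean's, matching the Fisher-information rate.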
arXiv Detail & Related papers (2023-06-28T21:31:46Z) - High-dimensional Location Estimation via Norm Concentration for Subgamma Vectors [15.802475232604667]
In location estimation, we are given $n$ samples from a known distribution $f$ shifted by an unknown translation $\lambda$.
Asymptotically, the maximum likelihood estimate achieves the Cramér-Rao bound, with error distributed as $\mathcal{N}(0, \frac{1}{n\mathcal{I}})$.
We build on the theory of smoothed estimators to bound the error for finite $n$ in terms of $\mathcal{I}_r$, the Fisher information of the $r$-smoothed distribution.
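The smoothed Fisher information $\mathcal{I}_r$ can be sketched numerically (an assumed setup, not the papers' code): convolve $f$ with a Gaussian of width $r$ and integrate $\frac{(f_r')^2}{f_r}$. For $f = $ Laplace$(0,1)$, $\mathcal{I} = 1$, and smoothing strictly decreases the Fisher information, with $\mathcal{I}_r \to \mathcal{I}$ as $r \to 0$.

```python
# Numeric sketch of the r-smoothed Fisher information of Laplace(0, 1).
import math

def smoothed_density_grid(r, lo=-15.0, hi=15.0, n=3001):
    # Discretize f = Laplace(0, 1) and convolve with a Gaussian kernel N(0, r^2)
    h = (hi - lo) / (n - 1)
    xs = [lo + i * h for i in range(n)]
    f = [math.exp(-abs(x)) / 2.0 for x in xs]
    half = int(6 * r / h)  # truncate the kernel at 6 standard deviations
    kernel = [math.exp(-((k * h) ** 2) / (2 * r * r))
              / (r * math.sqrt(2 * math.pi)) * h
              for k in range(-half, half + 1)]
    fr = []
    for i in range(n):
        s = 0.0
        for k in range(-half, half + 1):
            j = i - k
            if 0 <= j < n:
                s += f[j] * kernel[k + half]
        fr.append(s)
    return xs, fr

def fisher_information(xs, fs):
    # I = integral of (f'(x))^2 / f(x) dx, via central differences on the grid
    h = xs[1] - xs[0]
    total = 0.0
    for i in range(1, len(xs) - 1):
        if fs[i] > 1e-12:
            df = (fs[i + 1] - fs[i - 1]) / (2 * h)
            total += df * df / fs[i] * h
    return total

I_r05 = fisher_information(*smoothed_density_grid(0.5))  # heavier smoothing
I_r01 = fisher_information(*smoothed_density_grid(0.1))  # lighter smoothing
```

The heavier the smoothing, the smaller $\mathcal{I}_r$, so one should observe $\mathcal{I}_{0.5} < \mathcal{I}_{0.1} < 1$.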
arXiv Detail & Related papers (2023-02-05T22:17:04Z) - A spectral least-squares-type method for heavy-tailed corrupted regression with unknown covariance & heterogeneous noise [2.019622939313173]
We revisit heavy-tailed corrupted least-squares linear regression, assuming a corrupted $n$-sized label-feature sample with at most $\epsilon n$ arbitrary outliers.
We propose a near-optimal computationally tractable estimator, based on the power method, assuming no knowledge of $(\Sigma, \Xi)$ nor of the operator norm of $\Xi$.
arXiv Detail & Related papers (2022-09-06T23:37:31Z) - Sparse sketches with small inversion bias [79.77110958547695]
Inversion bias arises when averaging estimates of quantities that depend on the inverse covariance.
We develop a framework for analyzing inversion bias, based on our proposed concept of an $(\epsilon,\delta)$-unbiased estimator for random matrices.
We show that when the sketching matrix $S$ is dense and has i.i.d. sub-gaussian entries, the estimator is $(\epsilon,\delta)$-unbiased for $(A^\top A)^{-1}$ with a sketch of size $m = O(d + \sqrt{d}/\epsilon)$.
arXiv Detail & Related papers (2020-11-21T01:33:15Z) - Out-of-sample error estimate for robust M-estimators with convex penalty [5.33024001730262]
A generic out-of-sample error estimate is proposed for robust $M$-estimators regularized with a convex penalty.
General differentiable loss functions $\rho$ are allowed, provided that $\psi = \rho'$ is 1-Lipschitz.
arXiv Detail & Related papers (2020-08-26T21:50:41Z) - Average Case Column Subset Selection for Entrywise $\ell_1$-Norm Loss [76.02734481158458]
It is known that in the worst case, to obtain a good rank-$k$ approximation to a matrix, one needs an arbitrarily large $n^{\Omega(1)}$ number of columns.
We show that under certain minimal and realistic distributional settings, it is possible to obtain a $(k/\epsilon)$-approximation with a nearly linear running time and $\mathrm{poly}(k/\epsilon) + O(k \log n)$ columns.
This is the first algorithm of any kind for achieving a $(k/\epsilon)$-approximation for entrywise $\ell_1$-norm loss.
arXiv Detail & Related papers (2020-04-16T22:57:06Z) - Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.