Robust Estimation for Kernel Exponential Families with Smoothed Total Variation Distances
- URL: http://arxiv.org/abs/2410.20760v1
- Date: Mon, 28 Oct 2024 05:50:47 GMT
- Title: Robust Estimation for Kernel Exponential Families with Smoothed Total Variation Distances
- Authors: Takafumi Kanamori, Kodai Yokoyama, Takayuki Kawashima,
- Abstract summary: In statistical inference, we commonly assume that samples are independent and identically distributed from a probability distribution.
In this paper, we explore the application of GAN-like estimators to a general class of statistical models.
- Score: 2.317910166616341
- License:
- Abstract: In statistical inference, we commonly assume that samples are independent and identically distributed from a probability distribution included in a pre-specified statistical model. However, such an assumption is often violated in practice. Even an unexpected extreme sample called an {\it outlier} can significantly impact classical estimators. Robust statistics studies how to construct reliable statistical methods that efficiently work even when the ideal assumption is violated. Recently, some works revealed that robust estimators such as Tukey's median are well approximated by the generative adversarial net (GAN), a popular learning method for complex generative models using neural networks. GAN is regarded as a learning method using integral probability metrics (IPM), which is a discrepancy measure for probability distributions. In most theoretical analyses of Tukey's median and its GAN-based approximation, however, the Gaussian or elliptical distribution is assumed as the statistical model. In this paper, we explore the application of GAN-like estimators to a general class of statistical models. As the statistical model, we consider the kernel exponential family that includes both finite and infinite-dimensional models. To construct a robust estimator, we propose the smoothed total variation (STV) distance as a class of IPMs. Then, we theoretically investigate the robustness properties of the STV-based estimators. Our analysis reveals that the STV-based estimator is robust against the distribution contamination for the kernel exponential family. Furthermore, we analyze the prediction accuracy of a Monte Carlo approximation method, which circumvents the computational difficulty of the normalization constant.
Related papers
- Statistical Inference in Tensor Completion: Optimal Uncertainty Quantification and Statistical-to-Computational Gaps [7.174572371800217]
This paper presents a simple yet efficient method for statistical inference of tensor linear forms using incomplete and noisy observations.
It is suitable for various statistical inference tasks, including constructing confidence intervals, inference under heteroskedastic and sub-exponential noise, and simultaneous testing.
arXiv Detail & Related papers (2024-10-15T03:09:52Z) - Transformer-based Parameter Estimation in Statistics [0.0]
We propose a transformer-based approach to parameter estimation.
It does not even require knowing the probability density function, which is needed by numerical methods.
It is shown that our approach achieves similar or better accuracy as measured by mean-square-errors.
arXiv Detail & Related papers (2024-02-28T04:30:41Z) - Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the sample regime and in the finite regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z) - Robust Estimation for Nonparametric Families via Generative Adversarial
Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extend these to robust mean estimation, second moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - Learning to Estimate Without Bias [57.82628598276623]
Gauss theorem states that the weighted least squares estimator is a linear minimum variance unbiased estimation (MVUE) in linear models.
In this paper, we take a first step towards extending this result to non linear settings via deep learning with bias constraints.
A second motivation to BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
arXiv Detail & Related papers (2021-10-20T12:25:22Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional
Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - Non-Asymptotic Performance Guarantees for Neural Estimation of
$\mathsf{f}$-Divergences [22.496696555768846]
Statistical distances quantify the dissimilarity between probability distributions.
A modern method for estimating such distances from data relies on parametrizing a variational form by a neural network (NN) and optimizing it.
This paper explores this tradeoff by means of non-asymptotic error bounds, focusing on three popular choices of SDs.
arXiv Detail & Related papers (2021-03-11T19:47:30Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $varepsilon*$, which deviates substantially from the test error of worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - Asymptotic Analysis of Sampling Estimators for Randomized Numerical
Linear Algebra Algorithms [43.134933182911766]
We develop an analysis to derive the distribution of RandNLA sampling estimators for the least-squares problem.
We identify optimal sampling probabilities based on the Asymptotic Mean Squared Error (AMSE) and the Expected Asymptotic Mean Squared Error (EAMSE)
Our theoretical results clarify the role of leverage in the sampling process, and our empirical results demonstrate improvements over existing methods.
arXiv Detail & Related papers (2020-02-24T20:34:50Z) - Maximum likelihood estimation and uncertainty quantification for
Gaussian process approximation of deterministic functions [10.319367855067476]
This article provides one of the first theoretical analyses in the context of Gaussian process regression with a noiseless dataset.
We show that the maximum likelihood estimation of the scale parameter alone provides significant adaptation against misspecification of the Gaussian process model.
arXiv Detail & Related papers (2020-01-29T17:20:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.