On Tail Decay Rate Estimation of Loss Function Distributions
- URL: http://arxiv.org/abs/2306.02807v1
- Date: Mon, 5 Jun 2023 11:58:25 GMT
- Title: On Tail Decay Rate Estimation of Loss Function Distributions
- Authors: Etrit Haxholli, Marco Lorenzi
- Abstract summary: We develop a novel theory for estimating the tails of marginal distributions.
We show that under some regularity conditions, the shape parameter of the marginal distribution is the maximum tail shape parameter of the family of conditional distributions.
- Score: 5.33024001730262
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The study of loss function distributions is critical to characterize a
model's behaviour on a given machine learning problem. For example, while the
quality of a model is commonly determined by the average loss assessed on a
testing set, this quantity does not reflect the existence of the true mean of
the loss distribution. Indeed, the finiteness of the statistical moments of the
loss distribution is related to the thickness of its tails, which are generally
unknown. Since typical cross-validation schemes determine a family of testing
loss distributions conditioned on the training samples, the total loss
distribution must be recovered by marginalizing over the space of training
sets. As we show in this work, the finiteness of the sampling procedure
negatively affects the reliability and efficiency of classical tail estimation
methods from Extreme Value Theory, such as the Peaks-Over-Threshold
approach. In this work we tackle this issue by developing a novel general
theory for estimating the tails of marginal distributions, when there exists a
large variability between locations of the individual conditional distributions
underlying the marginal. To this end, we demonstrate that under some regularity
conditions, the shape parameter of the marginal distribution is the maximum
tail shape parameter of the family of conditional distributions. We term this
estimation approach Cross Tail Estimation (CTE). We test cross-tail
estimation in a series of experiments on simulated and real data, showing the
improved robustness and quality of tail estimation as compared to classical
approaches, and providing evidence for the relationship between overfitting and
loss distribution tail thickness.
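The central claim of the abstract, that the marginal shape parameter equals the maximum shape parameter over the conditional distributions, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the three shifted Pareto "conditional loss distributions", their locations, the sample sizes, and the function names are hypothetical. The Pickands estimator is swapped in for the Peaks-Over-Threshold fit mentioned in the abstract, only because it needs no external library and is invariant to location shifts.

```python
import math
import random


def pickands_shape(sample, k):
    """Pickands estimate of the tail shape parameter xi.

    Uses the k-th, 2k-th and 4k-th largest observations, so it is
    unaffected by adding a constant to every observation.
    """
    if 4 * k > len(sample):
        raise ValueError("need at least 4 * k observations")
    ordered = sorted(sample, reverse=True)
    x_k, x_2k, x_4k = ordered[k - 1], ordered[2 * k - 1], ordered[4 * k - 1]
    return math.log((x_k - x_2k) / (x_2k - x_4k)) / math.log(2.0)


def cross_tail_shape(conditional_samples, k):
    """Maximum of the per-conditional shape estimates."""
    return max(pickands_shape(s, k) for s in conditional_samples)


rng = random.Random(0)

# Hypothetical setup: each conditional distribution is a Pareto with its own
# tail index alpha (shape xi = 1 / alpha), shifted to its own location.
alphas = [4.0, 2.0, 1.25]
locations = [0.0, 50.0, 5.0]
n = 40000  # observations per conditional distribution
k = 1000   # tail fraction parameter of the Pickands estimator

conditionals = [
    [loc + rng.paretovariate(a) for _ in range(n)]
    for a, loc in zip(alphas, locations)
]

per_conditional = [pickands_shape(s, k) for s in conditionals]
cte_estimate = cross_tail_shape(conditionals, k)

# Baseline: pool all observations and estimate the tail of the marginal
# directly, ignoring which conditional distribution each value came from.
pooled = [x for s in conditionals for x in s]
pooled_estimate = pickands_shape(pooled, k)

true_max_shape = max(1.0 / a for a in alphas)
print("per-conditional estimates:", per_conditional)
print("max over conditionals:    ", cte_estimate)
print("pooled estimate:          ", pooled_estimate)
print("true maximum shape:       ", true_max_shape)
```

In this toy setup the heaviest-tailed component sits at a lower location than another component, so the largest pooled observations at a finite sample size are not drawn only from the heaviest tail. Comparing the pooled estimate with the maximum over conditionals gives a rough feel for the effect the abstract describes; it is not a reproduction of the paper's experiments.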
Related papers
- Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependencies for general score-mismatched diffusion samplers.
We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions.
This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z) - Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We provide sharp asymptotics for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We further extend our analysis to the case where the test point has non-trivial correlations with the training set, a setting often encountered in time series forecasting.
We validate our theory across a variety of high dimensional data.
arXiv Detail & Related papers (2024-08-08T17:27:29Z) - The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of the predictive model to assess downstream uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z) - A Bayesian Semiparametric Method For Estimating Causal Quantile Effects [1.1118668841431563]
We propose a semiparametric conditional distribution regression model that allows inference on any functionals of counterfactual distributions.
We show via simulations that the use of double balancing score for confounding adjustment improves performance over adjusting for any single score alone.
We apply the proposed method to the North Carolina birth weight dataset to analyze the effect of maternal smoking on infant's birth weight.
arXiv Detail & Related papers (2022-11-03T05:15:18Z) - Estimating the Contamination Factor's Distribution in Unsupervised Anomaly Detection [7.174572371800215]
Anomaly detection methods identify examples that do not follow the expected behaviour.
The proportion of examples marked as anomalies equals the expected proportion of anomalies, called the contamination factor.
We introduce a method for estimating the posterior distribution of the contamination factor of a given unlabeled dataset.
arXiv Detail & Related papers (2022-10-19T11:51:25Z) - Reliable amortized variational inference with physics-based latent distribution correction [0.4588028371034407]
A neural network is trained to approximate the posterior distribution over existing pairs of model and data.
The accuracy of this approach relies on the availability of high-fidelity training data.
We show that our correction step improves the robustness of amortized variational inference with respect to changes in number of source experiments, noise variance, and shifts in the prior distribution.
arXiv Detail & Related papers (2022-07-24T02:38:54Z) - Robust Estimation for Nonparametric Families via Generative Adversarial Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends these to robust mean estimation, second moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - Predicting with Confidence on Unseen Distributions [90.68414180153897]
We connect domain adaptation and predictive uncertainty literature to predict model accuracy on challenging unseen distributions.
We find that the difference of confidences (DoC) of a classifier's predictions successfully estimates the classifier's performance change over a variety of shifts.
We specifically investigate the distinction between synthetic and natural distribution shifts and observe that despite its simplicity DoC consistently outperforms other quantifications of distributional difference.
arXiv Detail & Related papers (2021-07-07T15:50:18Z) - The Hidden Uncertainty in a Neural Networks Activations [105.4223982696279]
The distribution of a neural network's latent representations has been successfully used to detect out-of-distribution (OOD) data.
This work investigates whether this distribution correlates with a model's epistemic uncertainty, thus indicating its ability to generalise to novel inputs.
arXiv Detail & Related papers (2020-12-05T17:30:35Z) - Distributionally Robust Parametric Maximum Likelihood Estimation [13.09499764232737]
We propose a distributionally robust maximum likelihood estimator that minimizes the worst-case expected log-loss uniformly over a parametric nominal distribution.
Our novel robust estimator also enjoys statistical consistency and delivers promising empirical results in both regression and classification tasks.
arXiv Detail & Related papers (2020-10-11T19:05:49Z) - Empirical Strategy for Stretching Probability Distribution in Neural-network-based Regression [5.35308390309106]
In regression analysis under artificial neural networks, the prediction performance depends on determining the appropriate weights between layers.
We proposed weighted empirical stretching (WES) as a novel loss function to increase the overlap area of the two distributions.
The improved results in RMSE for the extreme domain are expected to be utilized for prediction of abnormal events in non-linear complex systems.
arXiv Detail & Related papers (2020-09-08T06:08:14Z)