Empirical Strategy for Stretching Probability Distribution in
Neural-network-based Regression
- URL: http://arxiv.org/abs/2009.03534v1
- Date: Tue, 8 Sep 2020 06:08:14 GMT
- Title: Empirical Strategy for Stretching Probability Distribution in
Neural-network-based Regression
- Authors: Eunho Koo and Hyungjun Kim
- Abstract summary: In regression analysis with artificial neural networks, prediction performance depends on determining appropriate weights between layers.
We propose weighted empirical stretching (WES), a novel loss function that increases the overlap area of the predicted and label distributions.
The improved RMSE in the extreme domain is expected to be useful for predicting abnormal events in non-linear complex systems.
- Score: 5.35308390309106
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In regression analysis with artificial neural networks, prediction
performance depends on determining appropriate weights between layers. Because
randomly initialized weights are updated during back-propagation via gradient
descent under a given loss function, the structure of the loss function can
affect performance significantly. In this study, we considered the distribution
error, i.e., the inconsistency between two distributions (those of the
predicted values and the labels), as the prediction error, and proposed
weighted empirical stretching (WES) as a novel loss function that increases the
overlap area of the two distributions. The function depends on the distribution
of the given labels and is therefore applicable to any distribution shape.
Moreover, it contains a scaling hyperparameter whose appropriate value
maximizes the common section of the two distributions. To test the capability
of the function, we generated ideally distributed curves (unimodal, skewed
unimodal, bimodal, and skewed bimodal) as labels and used Fourier-extracted
input data from the curves with a feedforward neural network. In general, WES
outperformed widely used loss functions, and its performance was robust to
various noise levels. The improved RMSE in the extreme domain (i.e., both tail
regions of the distribution) is expected to be useful for predicting abnormal
events in non-linear complex systems such as natural disasters and financial
crises.
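The abstract does not reproduce the WES formula, so the sketch below only illustrates the general idea in Python: each sample's error is weighted by a term derived from the empirical label distribution, controlled by a scaling hyperparameter, so that labels in low-density (tail) regions are emphasized and the predicted distribution is stretched toward the extremes. The histogram density estimate, the inverse-density weighting, and the names `wes_like_loss` and `alpha` are illustrative assumptions, not the authors' definition.

```python
# Hedged sketch of a WES-style weighted loss (illustrative only; not the
# paper's exact formula). Rare labels receive larger weights, emphasizing
# the tail regions of the label distribution.
import numpy as np

def empirical_density(labels, bins=50):
    """Histogram estimate of the label density, evaluated at each label."""
    hist, edges = np.histogram(labels, bins=bins, density=True)
    idx = np.clip(np.digitize(labels, edges[1:-1]), 0, bins - 1)
    return np.maximum(hist[idx], 1e-12)  # guard against zero density

def wes_like_loss(y_pred, y_true, alpha=1.0, bins=50):
    """Squared error weighted by an inverse-density term with exponent alpha."""
    density = empirical_density(y_true, bins=bins)
    weights = (1.0 / density) ** alpha      # tail labels -> larger weight
    weights = weights / weights.mean()      # keep overall scale comparable to MSE
    return np.mean(weights * (y_pred - y_true) ** 2)

# Usage on a skewed label distribution, compared with plain MSE.
rng = np.random.default_rng(0)
y_true = rng.lognormal(mean=0.0, sigma=0.7, size=1000)   # skewed unimodal labels
y_pred = y_true + rng.normal(scale=0.1, size=1000)
print(wes_like_loss(y_pred, y_true, alpha=1.0))
print(np.mean((y_pred - y_true) ** 2))
```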
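The experimental setup is likewise only summarized in the abstract (distribution-shaped curves as labels, Fourier-extracted features as inputs to a feedforward network). A minimal sketch of such a data pipeline follows; the curve construction, window length, and number of Fourier coefficients are assumptions made for illustration, not the paper's settings.

```python
# Illustrative data pipeline loosely following the abstract's description:
# a bimodal curve serves as the label sequence, and low-frequency Fourier
# magnitudes of sliding windows serve as network inputs.
import numpy as np

def bimodal_curve(n=2048):
    """Label curve shaped like a mixture of two Gaussian bumps (assumed form)."""
    x = np.linspace(0.0, 1.0, n)
    return np.exp(-((x - 0.3) ** 2) / 0.005) + 0.6 * np.exp(-((x - 0.7) ** 2) / 0.01)

def fourier_features(curve, window=64, n_coeffs=16):
    """Sliding-window FFT magnitudes as inputs; the value after each window as target."""
    X, y = [], []
    for start in range(len(curve) - window):
        seg = curve[start:start + window]
        X.append(np.abs(np.fft.rfft(seg))[:n_coeffs])  # low-frequency magnitudes
        y.append(curve[start + window])                # regression target
    return np.asarray(X), np.asarray(y)

labels = bimodal_curve()
X, y = fourier_features(labels)
print(X.shape, y.shape)   # (1984, 16) (1984,)
```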
Related papers
- Rejection via Learning Density Ratios [50.91522897152437] (2024-05-29T01:32:17Z)
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
- Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621] (2023-09-02T01:27:53Z)
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple-hypotheses target distribution.
- Learning Theory of Distribution Regression with Neural Networks [6.961253535504979] (2023-07-07T09:49:11Z)
We establish an approximation theory and a learning theory of distribution regression via a fully connected neural network (FNN).
In contrast to classical regression methods, the input variables of distribution regression are probability measures.
- Robust Gaussian Process Regression with Huber Likelihood [2.7184224088243365] (2023-01-19T02:59:33Z)
We propose a robust process model in the Gaussian process framework with the likelihood of observed data expressed as the Huber probability distribution.
The proposed model employs weights based on projection statistics to scale residuals and bound the influence of vertical outliers and bad leverage points on the latent function estimates.
- Reliable amortized variational inference with physics-based latent distribution correction [0.4588028371034407] (2022-07-24T02:38:54Z)
A neural network is trained to approximate the posterior distribution over existing pairs of model and data.
The accuracy of this approach relies on the availability of high-fidelity training data.
We show that our correction step improves the robustness of amortized variational inference with respect to changes in the number of source experiments, noise variance, and shifts in the prior distribution.
- On the Double Descent of Random Features Models Trained with SGD [78.0918823643911] (2021-10-13T17:47:39Z)
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD settings.
We observe the double descent phenomenon both theoretically and empirically.
- The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks [51.1848572349154] (2021-08-25T22:01:01Z)
Neural network models that perfectly fit noisy data can generalize well to unseen test data.
We consider interpolating two-layer linear neural networks trained with gradient flow on the squared loss and derive bounds on the excess risk.
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654] (2021-02-22T07:02:37Z)
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
However, they are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
- Temporal Action Localization with Variance-Aware Networks [12.364819165688628] (2020-08-25T20:12:59Z)
This work addresses the problem of temporal action localization with Variance-Aware Networks (VAN).
VANp is a network that propagates the mean and the variance throughout the network to deliver outputs with second-order statistics.
Results show that VANp surpasses the accuracy of virtually all other two-stage networks without involving any additional parameters.
- Unifying supervised learning and VAEs -- coverage, systematics and goodness-of-fit in normalizing-flow based neural network models for astro-particle reconstructions [0.0] (2020-08-13T11:28:57Z)
Statistical uncertainties, coverage, systematic uncertainties, or a goodness-of-fit measure are often not calculated.
We show that a KL-divergence objective of the joint distribution of data and labels allows supervised learning and variational autoencoders to be unified.
We discuss how to calculate coverage probabilities without numerical integration for specific "base-ordered" contours.
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862] (2020-06-26T13:50:19Z)
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.