Empirical Strategy for Stretching Probability Distribution in
Neural-network-based Regression
- URL: http://arxiv.org/abs/2009.03534v1
- Date: Tue, 8 Sep 2020 06:08:14 GMT
- Title: Empirical Strategy for Stretching Probability Distribution in
Neural-network-based Regression
- Authors: Eunho Koo and Hyungjun Kim
- Abstract summary: In regression analysis with artificial neural networks, prediction performance depends on determining appropriate weights between layers.
We propose weighted empirical stretching (WES), a novel loss function that increases the overlap area of the predicted and label distributions.
The improved RMSE in the extreme domain is expected to be useful for predicting abnormal events in non-linear complex systems.
- Score: 5.35308390309106
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In regression analysis with artificial neural networks, prediction
performance depends on determining appropriate weights between layers. Because
randomly initialized weights are updated during back-propagation via gradient
descent under a given loss function, the structure of the loss function can
affect performance significantly. In this study, we considered the distribution
error, i.e., the inconsistency between two distributions (those of the
predicted values and the labels), as the prediction error, and proposed
weighted empirical stretching (WES) as a novel loss function that increases the
overlap area of the two distributions. The function depends on the distribution
of the given labels and is therefore applicable to any distribution shape.
Moreover, it contains a scaling hyperparameter whose appropriate value
maximizes the common section of the two distributions. To test the capability
of the function, we generated ideally distributed curves (unimodal, skewed
unimodal, bimodal, and skewed bimodal) as labels and used Fourier-extracted
input data from the curves with a feedforward neural network. In general, WES
outperformed widely used loss functions, and its performance was robust to
various noise levels. The improved RMSE in the extreme domain (i.e., both tail
regions of the distribution) is expected to be useful for predicting abnormal
events in non-linear complex systems such as natural disasters and financial
crises.
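The abstract does not reproduce the WES formula, so the sketch below only illustrates the general idea in Python: each sample's error is weighted by a term derived from the empirical label distribution, controlled by a scaling hyperparameter, so that labels in low-density (tail) regions are emphasized and the predicted distribution is stretched toward the extremes. The histogram density estimate, the inverse-density weighting, and the names `wes_like_loss` and `alpha` are illustrative assumptions, not the authors' definition.

```python
# Hedged sketch of a WES-style weighted loss (illustrative only; not the
# paper's exact formula). Rare labels receive larger weights, emphasizing
# the tail regions of the label distribution.
import numpy as np

def empirical_density(labels, bins=50):
    """Histogram estimate of the label density, evaluated at each label."""
    hist, edges = np.histogram(labels, bins=bins, density=True)
    idx = np.clip(np.digitize(labels, edges[1:-1]), 0, bins - 1)
    return np.maximum(hist[idx], 1e-12)  # guard against zero density

def wes_like_loss(y_pred, y_true, alpha=1.0, bins=50):
    """Squared error weighted by an inverse-density term with exponent alpha."""
    density = empirical_density(y_true, bins=bins)
    weights = (1.0 / density) ** alpha      # tail labels -> larger weight
    weights = weights / weights.mean()      # keep overall scale comparable to MSE
    return np.mean(weights * (y_pred - y_true) ** 2)

# Usage on a skewed label distribution, compared with plain MSE.
rng = np.random.default_rng(0)
y_true = rng.lognormal(mean=0.0, sigma=0.7, size=1000)   # skewed unimodal labels
y_pred = y_true + rng.normal(scale=0.1, size=1000)
print(wes_like_loss(y_pred, y_true, alpha=1.0))
print(np.mean((y_pred - y_true) ** 2))
```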
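The experimental setup is likewise only summarized in the abstract (distribution-shaped curves as labels, Fourier-extracted features as inputs to a feedforward network). A minimal sketch of such a data pipeline follows; the curve construction, window length, and number of Fourier coefficients are assumptions made for illustration, not the paper's settings.

```python
# Illustrative data pipeline loosely following the abstract's description:
# a bimodal curve serves as the label sequence, and low-frequency Fourier
# magnitudes of sliding windows serve as network inputs.
import numpy as np

def bimodal_curve(n=2048):
    """Label curve shaped like a mixture of two Gaussian bumps (assumed form)."""
    x = np.linspace(0.0, 1.0, n)
    return np.exp(-((x - 0.3) ** 2) / 0.005) + 0.6 * np.exp(-((x - 0.7) ** 2) / 0.01)

def fourier_features(curve, window=64, n_coeffs=16):
    """Sliding-window FFT magnitudes as inputs; the value after each window as target."""
    X, y = [], []
    for start in range(len(curve) - window):
        seg = curve[start:start + window]
        X.append(np.abs(np.fft.rfft(seg))[:n_coeffs])  # low-frequency magnitudes
        y.append(curve[start + window])                # regression target
    return np.asarray(X), np.asarray(y)

labels = bimodal_curve()
X, y = fourier_features(labels)
print(X.shape, y.shape)   # (1984, 16) (1984,)
```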
Related papers
- Rejection via Learning Density Ratios [50.91522897152437] (2024-05-29T01:32:17Z)
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
- Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621] (2023-09-02T01:27:53Z)
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple-hypotheses target distribution.
- Learning Theory of Distribution Regression with Neural Networks [6.961253535504979] (2023-07-07T09:49:11Z)
We establish an approximation theory and a learning theory of distribution regression via a fully connected neural network (FNN).
In contrast to classical regression methods, the input variables of distribution regression are probability measures.
- Robust Gaussian Process Regression with Huber Likelihood [2.7184224088243365] (2023-01-19T02:59:33Z)
We propose a robust process model in the Gaussian process framework with the likelihood of observed data expressed as the Huber probability distribution.
The proposed model employs weights based on projection statistics to scale residuals and bound the influence of vertical outliers and bad leverage points on the latent function estimates.
- Reliable amortized variational inference with physics-based latent distribution correction [0.4588028371034407] (2022-07-24T02:38:54Z)
A neural network is trained to approximate the posterior distribution over existing pairs of model and data.
The accuracy of this approach relies on the availability of high-fidelity training data.
We show that our correction step improves the robustness of amortized variational inference with respect to changes in the number of source experiments, noise variance, and shifts in the prior distribution.
- On the Double Descent of Random Features Models Trained with SGD [78.0918823643911] (2021-10-13T17:47:39Z)
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD settings.
We observe the double descent phenomenon both theoretically and empirically.
- The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks [51.1848572349154] (2021-08-25T22:01:01Z)
Neural network models that perfectly fit noisy data can generalize well to unseen test data.
We consider interpolating two-layer linear neural networks trained with gradient flow on the squared loss and derive bounds on the excess risk.
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654] (2021-02-22T07:02:37Z)
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
However, they are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
- Temporal Action Localization with Variance-Aware Networks [12.364819165688628] (2020-08-25T20:12:59Z)
This work addresses the problem of temporal action localization with Variance-Aware Networks (VAN).
VANp is a network that propagates the mean and the variance throughout the network to deliver outputs with second-order statistics.
Results show that VANp surpasses the accuracy of virtually all other two-stage networks without involving any additional parameters.
- Unifying supervised learning and VAEs -- coverage, systematics and goodness-of-fit in normalizing-flow based neural network models for astro-particle reconstructions [0.0] (2020-08-13T11:28:57Z)
Statistical uncertainties, coverage, systematic uncertainties, or a goodness-of-fit measure are often not calculated.
We show that a KL-divergence objective of the joint distribution of data and labels allows supervised learning and variational autoencoders to be unified.
We discuss how to calculate coverage probabilities without numerical integration for specific "base-ordered" contours.
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862] (2020-06-26T13:50:19Z)
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.