Unifying supervised learning and VAEs -- coverage, systematics and
goodness-of-fit in normalizing-flow based neural network models for
astro-particle reconstructions
- URL: http://arxiv.org/abs/2008.05825v5
- Date: Sun, 14 Jan 2024 14:45:36 GMT
- Title: Unifying supervised learning and VAEs -- coverage, systematics and
goodness-of-fit in normalizing-flow based neural network models for
astro-particle reconstructions
- Authors: Thorsten Glüsenkamp
- Abstract summary: Statistical uncertainties, coverage, systematic uncertainties or a goodness-of-fit measure are often not calculated.
We show that a KL-divergence objective of the joint distribution of data and labels allows us to unify supervised learning and variational autoencoders.
We discuss how to calculate coverage probabilities without numerical integration for specific "base-ordered" contours.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural-network based predictions of event properties in astro-particle
physics are becoming increasingly common. In many cases, however, the result is
used only as a point prediction: statistical uncertainties, coverage,
systematic uncertainties or a goodness-of-fit measure are often not calculated.
Here we describe a choice of training objective and network architecture that
incorporates all of these properties into a single network model. We show
that a KL-divergence objective over the joint distribution of data and labels
unifies supervised learning and variational autoencoders (VAEs) under
the common umbrella of stochastic variational inference. The unification motivates an
extended supervised learning scheme that allows a goodness-of-fit
p-value to be calculated for the neural network model. Conditional normalizing flows amortized
with a neural network are crucial in this construction. We discuss how to
calculate coverage probabilities without numerical integration for specific
"base-ordered" contours that are unique to normalizing flows. Furthermore, we
show how systematic uncertainties can be included via effective marginalization
during training. The proposed extended supervised training thus incorporates (1)
coverage calculation, (2) systematics and (3) a goodness-of-fit measure in a
single machine-learning model. There are in principle no constraints on the
shape of the involved distributions; in fact, the machinery works with complex
multi-modal distributions defined on product spaces such as $\mathbb{R}^n \times
\mathbb{S}^m$. The coverage calculation, however, requires care in its
interpretation when the distributions are too degenerate. We see great
potential for exploiting this per-event information in event selections or for
fast astronomical alerts which require uncertainty guarantees.
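As a minimal sketch of why a joint-KL objective covers the supervised case (this follows only from the abstract's wording, not necessarily the paper's exact derivation): write the true joint distribution of events $x$ and labels $z$ as $p(x,z)$ and let $q_\theta(z|x)$ be the amortized conditional model, here a conditional normalizing flow. Then

$$ D_{\mathrm{KL}}\big(p(x,z)\,\|\,q_\theta(z|x)\,p(x)\big) = \mathbb{E}_{p(x,z)}\big[\log p(z|x) - \log q_\theta(z|x)\big] = \mathbb{E}_{p(x,z)}\big[-\log q_\theta(z|x)\big] + \mathrm{const}, $$

so minimizing the usual supervised negative log-likelihood is minimizing a KL divergence between joint distributions; factorizing the model joint the other way, roughly as $q_\theta(x|z)\,p(z)$ with an encoder, leads to a VAE-style objective, which is the sense in which both can be treated as stochastic variational inference.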
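The "conditional normalizing flows amortized with a neural network" can likewise be illustrated with a minimal, hypothetical PyTorch sketch (not the author's implementation): a small network maps each event x to the parameters of an invertible transform over the labels z, and the change-of-variables formula gives log q(z|x).

```python
import math
import torch
import torch.nn as nn

class ConditionalAffineFlow(nn.Module):
    """Minimal amortized conditional flow: a network maps the event x to the
    parameters of an invertible affine map, z = mu(x) + exp(log_sigma(x)) * u,
    with a standard-normal base u. A real model would stack more expressive layers."""

    def __init__(self, x_dim: int, z_dim: int, hidden: int = 64):
        super().__init__()
        self.z_dim = z_dim
        self.net = nn.Sequential(
            nn.Linear(x_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * z_dim),   # outputs mu(x) and log_sigma(x)
        )

    def to_base(self, x, z):
        """Map a label z to base space u, conditioned on x; also return log|det d(u)/d(z)|."""
        mu, log_sigma = self.net(x).chunk(2, dim=-1)
        u = (z - mu) * torch.exp(-log_sigma)
        log_det = -log_sigma.sum(dim=-1)
        return u, log_det

    def log_prob(self, x, z):
        """log q(z|x) via the change-of-variables formula with a standard-normal base."""
        u, log_det = self.to_base(x, z)
        log_base = -0.5 * (u ** 2).sum(dim=-1) - 0.5 * self.z_dim * math.log(2 * math.pi)
        return log_base + log_det

# Supervised training on labelled events (x, z) then minimizes E[-log q(z|x)]:
#   loss = -flow.log_prob(x_batch, z_batch).mean()
```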
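The claim of "coverage probabilities without numerical integration" for base-ordered contours can be made concrete under the assumption of a standard-normal base distribution in $\mathbb{R}^n$: contours that are images of base-space spheres carry an analytically known probability mass given by the chi-squared CDF, so no integration over label space is required. The following sketch uses a hypothetical helper `to_base(x, z) -> u` (for example, the `u` returned by the `ConditionalAffineFlow.to_base` sketch) and is only an illustration of the idea, not the paper's procedure.

```python
import numpy as np
from scipy.stats import chi2

def base_ordered_coverage(events, true_labels, to_base, n_dim, levels=(0.5, 0.68, 0.9)):
    """Empirical coverage of base-ordered contours, without numerical integration.

    to_base(x, z) -> u (hypothetical) returns the base-space image of the true
    label z under the flow conditioned on event x. With a standard-normal base
    in n_dim dimensions, the image of the base sphere ||u|| <= r contains
    probability mass chi2.cdf(r**2, df=n_dim) by construction.
    """
    # Mass of the smallest base-ordered contour that contains each true label.
    masses = np.array([
        chi2.cdf(np.sum(np.asarray(to_base(x, z)) ** 2), df=n_dim)
        for x, z in zip(events, true_labels)
    ])
    # For a well-calibrated model, a fraction ~q of events should land inside the q-mass contour.
    return {q: float(np.mean(masses <= q)) for q in levels}
```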
Related papers
- VeriFlow: Modeling Distributions for Neural Network Verification [4.3012765978447565]
Formal verification has emerged as a promising method to ensure the safety and reliability of neural networks.
We propose the VeriFlow architecture as a flow-based density model tailored to allow any verification approach to restrict its search to some data distribution of interest.
arXiv Detail & Related papers (2024-06-20T12:41:39Z)
- Flexible Heteroscedastic Count Regression with Deep Double Poisson Networks [4.58556584533865]
We propose the Deep Double Poisson Network (DDPN) to produce accurate, input-conditional uncertainty representations.
DDPN vastly outperforms existing discrete models.
It can be applied to a variety of count regression datasets.
arXiv Detail & Related papers (2024-06-13T16:02:03Z)
- Approximation with Random Shallow ReLU Networks with Applications to Model Reference Adaptive Control [0.0]
We show that ReLU networks with randomly generated weights and biases achieve an $L_\infty$ error of $O(m^{-1/2})$ with high probability.
We show how the result can be used to get approximations of required accuracy in a model reference adaptive control application.
arXiv Detail & Related papers (2024-03-25T19:39:17Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Random-Set Neural Networks (RS-NN) [4.549947259731147]
We propose a novel Random-Set Neural Network (RS-NN) for classification.
RS-NN predicts belief functions rather than probability vectors over a set of classes.
It encodes the 'epistemic' uncertainty induced in machine learning by limited training sets.
arXiv Detail & Related papers (2023-07-11T20:00:35Z)
- Improved uncertainty quantification for neural networks with Bayesian last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of a neural network with a Bayesian last layer (BLL) which allows for efficient training using backpropagation.
arXiv Detail & Related papers (2023-02-21T20:23:56Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
- NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks [151.03112356092575]
We show the principled way to measure the uncertainty of predictions for a classifier based on Nadaraya-Watson's nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
arXiv Detail & Related papers (2022-02-07T12:30:45Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- General stochastic separation theorems with optimal bounds [68.8204255655161]
The phenomenon of separability was revealed and used in machine learning to correct errors of Artificial Intelligence (AI) systems and to analyze AI instabilities.
Errors or clusters of errors can be separated from the rest of the data.
The ability to correct an AI system also opens up the possibility of an attack on it, and the high dimensionality induces vulnerabilities caused by the same separability.
arXiv Detail & Related papers (2020-10-11T13:12:41Z)