Unifying supervised learning and VAEs -- coverage, systematics and
goodness-of-fit in normalizing-flow based neural network models for
astro-particle reconstructions
- URL: http://arxiv.org/abs/2008.05825v5
- Date: Sun, 14 Jan 2024 14:45:36 GMT
- Title: Unifying supervised learning and VAEs -- coverage, systematics and
goodness-of-fit in normalizing-flow based neural network models for
astro-particle reconstructions
- Authors: Thorsten Glüsenkamp
- Abstract summary: Statistical uncertainties, coverage, systematic uncertainties or a goodness-of-fit measure are often not calculated.
We show that a KL-divergence objective of the joint distribution of data and labels allows us to unify supervised learning and variational autoencoders.
We discuss how to calculate coverage probabilities without numerical integration for specific "base-ordered" contours.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural-network based predictions of event properties in astro-particle
physics are becoming increasingly common. In many cases, however, the result is
used only as a point prediction: statistical uncertainties, coverage,
systematic uncertainties or a goodness-of-fit measure are often not calculated.
Here we describe a choice of training objective and network architecture that
incorporates all of these properties into a single network model. We show
that a KL-divergence objective over the joint distribution of data and labels
unifies supervised learning and variational autoencoders (VAEs) under
the common umbrella of stochastic variational inference. The unification motivates an
extended supervised learning scheme that allows a goodness-of-fit
p-value to be calculated for the neural network model. Conditional normalizing flows amortized
with a neural network are crucial in this construction. We discuss how to
calculate coverage probabilities without numerical integration for specific
"base-ordered" contours that are unique to normalizing flows. Furthermore, we
show how systematic uncertainties can be included via effective marginalization
during training. The proposed extended supervised training thus incorporates (1)
coverage calculation, (2) systematics and (3) a goodness-of-fit measure in a
single machine-learning model. There are in principle no constraints on the
shape of the involved distributions; in fact, the machinery works with complex
multi-modal distributions defined on product spaces such as $\mathbb{R}^n \times
\mathbb{S}^m$. The coverage calculation, however, requires care in its
interpretation when the distributions are too degenerate. We see great
potential for exploiting this per-event information in event selections or for
fast astronomical alerts which require uncertainty guarantees.
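As a minimal sketch of why a joint-KL objective covers the supervised case (this follows only from the abstract's wording, not necessarily the paper's exact derivation): write the true joint distribution of events $x$ and labels $z$ as $p(x,z)$ and let $q_\theta(z|x)$ be the amortized conditional model, here a conditional normalizing flow. Then

$$ D_{\mathrm{KL}}\big(p(x,z)\,\|\,q_\theta(z|x)\,p(x)\big) = \mathbb{E}_{p(x,z)}\big[\log p(z|x) - \log q_\theta(z|x)\big] = \mathbb{E}_{p(x,z)}\big[-\log q_\theta(z|x)\big] + \mathrm{const}, $$

so minimizing the usual supervised negative log-likelihood is minimizing a KL divergence between joint distributions; factorizing the model joint the other way, roughly as $q_\theta(x|z)\,p(z)$ with an encoder, leads to a VAE-style objective, which is the sense in which both can be treated as stochastic variational inference.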
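The "conditional normalizing flows amortized with a neural network" can likewise be illustrated with a minimal, hypothetical PyTorch sketch (not the author's implementation): a small network maps each event x to the parameters of an invertible transform over the labels z, and the change-of-variables formula gives log q(z|x).

```python
import math
import torch
import torch.nn as nn

class ConditionalAffineFlow(nn.Module):
    """Minimal amortized conditional flow: a network maps the event x to the
    parameters of an invertible affine map, z = mu(x) + exp(log_sigma(x)) * u,
    with a standard-normal base u. A real model would stack more expressive layers."""

    def __init__(self, x_dim: int, z_dim: int, hidden: int = 64):
        super().__init__()
        self.z_dim = z_dim
        self.net = nn.Sequential(
            nn.Linear(x_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * z_dim),   # outputs mu(x) and log_sigma(x)
        )

    def to_base(self, x, z):
        """Map a label z to base space u, conditioned on x; also return log|det d(u)/d(z)|."""
        mu, log_sigma = self.net(x).chunk(2, dim=-1)
        u = (z - mu) * torch.exp(-log_sigma)
        log_det = -log_sigma.sum(dim=-1)
        return u, log_det

    def log_prob(self, x, z):
        """log q(z|x) via the change-of-variables formula with a standard-normal base."""
        u, log_det = self.to_base(x, z)
        log_base = -0.5 * (u ** 2).sum(dim=-1) - 0.5 * self.z_dim * math.log(2 * math.pi)
        return log_base + log_det

# Supervised training on labelled events (x, z) then minimizes E[-log q(z|x)]:
#   loss = -flow.log_prob(x_batch, z_batch).mean()
```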
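The claim of "coverage probabilities without numerical integration" for base-ordered contours can be made concrete under the assumption of a standard-normal base distribution in $\mathbb{R}^n$: contours that are images of base-space spheres carry an analytically known probability mass given by the chi-squared CDF, so no integration over label space is required. The following sketch uses a hypothetical helper `to_base(x, z) -> u` (for example, the `u` returned by the `ConditionalAffineFlow.to_base` sketch) and is only an illustration of the idea, not the paper's procedure.

```python
import numpy as np
from scipy.stats import chi2

def base_ordered_coverage(events, true_labels, to_base, n_dim, levels=(0.5, 0.68, 0.9)):
    """Empirical coverage of base-ordered contours, without numerical integration.

    to_base(x, z) -> u (hypothetical) returns the base-space image of the true
    label z under the flow conditioned on event x. With a standard-normal base
    in n_dim dimensions, the image of the base sphere ||u|| <= r contains
    probability mass chi2.cdf(r**2, df=n_dim) by construction.
    """
    # Mass of the smallest base-ordered contour that contains each true label.
    masses = np.array([
        chi2.cdf(np.sum(np.asarray(to_base(x, z)) ** 2), df=n_dim)
        for x, z in zip(events, true_labels)
    ])
    # For a well-calibrated model, a fraction ~q of events should land inside the q-mass contour.
    return {q: float(np.mean(masses <= q)) for q in levels}
```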
Related papers
- VeriFlow: Modeling Distributions for Neural Network Verification [4.3012765978447565]
Formal verification has emerged as a promising method to ensure the safety and reliability of neural networks.
We propose the VeriFlow architecture as a flow-based density model tailored to allow any verification approach to restrict its search to some data distribution of interest.
arXiv Detail & Related papers (2024-06-20T12:41:39Z)
- Flexible Heteroscedastic Count Regression with Deep Double Poisson Networks [4.58556584533865]
We propose the Deep Double Poisson Network (DDPN) to produce accurate, input-conditional uncertainty representations.
DDPN vastly outperforms existing discrete models.
It can be applied to a variety of count regression datasets.
arXiv Detail & Related papers (2024-06-13T16:02:03Z)
- Approximation with Random Shallow ReLU Networks with Applications to Model Reference Adaptive Control [0.0]
We show that ReLU networks with randomly generated weights and biases achieve an $L_\infty$ error of $O(m^{-1/2})$ with high probability.
We show how the result can be used to get approximations of required accuracy in a model reference adaptive control application.
arXiv Detail & Related papers (2024-03-25T19:39:17Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Random-Set Neural Networks (RS-NN) [4.549947259731147]
We propose a novel Random-Set Neural Network (RS-NN) for classification.
RS-NN predicts belief functions rather than probability vectors over a set of classes.
It encodes the 'epistemic' uncertainty induced in machine learning by limited training sets.
arXiv Detail & Related papers (2023-07-11T20:00:35Z)
- Improved uncertainty quantification for neural networks with Bayesian last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of a neural network with a Bayesian last layer (BLL) which allows for efficient training using backpropagation.
arXiv Detail & Related papers (2023-02-21T20:23:56Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
- NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks [151.03112356092575]
We show the principled way to measure the uncertainty of predictions for a classifier based on Nadaraya-Watson's nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
arXiv Detail & Related papers (2022-02-07T12:30:45Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- General stochastic separation theorems with optimal bounds [68.8204255655161]
The phenomenon of separability was revealed and used in machine learning to correct errors of Artificial Intelligence (AI) systems and to analyze AI instabilities.
Errors or clusters of errors can be separated from the rest of the data.
The ability to correct an AI system also opens up the possibility of an attack on it, and the high dimensionality induces vulnerabilities caused by the same separability.
arXiv Detail & Related papers (2020-10-11T13:12:41Z)