Uncertainty propagation in feed-forward neural network models
- URL: http://arxiv.org/abs/2503.21059v2
- Date: Sat, 29 Mar 2025 16:30:59 GMT
- Title: Uncertainty propagation in feed-forward neural network models
- Authors: Jeremy Diamzon, Daniele Venturi
- Abstract summary: We develop new uncertainty propagation methods for feed-forward neural network architectures. We derive analytical expressions for the probability density function (PDF) of the neural network output. A key finding is that an appropriate linearization of the leaky ReLU activation function yields accurate statistical results.
- Score: 3.987067170467799
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop new uncertainty propagation methods for feed-forward neural network architectures with leaky ReLU activation functions subject to random perturbations in the input vectors. In particular, we derive analytical expressions for the probability density function (PDF) of the neural network output and its statistical moments as a function of the input uncertainty and the parameters of the network, i.e., weights and biases. A key finding is that an appropriate linearization of the leaky ReLU activation function yields accurate statistical results even for large perturbations in the input vectors. This can be attributed to the way information propagates through the network. We also propose new analytically tractable Gaussian copula surrogate models to approximate the full joint PDF of the neural network output. To validate our theoretical results, we conduct Monte Carlo simulations and a thorough error analysis on a multi-layer neural network representing a nonlinear integro-differential operator between two polynomial function spaces. Our findings demonstrate excellent agreement between the theoretical predictions and Monte Carlo simulations.
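A minimal NumPy sketch of the linearization idea described in the abstract: propagate the input mean and covariance through each affine layer exactly, linearize the leaky ReLU around the pre-activation mean, and compare against a Monte Carlo estimate. The layer sizes, weights, biases, and input statistics (sizes, Ws, bs, x0, Sigma0) are illustrative placeholders, not the network or the exact derivation used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def leaky_relu(x, alpha=0.1):
    return np.where(x > 0, x, alpha * x)

def leaky_relu_slope(x, alpha=0.1):
    # Derivative of the leaky ReLU, used to linearize around the mean.
    return np.where(x > 0, 1.0, alpha)

# Illustrative small network (placeholder weights and biases).
sizes = [4, 8, 8, 3]
Ws = [rng.normal(scale=1.0 / np.sqrt(m), size=(n, m)) for m, n in zip(sizes[:-1], sizes[1:])]
bs = [rng.normal(scale=0.1, size=n) for n in sizes[1:]]

# Input uncertainty: Gaussian perturbation around a nominal input x0.
x0 = rng.normal(size=sizes[0])
Sigma0 = 0.2**2 * np.eye(sizes[0])

# Linearized (first-order) propagation of mean and covariance.
mu, Sigma = x0.copy(), Sigma0.copy()
for W, b in zip(Ws, bs):
    z = W @ mu + b                    # pre-activation mean (affine map is exact)
    Sigma = W @ Sigma @ W.T           # exact covariance through the affine map
    D = np.diag(leaky_relu_slope(z))  # local slope of the leaky ReLU at the mean
    mu = leaky_relu(z)
    Sigma = D @ Sigma @ D.T

# Monte Carlo reference.
X = rng.multivariate_normal(x0, Sigma0, size=100_000)
Y = X
for W, b in zip(Ws, bs):
    Y = leaky_relu(Y @ W.T + b)

print("linearized mean :", mu)
print("Monte Carlo mean:", Y.mean(axis=0))
print("linearized var  :", np.diag(Sigma))
print("Monte Carlo var :", np.diag(np.cov(Y, rowvar=False)))
```

For the moderate perturbation chosen here the two estimates agree closely; the paper's analysis addresses why a linearization of this kind remains accurate even for large input perturbations.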
Related papers
- Deep learning with missing data [3.829599191332801]
We propose Pattern Embedded Neural Networks (PENNs), which can be applied in conjunction with any existing imputation technique.
In addition to a neural network trained on the imputed data, PENNs pass the vectors of observation indicators through a second neural network to provide a compact representation.
The outputs are then combined in a third neural network to produce final predictions.
arXiv Detail & Related papers (2025-04-21T18:57:36Z) - An Analytic Solution to Covariance Propagation in Neural Networks [10.013553984400488]
This paper presents a sample-free moment propagation technique to accurately characterize the input-output distributions of neural networks.
A key enabler of our technique is an analytic solution for the covariance of random variables passed through nonlinear activation functions (a single-neuron sketch of this idea appears after this list).
The wide applicability and merits of the proposed technique are shown in experiments analyzing the input-output distributions of trained neural networks and training Bayesian neural networks.
arXiv Detail & Related papers (2024-03-24T14:08:24Z) - Instance-wise Linearization of Neural Network for Model Interpretation [13.583425552511704]
The challenge lies in the non-linear behavior of the neural network.
For a neural network model, the non-linear behavior is often caused by the non-linear activation units of the model.
We propose an instance-wise linearization approach that reformulates the forward computation process of a neural network prediction.
arXiv Detail & Related papers (2023-10-25T02:07:39Z) - Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Conventional wisdom holds that neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z) - Learning Theory of Distribution Regression with Neural Networks [6.961253535504979]
We establish an approximation theory and a learning theory of distribution regression via a fully connected neural network (FNN).
In contrast to the classical regression methods, the input variables of distribution regression are probability measures.
arXiv Detail & Related papers (2023-07-07T09:49:11Z) - Probabilistic Verification of ReLU Neural Networks via Characteristic Functions [11.489187712465325]
We use ideas from probability theory in the frequency domain to provide probabilistic verification guarantees for ReLU neural networks.
We interpret a (deep) feedforward neural network as a discrete dynamical system over a finite horizon.
We obtain the corresponding cumulative distribution function of the output set, which can be used to check if the network is performing as expected.
arXiv Detail & Related papers (2022-12-03T05:53:57Z) - Variational Neural Networks [88.24021148516319]
We propose a method for uncertainty estimation in neural networks called Variational Neural Network (VNN).
VNN generates parameters for the output distribution of a layer by transforming its inputs with learnable sub-layers.
In uncertainty quality estimation experiments, we show that VNNs achieve better uncertainty quality than Monte Carlo Dropout or Bayes By Backpropagation methods.
arXiv Detail & Related papers (2022-07-04T15:41:02Z) - Cardinality-Minimal Explanations for Monotonic Neural Networks [25.212444848632515]
In this paper, we investigate whether tractability can be regained by focusing on neural models implementing a monotonic function.
Although the relevant decision problems remain intractable, we show that they become solvable in favourable time.
arXiv Detail & Related papers (2022-05-19T23:47:25Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - PAC-Bayesian Learning of Aggregated Binary Activated Neural Networks with Probabilities over Representations [2.047424180164312]
We study the expectation of a probabilistic neural network as a predictor by itself, focusing on the aggregation of binary activated neural networks with normal distributions over real-valued weights.
We show that the exact computation remains tractable for deep but narrow neural networks, thanks to a dynamic programming approach.
arXiv Detail & Related papers (2021-10-28T14:11:07Z) - How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z) - Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Network Diffusions via Neural Mean-Field Dynamics [52.091487866968286]
We propose a novel learning framework for inference and estimation problems of diffusion on networks.
Our framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution of the node infection probabilities.
Our approach is versatile and robust to variations of the underlying diffusion network models.
arXiv Detail & Related papers (2020-06-16T18:45:20Z)
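The covariance-propagation entry above ("An Analytic Solution to Covariance Propagation in Neural Networks") describes sample-free moment propagation through nonlinear activations. As a minimal single-neuron illustration of that idea, the mean and variance of leaky_relu(z) for a Gaussian pre-activation z ~ N(m, s^2) have standard closed forms; the parameters below are assumed for illustration, and this is not the cited paper's full covariance solution.

```python
import math
import numpy as np

def leaky_relu_moments(m, s, alpha=0.1):
    """Exact mean and variance of leaky_relu(z) for z ~ N(m, s^2).

    Uses the standard Gaussian closed forms
      E[max(z,0)]   = m*Phi(m/s) + s*phi(m/s)
      E[max(z,0)^2] = (m^2 + s^2)*Phi(m/s) + m*s*phi(m/s)
    together with leaky_relu(z) = (1 - alpha)*max(z, 0) + alpha*z.
    """
    t = m / s
    Phi = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))          # standard normal CDF
    phi = math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)   # standard normal PDF
    e_relu = m * Phi + s * phi                     # E[max(z,0)]
    e_relu2 = (m * m + s * s) * Phi + m * s * phi  # E[max(z,0)^2]
    mean = (1.0 - alpha) * e_relu + alpha * m
    # E[leaky^2] = (1 - alpha^2)*E[max^2] + alpha^2*E[z^2], since max(z,0)*z = max(z,0)^2.
    second = (1.0 - alpha**2) * e_relu2 + alpha**2 * (m * m + s * s)
    return mean, second - mean * mean

# Monte Carlo check of the closed form (illustrative parameters only).
rng = np.random.default_rng(1)
m, s, alpha = 0.3, 0.8, 0.1
z = rng.normal(m, s, size=1_000_000)
lz = np.where(z > 0, z, alpha * z)
print("analytic   :", leaky_relu_moments(m, s, alpha))
print("Monte Carlo:", (lz.mean(), lz.var()))
```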