Exact Upper and Lower Bounds for the Output Distribution of Neural Networks with Random Inputs
- URL: http://arxiv.org/abs/2502.11672v2
- Date: Tue, 10 Jun 2025 16:47:09 GMT
- Title: Exact Upper and Lower Bounds for the Output Distribution of Neural Networks with Random Inputs
- Authors: Andrey Kofnov, Daniel Kapla, Ezio Bartocci, Efstathia Bura
- Abstract summary: We derive exact bounds for the cumulative distribution function (cdf) of the output of a neural network (NN) over its entire support. Our method applies to any feedforward NN using continuous, monotonic, piecewise twice continuously differentiable activation functions.
- Score: 1.0499611180329804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We derive exact upper and lower bounds for the cumulative distribution function (cdf) of the output of a neural network (NN) over its entire support subject to noisy (stochastic) inputs. The upper and lower bounds converge to the true cdf over its domain as the resolution increases. Our method applies to any feedforward NN using continuous monotonic piecewise twice continuously differentiable activation functions (e.g., ReLU, tanh and softmax) and convolutional NNs, which were beyond the scope of competing approaches. The novelty and instrumental tool of our approach is to bound general NNs with ReLU NNs. The ReLU NN-based bounds are then used to derive the upper and lower bounds of the cdf of the NN output. Experiments demonstrate that our method delivers guaranteed bounds of the predictive output distribution over its support, thus providing exact error guarantees, in contrast to competing approaches.
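As a rough illustration of the sandwiching idea (not the authors' algorithm), the sketch below brackets a smooth activation between piecewise-linear surrogates built from chord interpolation plus a standard interpolation-error bound, and turns the resulting pointwise sandwich on a toy one-hidden-layer network into bounds on the output cdf. The paper instead bounds general NNs with ReLU NNs and derives the cdf bounds exactly; here the bracketing cdfs are only checked by Monte Carlo, and all sizes and weights are made up for illustration.

```python
# Minimal sketch, assuming a toy one-hidden-layer tanh network with Gaussian input.
import numpy as np

rng = np.random.default_rng(0)

# Piecewise-linear surrogates for tanh on [-4, 4]: chord interpolation on a knot
# grid, shifted by the standard interpolation error bound
#   |tanh(x) - chord(x)| <= (h^2 / 8) * max|tanh''|,  with max|tanh''| = 4/(3*sqrt(3)) < 0.77
knots = np.linspace(-4.0, 4.0, 65)
h = knots[1] - knots[0]
eps = (h ** 2 / 8.0) * 0.77

def tanh_lo(x):                      # pointwise lower bound of tanh
    return np.interp(x, knots, np.tanh(knots)) - eps

def tanh_up(x):                      # pointwise upper bound of tanh
    return np.interp(x, knots, np.tanh(knots)) + eps

# Toy network f(x) = w2 . tanh(W1 x + b1) + b2 with a 2-d Gaussian input
# (sizes and weights are arbitrary, chosen only for illustration).
W1 = rng.normal(size=(8, 2))
b1 = rng.normal(size=8)
w2 = rng.normal(size=8)
b2 = 0.3

def f_exact(X):
    return np.tanh(X @ W1.T + b1) @ w2 + b2

def f_surrogate(X, lower=True):
    # Choose the surrogate per hidden unit by the sign of its outgoing weight,
    # so the bound direction survives the (possibly negative) output weights.
    Z = X @ W1.T + b1
    lo, up = tanh_lo(Z), tanh_up(Z)
    pick_lo = (w2 > 0) if lower else (w2 <= 0)
    return np.where(pick_lo, lo, up) @ w2 + b2

X = rng.normal(size=(200_000, 2))
f = f_exact(X)
f_lo, f_up = f_surrogate(X, lower=True), f_surrogate(X, lower=False)
assert np.all(f_lo <= f + 1e-9) and np.all(f <= f_up + 1e-9)   # sandwich holds sample-wise

# The pointwise sandwich f_lo <= f <= f_up implies, for every threshold t,
#   P(f_up <= t) <= P(f <= t) <= P(f_lo <= t),
# so the surrogate cdfs bracket the true output cdf. Here the brackets are only
# estimated by Monte Carlo; the paper computes such bounds exactly for ReLU surrogates.
t = 0.0
print(np.mean(f_up <= t), np.mean(f <= t), np.mean(f_lo <= t))
```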
Related papers
- Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Non-asymptotic approximations of Gaussian neural networks via second-order Poincaré inequalities [6.499759302108927]
We investigate the use of second-order Poincaré inequalities as an alternative approach to establish quantitative central limit theorems (QCLTs) for the NN's output. We show how our approach is effective in establishing QCLTs for the NN's output, though it leads to suboptimal rates of convergence.
arXiv Detail & Related papers (2023-04-08T13:52:10Z) - On the Effective Number of Linear Regions in Shallow Univariate ReLU
Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z) - On Feature Learning in Neural Networks with Global Convergence
Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
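For context (my reading, not a quote from the paper), a "linear rate" under gradient flow means exponential decay of the training loss in continuous time:

```latex
% Linear rate under gradient flow (standard meaning, stated for context):
% the loss satisfies a differential inequality whose solution decays exponentially.
\frac{d}{dt} L(\theta(t)) \le -\lambda \, L(\theta(t)), \qquad \lambda > 0
\quad \Longrightarrow \quad
L(\theta(t)) \le e^{-\lambda t} \, L(\theta(0)).
```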
arXiv Detail & Related papers (2022-04-22T15:56:43Z) - Robust Estimation for Nonparametric Families via Generative Adversarial
Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends these to robust mean estimation, second-moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
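As background for the last point (the classical definition, not the paper's smoothed and generalized variant), the Kolmogorov-Smirnov distance between two distributions P and Q with cdfs $F_P$ and $F_Q$ is:

```latex
% Classical Kolmogorov-Smirnov distance; the paper's GAN losses are described
% as a smoothed and generalized version of this quantity.
D_{\mathrm{KS}}(P, Q) = \sup_{x \in \mathbb{R}} \bigl| F_P(x) - F_Q(x) \bigr|
```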
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - Global convergence of ResNets: From finite to infinite width using
linear parameterization [0.0]
We study Residual Networks (ResNets) in which the residual block has linear parametrization while still being nonlinear.
In this limit, we prove a local Polyak-Lojasiewicz inequality, retrieving the lazy regime.
Our analysis leads to a practical and quantified recipe.
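For reference, a (local) Polyak-Lojasiewicz inequality for a loss $L$ with infimum $L^*$ takes the standard form below; the paper proves such an inequality locally for its linearly parameterized residual blocks.

```latex
% Standard Polyak-Lojasiewicz (PL) inequality with constant mu > 0:
% the squared gradient norm dominates the suboptimality gap.
\frac{1}{2} \bigl\| \nabla L(\theta) \bigr\|^2 \;\ge\; \mu \bigl( L(\theta) - L^* \bigr)
\qquad \text{for all } \theta \text{ in the relevant neighborhood.}
```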
arXiv Detail & Related papers (2021-12-10T13:38:08Z) - Neural Optimization Kernel: Towards Robust Deep Learning [13.147925376013129]
Recent studies show a connection between neural networks (NN) and kernel methods.
This paper proposes a novel kernel family named Neural Optimization Kernel (NOK).
We show that an over-parameterized deep NN (NOK) can increase the expressive power to reduce the empirical risk and reduce the generalization bound at the same time.
arXiv Detail & Related papers (2021-06-11T00:34:55Z) - Infinite-channel deep stable convolutional neural networks [2.7561479348365734]
In this paper, we consider the problem of removing the finite-variance assumption on the weights (assumption A1) in the general context of deep feed-forward convolutional NNs.
We show that the infinite-channel limit of a deep feed-forward convolutional NN, under suitable scaling, is a stochastic process with stable finite-dimensional distributions.
arXiv Detail & Related papers (2021-02-07T08:12:46Z) - Optimal Rates for Averaged Stochastic Gradient Descent under Neural
Tangent Kernel Regime [50.510421854168065]
We show that averaged stochastic gradient descent can achieve the minimax optimal convergence rate.
We show that the target function specified by the NTK of a ReLU network can be learned at the optimal convergence rate.
arXiv Detail & Related papers (2020-06-22T14:31:37Z) - Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable
Neural Distribution Alignment [52.02794488304448]
We propose a new distribution alignment method based on a log-likelihood ratio statistic and normalizing flows.
We experimentally verify that minimizing the resulting objective results in domain alignment that preserves the local structure of input domains.
arXiv Detail & Related papers (2020-03-26T22:10:04Z) - Stable behaviour of infinitely wide deep neural networks [8.000374471991247]
We consider fully connected feed-forward deep neural networks (NNs) where weights and biases are independent and identically distributed.
We show that the infinite-width limit of the NN, under suitable scaling of the weights, is a stochastic process whose finite-dimensional distributions are stable distributions.
arXiv Detail & Related papers (2020-03-01T04:07:30Z) - Almost Sure Convergence of Dropout Algorithms for Neural Networks [0.0]
We investigate the convergence and convergence rate of training algorithms for Neural Networks (NNs) that have been inspired by Dropout (Hinton et al., 2012).
This paper presents a probability-theoretic proof that these Dropout-inspired schemes converge almost surely for fully-connected NNs with differentiable, polynomially bounded activation functions.
arXiv Detail & Related papers (2020-02-06T13:25:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.