Gaussian random field approximation via Stein's method with applications to wide random neural networks
- URL: http://arxiv.org/abs/2306.16308v2
- Date: Wed, 1 May 2024 01:39:18 GMT
- Title: Gaussian random field approximation via Stein's method with applications to wide random neural networks
- Authors: Krishnakumar Balasubramanian, Larry Goldstein, Nathan Ross, Adil Salim
- Abstract summary: We develop a novel Gaussian smoothing technique that allows us to transfer a bound in a smoother metric to the $W_1$ distance.
We obtain the first bounds on the Gaussian random field approximation of wide random neural networks.
Our bounds are explicitly expressed in terms of the widths of the network and moments of the random weights.
- Score: 20.554836643156726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We derive upper bounds on the Wasserstein distance ($W_1$), with respect to the $\sup$-norm, between any continuous $\mathbb{R}^d$-valued random field indexed by the $n$-sphere and the Gaussian, based on Stein's method. We develop a novel Gaussian smoothing technique that allows us to transfer a bound in a smoother metric to the $W_1$ distance. The smoothing is based on covariance functions constructed using powers of Laplacian operators, designed so that the associated Gaussian process has a tractable Cameron-Martin or Reproducing Kernel Hilbert Space. This feature enables us to move beyond one-dimensional interval-based index sets that were previously considered in the literature. Specializing our general result, we obtain the first bounds on the Gaussian random field approximation of wide random neural networks of any depth with Lipschitz activation functions, at the random field level. Our bounds are explicitly expressed in terms of the widths of the network and moments of the random weights. We also obtain tighter bounds when the activation function has three bounded derivatives.
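For intuition about the limit being quantified, the following sketch, under assumed settings (a one-hidden-layer ReLU network with i.i.d. standard Gaussian weights, illustrative width and sample counts, the circle as index set), compares the empirical covariance of the random field to the limiting Gaussian-field covariance, which for this architecture is the standard arc-cosine kernel of Cho and Saul. It illustrates the approximation empirically; it is not the paper's Stein's-method argument.

```python
import numpy as np

rng = np.random.default_rng(0)

# Index set: points on the 1-sphere (the circle), as in the paper's setting.
thetas = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
X = np.stack([np.cos(thetas), np.sin(thetas)], axis=1)  # (50, 2)

def random_relu_field(width):
    """One draw of f(x) = width^{-1/2} * sum_j v_j * relu(w_j . x)."""
    W = rng.normal(size=(width, X.shape[1]))
    v = rng.normal(size=width)
    return (np.maximum(X @ W.T, 0.0) @ v) / np.sqrt(width)

# Empirical covariance over many independent networks.
samples = np.stack([random_relu_field(width=1000) for _ in range(1000)])
emp_cov = np.cov(samples, rowvar=False)

def arccos_kernel(x, y):
    """E[relu(w.x) * relu(w.y)] for w ~ N(0, I): the limiting covariance."""
    c = np.clip(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)), -1.0, 1.0)
    t = np.arccos(c)
    return (np.linalg.norm(x) * np.linalg.norm(y)
            * (np.sin(t) + (np.pi - t) * np.cos(t)) / (2.0 * np.pi))

K = np.array([[arccos_kernel(x, y) for y in X] for x in X])
# Small for large width and many draws; the paper bounds the W_1 distance
# between the two fields, not just their covariances.
print("max |empirical cov - limiting kernel|:", np.abs(emp_cov - K).max())
```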
Related papers
- Random ReLU Neural Networks as Non-Gaussian Processes [20.607307985674428]
We show that random neural networks with rectified linear unit activation functions are well-defined non-Gaussian processes.
As a by-product, we demonstrate that these networks are solutions to differential equations driven by impulsive white noise.
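A toy numerical check, assuming an i.i.d.-Gaussian-weight, $1/\sqrt{\text{width}}$-scaled shallow ReLU network (the paper's non-Gaussian limits arise from a different construction, so this is only a loose illustration): at small width the pointwise output has clearly non-zero excess kurtosis, so it is not Gaussian.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(1)

def sample_outputs(width, n_draws=50_000, x=np.array([1.0, 0.5])):
    """Evaluate n_draws independent width-`width` ReLU networks at a fixed x."""
    W = rng.normal(size=(n_draws, width, x.size))
    v = rng.normal(size=(n_draws, width))
    return (np.maximum(W @ x, 0.0) * v).sum(axis=1) / np.sqrt(width)

for width in (1, 4, 64):
    # Excess kurtosis is 0 for a Gaussian; it is clearly positive at
    # small width and shrinks as the width grows.
    print(width, kurtosis(sample_outputs(width)))
```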
arXiv Detail & Related papers (2024-05-16T16:28:11Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Forward-backward Gaussian variational inference via JKO in the
Bures-Wasserstein Space [19.19325201882727]
Variational inference (VI) seeks to approximate a target distribution $\pi$ by an element of a tractable family of distributions.
We develop the Forward-Backward Gaussian Variational Inference (FB-GVI) algorithm to solve Gaussian VI.
For our proposed algorithm, we obtain state-of-the-art convergence guarantees when $\pi$ is log-smooth and log-concave.
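For intuition, a minimal sketch of first-order Gaussian VI in the Bures-Wasserstein geometry, specialized to a Gaussian target so that all expectations are closed-form. This is plain Bures-Wasserstein gradient descent, not the authors' FB-GVI, whose distinguishing feature is a forward-backward (JKO/proximal) splitting that also covers nonsmooth potentials; the target, step size, and iteration count are illustrative.

```python
import numpy as np

# Target pi = N(mu, A^{-1}), i.e. potential V(x) = 0.5 (x - mu)^T A (x - mu).
A = np.array([[2.0, 0.5], [0.5, 1.0]])
mu = np.array([1.0, -2.0])

m, S = np.zeros(2), np.eye(2)  # variational Gaussian q = N(m, S)
h = 0.1                        # step size

for _ in range(200):
    # Bures-Wasserstein gradient of KL(q || pi) for a quadratic V:
    # mean part E_q[grad V] = A (m - mu); covariance part A - S^{-1}.
    m = m - h * A @ (m - mu)
    M = A - np.linalg.inv(S)
    S = (np.eye(2) - h * M) @ S @ (np.eye(2) - h * M)

print("m ->", m, "   target:", mu)
print("S ->", S, "  target:", np.linalg.inv(A))
```

For a log-smooth, log-concave Gaussian target the iterates converge to the exact mean and covariance, matching the flavor of the guarantees summarized above.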
arXiv Detail & Related papers (2023-04-10T19:49:50Z) - Neural Inference of Gaussian Processes for Time Series Data of Quasars [72.79083473275742]
We introduce a new model that can describe quasar spectra completely.
We also introduce a new method of inference of Gaussian process parameters, which we call $\textit{Neural Inference}$.
The combination of both the CDRW model and Neural Inference significantly outperforms the baseline DRW and MLE.
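The DRW baseline mentioned above is a Gaussian process with an Ornstein-Uhlenbeck (exponential) covariance. A minimal sketch with illustrative hyperparameters ($\tau$, $\sigma^2$, the observation times, and the jitter are assumptions; the CDRW model and Neural Inference are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(2)

def drw_kernel(t, tau=100.0, sigma2=0.04):
    """Damped-random-walk (Ornstein-Uhlenbeck) covariance k(t, t')."""
    return sigma2 * np.exp(-np.abs(t[:, None] - t[None, :]) / tau)

# Irregularly sampled observation times (days), as in quasar monitoring.
t = np.sort(rng.uniform(0.0, 1000.0, size=120))
K = drw_kernel(t) + 1e-8 * np.eye(t.size)  # jitter for numerical stability

# Draw a synthetic light curve and score it under the GP marginal likelihood,
# the quantity an MLE fit of (tau, sigma2) would maximize.
y = np.linalg.cholesky(K) @ rng.normal(size=t.size)
alpha = np.linalg.solve(K, y)
_, logdet = np.linalg.slogdet(K)
loglik = -0.5 * (y @ alpha + logdet + t.size * np.log(2.0 * np.pi))
print("GP log-likelihood of the simulated light curve:", loglik)
```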
arXiv Detail & Related papers (2022-11-17T13:01:26Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound of $\Omega(\sqrt{n/p})$ on the robustness of interpolating neural networks with $p$ parameters on arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n=\mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=\exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - Gaussian Processes and Statistical Decision-making in Non-Euclidean
Spaces [96.53463532832939]
We develop techniques for broadening the applicability of Gaussian processes.
We introduce a wide class of efficient approximations built from this viewpoint.
We develop a collection of Gaussian process models over non-Euclidean spaces.
arXiv Detail & Related papers (2022-02-22T01:42:57Z) - Large-width functional asymptotics for deep Gaussian neural networks [2.7561479348365734]
We consider fully connected feed-forward deep neural networks where weights and biases are independent and identically distributed according to Gaussian distributions.
Our results contribute to recent theoretical studies on the interplay between infinitely wide deep neural networks and Gaussian processes.
arXiv Detail & Related papers (2021-02-20T10:14:37Z) - Infinitely Wide Tensor Networks as Gaussian Process [1.7894377200944511]
In this paper, we show the equivalence of infinitely wide tensor networks and Gaussian processes.
We implement the Gaussian process corresponding to the infinite-width limit of the tensor networks and plot the sample paths of these models.
arXiv Detail & Related papers (2021-01-07T02:29:15Z) - Nonparametric approximation of conditional expectation operators [0.3655021726150368]
We investigate the approximation of the $L^2$-operator defined by $[Pf](x) := \mathbb{E}[f(Y) \mid X = x]$ under minimal assumptions.
We prove that $P$ can be arbitrarily well approximated in operator norm by Hilbert-Schmidt operators acting on a reproducing kernel Hilbert space.
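For intuition about the object $[Pf](x)$, a minimal sketch that estimates it with a standard kernel ridge regression of $f(Y)$ on $X$ (an empirical, regularized operator acting on an RKHS). The data-generating process, kernel bandwidth, and regularization are illustrative assumptions, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic pair with Y = sin(X) + noise, so E[f(Y) | X = x] = sin(x)
# for f = identity, making the estimate easy to check.
n = 500
X = rng.uniform(-3.0, 3.0, size=n)
Y = np.sin(X) + 0.1 * rng.normal(size=n)
f = lambda y: y

def rbf(a, b, ell=0.5):
    """Gaussian RBF kernel matrix between 1-d sample vectors a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

# Empirical regularized conditional-expectation operator:
# [P_hat f](x) = k(x, X) (K + n * lam * I)^{-1} f(Y).
lam = 1e-3
weights = np.linalg.solve(rbf(X, X) + n * lam * np.eye(n), f(Y))

xs = np.linspace(-3.0, 3.0, 7)
print(np.c_[xs, rbf(xs, X) @ weights, np.sin(xs)])  # estimate vs. truth
```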
arXiv Detail & Related papers (2020-12-23T19:06:12Z) - A diffusion approach to Stein's method on Riemannian manifolds [65.36007959755302]
We exploit the relationship between the generator of a diffusion on $\mathbf{M}$ with target invariant measure and its characterising Stein operator.
We derive Stein factors, which bound the solution to the Stein equation and its derivatives.
This implies that the bounds for $\mathbb{R}^m$ remain valid when $\mathbf{M}$ is a flat manifold.
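A Euclidean sketch of the generator/Stein-operator connection, assuming the classical one-dimensional case rather than the paper's manifold setting: the Ornstein-Uhlenbeck generator $(\mathcal{A}f)(x) = f''(x) - x f'(x)$ has the standard Gaussian as invariant measure, so $\mathbb{E}[(\mathcal{A}f)(Z)] = 0$ characterises $Z \sim N(0,1)$.

```python
import numpy as np

rng = np.random.default_rng(4)

# Ornstein-Uhlenbeck generator applied to the test function f(x) = x^3:
# (A f)(x) = f''(x) - x f'(x) = 6x - 3x^3.
A = lambda x: 6.0 * x - 3.0 * x**3

Z = rng.normal(size=1_000_000)              # standard Gaussian
S = rng.exponential(size=1_000_000) - 1.0   # mean 0, variance 1, but skewed
print("Gaussian:   ", A(Z).mean())  # approximately 0
print("Exponential:", A(S).mean())  # approximately -6: detects non-Gaussianity
```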
arXiv Detail & Related papers (2020-03-25T17:03:58Z) - Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
arXiv Detail & Related papers (2020-02-10T23:23:04Z)