On the relationship between multivariate splines and infinitely-wide
neural networks
- URL: http://arxiv.org/abs/2302.03459v1
- Date: Tue, 7 Feb 2023 13:29:06 GMT
- Title: On the relationship between multivariate splines and infinitely-wide
neural networks
- Authors: Francis Bach (SIERRA)
- Abstract summary: We show that the associated function space is a Sobolev space on a Euclidean ball, with an explicit bound on the norms of derivatives.
This random feature expansion is numerically better behaved than usual random Fourier features, both in theory and practice.
In particular, in one dimension, we compare the leverage scores associated with the two random expansions and show better scaling for the neural network expansion.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider multivariate splines and show that they have a random feature
expansion as infinitely wide neural networks with one hidden layer and a
homogeneous activation function which is a power of the rectified linear
unit. We show that the associated function space is a Sobolev space on a
Euclidean ball, with an explicit bound on the norms of derivatives. This link
provides a new random feature expansion for multivariate splines that allows
efficient algorithms. This random feature expansion is numerically better
behaved than usual random Fourier features, both in theory and practice. In
particular, in dimension one, we compare the leverage scores associated with
the two random expansions and show better scaling for the neural network
expansion.
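Below is a minimal sketch, not the paper's exact construction, of the two random feature expansions compared in the abstract: features built from a homogeneous activation (a power of the rectified linear unit) applied to random directions, and standard random Fourier features for a Gaussian kernel. The sampling distribution of the directions, the exponent alpha, the appended constant coordinate, and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu_power_features(X, m=2048, alpha=2, rng=rng):
    """phi(x) = max(w^T [x, 1], 0)^alpha with w drawn uniformly on the unit sphere.

    Appending a constant coordinate (an illustrative choice, not necessarily
    the paper's convention) lets the features depend on an offset as well as x.
    """
    n, d = X.shape
    X_aug = np.hstack([X, np.ones((n, 1))])          # shape (n, d + 1)
    W = rng.standard_normal((d + 1, m))
    W /= np.linalg.norm(W, axis=0, keepdims=True)    # random unit directions
    return np.maximum(X_aug @ W, 0.0) ** alpha / np.sqrt(m)

def fourier_features(X, m=2048, bandwidth=1.0, rng=rng):
    """Standard random Fourier features for a Gaussian kernel (Rahimi & Recht)."""
    n, d = X.shape
    W = rng.standard_normal((d, m)) / bandwidth
    b = rng.uniform(0.0, 2.0 * np.pi, size=m)
    return np.sqrt(2.0 / m) * np.cos(X @ W + b)

# Approximate kernel matrices as inner products of the random features.
X = rng.uniform(-1.0, 1.0, size=(200, 3))            # sample points in [-1, 1]^3
Phi_nn = relu_power_features(X)
Phi_rf = fourier_features(X)
K_nn = Phi_nn @ Phi_nn.T                             # neural-network random features
K_rf = Phi_rf @ Phi_rf.T                             # random Fourier features
print(K_nn.shape, K_rf.shape)                        # (200, 200) (200, 200)
```

In this sketch the kernel matrices are plain inner products of the feature maps; the abstract's comparison of the two expansions (for instance via leverage scores in dimension one) concerns how many such random features are needed for a given approximation quality.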
Related papers
- Function-Space Optimality of Neural Architectures With Multivariate
Nonlinearities [30.762063524541638]
We prove a representer theorem stating that the solution sets of learning problems posed over Banach spaces are completely characterized by neural architectures with multivariate nonlinearities.
Our results shed light on the regularity of functions learned by neural networks trained on data, and provide new theoretical motivation for several architectural choices found in practice.
arXiv Detail & Related papers (2023-10-05T17:13:16Z)
- Gaussian random field approximation via Stein's method with applications to wide random neural networks [20.554836643156726]
We develop a novel Gaussian smoothing technique that allows us to transfer a bound in a smoother metric to the $W_1$ distance.
We obtain the first bounds on the Gaussian random field approximation of wide random neural networks.
Our bounds are explicitly expressed in terms of the widths of the network and moments of the random weights.
arXiv Detail & Related papers (2023-06-28T15:35:10Z)
- Uniform Convergence of Deep Neural Networks with Lipschitz Continuous Activation Functions and Variable Widths [3.0069322256338906]
We consider deep neural networks with a Lipschitz continuous activation function and with weight matrices of variable widths.
In particular, as convolutional neural networks are special deep neural networks with weight matrices of increasing widths, we put forward conditions on the mask sequence.
The Lipschitz continuity assumption on the activation functions allows us to include in our theory most of the activation functions commonly used in applications.
arXiv Detail & Related papers (2023-06-02T17:07:12Z)
- Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
- Unified Fourier-based Kernel and Nonlinearity Design for Equivariant Networks on Homogeneous Spaces [52.424621227687894]
We introduce a unified framework for group equivariant networks on homogeneous spaces.
We take advantage of the sparsity of Fourier coefficients of the lifted feature fields.
We show that other methods treating features as the Fourier coefficients in the stabilizer subgroup are special cases of our activation.
arXiv Detail & Related papers (2022-06-16T17:59:01Z)
- Graph-adaptive Rectified Linear Unit for Graph Neural Networks [64.92221119723048]
Graph Neural Networks (GNNs) have achieved remarkable success by extending traditional convolution to learning on non-Euclidean data.
We propose Graph-adaptive Rectified Linear Unit (GReLU) which is a new parametric activation function incorporating the neighborhood information in a novel and efficient way.
We conduct comprehensive experiments to show that our plug-and-play GReLU method is efficient and effective given different GNN backbones and various downstream tasks.
arXiv Detail & Related papers (2022-02-13T10:54:59Z)
- Sobolev-type embeddings for neural network approximation spaces [5.863264019032882]
We consider neural network approximation spaces that classify functions according to the rate at which they can be approximated, with error measured in $L^p$.
We prove embedding theorems between these spaces for different values of $p$.
We find that, analogous to the case of classical function spaces, it is possible to trade "smoothness" (i.e., approximation rate) for increased integrability.
arXiv Detail & Related papers (2021-10-28T17:11:38Z)
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
- On the Number of Linear Functions Composing Deep Neural Network: Towards a Refined Definition of Neural Networks Complexity [6.252236971703546]
We introduce an equivalence relation among the linear functions composing a piecewise linear function and then count those linear functions relative to that equivalence relation.
Our new complexity measure can clearly distinguish between the two models, is consistent with the classical measure, and increases exponentially with depth.
arXiv Detail & Related papers (2020-10-23T01:46:12Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
- Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-layer Networks [70.15611146583068]
We develop exact representations of training two-layer neural networks with rectified linear units (ReLUs).
Our theory utilizes semi-infinite duality and minimum norm regularization.
arXiv Detail & Related papers (2020-02-24T21:32:41Z)