Banach Space Representer Theorems for Neural Networks and Ridge Splines
- URL: http://arxiv.org/abs/2006.05626v3
- Date: Thu, 11 Feb 2021 19:38:46 GMT
- Title: Banach Space Representer Theorems for Neural Networks and Ridge Splines
- Authors: Rahul Parhi and Robert D. Nowak
- Abstract summary: We develop a variational framework to understand the properties of the functions learned by neural networks fit to data.
We derive a representer theorem showing that finite-width, single-hidden layer neural networks are solutions to inverse problems.
- Score: 17.12783792226575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a variational framework to understand the properties of the
functions learned by neural networks fit to data. We propose and study a family
of continuous-domain linear inverse problems with total variation-like
regularization in the Radon domain subject to data fitting constraints. We
derive a representer theorem showing that finite-width, single-hidden layer
neural networks are solutions to these inverse problems. We draw on many
techniques from variational spline theory and so we propose the notion of
polynomial ridge splines, which correspond to single-hidden layer neural
networks with truncated power functions as the activation function. The
representer theorem is reminiscent of the classical reproducing kernel Hilbert
space representer theorem, but we show that the neural network problem is posed
over a non-Hilbertian Banach space. While the learning problems are posed in
the continuous-domain, similar to kernel methods, the problems can be recast as
finite-dimensional neural network training problems. These neural network
training problems have regularizers which are related to the well-known weight
decay and path-norm regularizers. Thus, our result gives insight into
functional characteristics of trained neural networks and also into the design of
neural network regularizers. We also show that these regularizers promote
neural network solutions with desirable generalization properties.
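The practical upshot of the abstract is the recast finite-dimensional training problem. As a rough illustration (not the authors' code; all names, data, and hyperparameters below are illustrative assumptions), the NumPy sketch below evaluates a single-hidden-layer ReLU network f(x) = sum_k v_k ReLU(w_k^T x - b_k) under a regularized least-squares objective, comparing a path-norm-style penalty sum_k |v_k| ||w_k||_2 with the familiar weight-decay penalty (1/2) sum_k (v_k^2 + ||w_k||_2^2); a standard rescaling argument shows the two penalties yield the same infima for positively homogeneous activations such as the ReLU.

```python
# A minimal sketch (not the authors' code) of the finite-width training problem
# described in the abstract: a single-hidden-layer ReLU network fit by
# regularized least squares, with either a path-norm-style penalty or the
# standard weight-decay penalty. All names and hyperparameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def forward(X, W, b, v):
    """Single-hidden-layer ReLU network: X is (n, d), W is (K, d), b and v are (K,)."""
    hidden = np.maximum(X @ W.T - b, 0.0)  # (n, K) ReLU features
    return hidden @ v                      # (n,) predictions

def path_norm(W, v):
    """Path-norm-style penalty: sum_k |v_k| * ||w_k||_2."""
    return np.sum(np.abs(v) * np.linalg.norm(W, axis=1))

def weight_decay(W, v):
    """Weight-decay penalty on inner and outer weights (biases unpenalized)."""
    return 0.5 * (np.sum(v**2) + np.sum(W**2))

def objective(X, y, W, b, v, lam=1e-2, penalty=path_norm):
    """Regularized least-squares data fit for a width-K network."""
    residual = forward(X, W, b, v) - y
    return 0.5 * np.mean(residual**2) + lam * penalty(W, v)

# Toy usage: random data and a randomly initialized width-K network.
n, d, K = 64, 3, 16
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0])                        # arbitrary scalar target
W = rng.standard_normal((K, d))
b = rng.standard_normal(K)
v = rng.standard_normal(K)

print("path-norm objective:   ", objective(X, y, W, b, v, penalty=path_norm))
print("weight-decay objective:", objective(X, y, W, b, v, penalty=weight_decay))
```

The paper's representer theorem is what licenses this reduction: solutions of the continuous-domain inverse problem can be taken to be finite-width networks of this form, so the infinite-dimensional problem and a training problem like the one above share solutions under the paper's conditions on the width and regularizer.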
Related papers
- Novel Kernel Models and Exact Representor Theory for Neural Networks Beyond the Over-Parameterized Regime [52.00917519626559]
This paper presents two models of neural networks and their training, applicable to neural networks of arbitrary width, depth and topology.
We also present an exact novel representor theory for layer-wise neural network training with unregularized gradient descent in terms of a local-extrinsic neural kernel (LeNK).
This representor theory gives insight into the role of higher-order statistics in neural network training and the effect of kernel evolution in neural-network kernel models.
arXiv Detail & Related papers (2024-05-24T06:30:36Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - Neural reproducing kernel Banach spaces and representer theorems for
deep networks [16.279502878600184]
We show that deep neural networks define suitable reproducing kernel Banach spaces.
We derive representer theorems that justify the finite architectures commonly employed in applications.
arXiv Detail & Related papers (2024-03-13T17:51:02Z) - Riemannian Residual Neural Networks [58.925132597945634]
We show how to extend the residual neural network (ResNet) to general Riemannian manifolds.
ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
arXiv Detail & Related papers (2023-10-16T02:12:32Z) - Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks (a minimal numerical check of this weight-space symmetry is sketched after this list).
arXiv Detail & Related papers (2023-02-27T18:52:38Z) - Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z) - Consistency of Neural Networks with Regularization [0.0]
This paper proposes a general framework of neural networks with regularization and proves its consistency.
Two types of activation functions are considered: the hyperbolic tangent (Tanh) and the rectified linear unit (ReLU).
arXiv Detail & Related papers (2022-06-22T23:33:39Z) - Optimal Approximation with Sparse Neural Networks and Applications [0.0]
We use deep sparsely connected neural networks to measure the complexity of a function class in $L^2(\mathbb{R}^d)$.
We also introduce a representation system, a countable collection of functions, to guide neural networks.
We then analyse the complexity of a class of $\beta$ cartoon-like functions using rate-distortion theory and wedgelet constructions.
arXiv Detail & Related papers (2021-08-14T05:14:13Z) - What Kinds of Functions do Deep Neural Networks Learn? Insights from
Variational Spline Theory [19.216784367141972]
We develop a variational framework to understand the properties of functions learned by deep neural networks with ReLU activation functions fit to data.
We derive a representer theorem showing that deep ReLU networks are solutions to regularized data fitting problems in this function space.
arXiv Detail & Related papers (2021-05-07T16:18:22Z) - Bidirectionally Self-Normalizing Neural Networks [46.20979546004718]
We provide a rigorous result that shows, under mild conditions, how the vanishing/exploding gradients problem disappears with high probability if the neural networks have sufficient width.
Our main idea is to constrain both forward and backward signal propagation in a nonlinear neural network through a new class of activation functions.
arXiv Detail & Related papers (2020-06-22T12:07:29Z)
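The permutation symmetry invoked in the "Permutation Equivariant Neural Functionals" entry above is easy to check directly. The NumPy sketch below (an illustration under assumed names, not code from any listed paper) permutes the hidden units of a small two-layer ReLU MLP, reordering the rows of W1 and b1 together with the columns of W2, and confirms that the network function is unchanged; this is the weight-space symmetry that equivariant weight-space architectures are built to respect.

```python
# A small check (illustrative only, not code from the listed papers) that
# permuting the hidden neurons of a two-layer MLP leaves its function unchanged.
import numpy as np

rng = np.random.default_rng(1)

d_in, d_hidden, d_out, n = 4, 8, 2, 5
W1 = rng.standard_normal((d_hidden, d_in))
b1 = rng.standard_normal(d_hidden)
W2 = rng.standard_normal((d_out, d_hidden))
b2 = rng.standard_normal(d_out)
X = rng.standard_normal((n, d_in))

def mlp(X, W1, b1, W2, b2):
    """Two-layer MLP with ReLU hidden units."""
    return np.maximum(X @ W1.T + b1, 0.0) @ W2.T + b2

perm = rng.permutation(d_hidden)  # any reordering of the hidden neurons
out_original = mlp(X, W1, b1, W2, b2)
out_permuted = mlp(X, W1[perm], b1[perm], W2[:, perm], b2)

# Different points in weight space, identical function values.
print(np.allclose(out_original, out_permuted))  # True
```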