Piecewise Linear Functions Representable with Infinite Width Shallow ReLU Neural Networks
- URL: http://arxiv.org/abs/2307.14373v1
- Date: Tue, 25 Jul 2023 15:38:18 GMT
- Title: Piecewise Linear Functions Representable with Infinite Width Shallow ReLU Neural Networks
- Authors: Sarah McCarty
- Abstract summary: We prove a conjecture of Ongie et al. that every continuous piecewise linear function expressible with this kind of infinite width neural network is expressible as a finite width shallow ReLU neural network.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper analyzes representations of continuous piecewise linear functions
with infinite width, finite cost shallow neural networks using the rectified
linear unit (ReLU) as an activation function. Through its integral
representation, a shallow neural network can be identified by the corresponding
signed, finite measure on an appropriate parameter space. We map these measures
on the parameter space to measures on the projective $n$-sphere cross
$\mathbb{R}$, allowing points in the parameter space to be bijectively mapped
to hyperplanes in the domain of the function. We prove a conjecture of Ongie et
al. that every continuous piecewise linear function expressible with this kind
of infinite width neural network is expressible as a finite width shallow ReLU
neural network.
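For context, such finite-cost, infinite width shallow networks are usually written as an integral against a signed measure on the parameter space. The display below is a schematic form in the style of Ongie et al.; the exact parameter space, affine correction, and normalization used in the paper are assumptions here, not quotations from it.

```latex
% Schematic integral representation of a finite-cost, infinite width shallow ReLU
% network (assumed standard Ongie-et-al.-style form; the paper's exact
% parameter space and normalization may differ).
f(x) \;=\; \int_{\mathbb{S}^{n-1} \times \mathbb{R}}
      \Big( \operatorname{ReLU}(w^{\top} x - b) - \operatorname{ReLU}(-b) \Big)\, d\mu(w, b)
      \;+\; v^{\top} x + c,
\qquad \|\mu\|_{\mathrm{TV}} < \infty .
```

A finite width shallow network corresponds to $\mu$ being a finite sum of weighted Dirac masses; the conjecture proved here says that for continuous piecewise linear $f$ this discrete case already suffices.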
Related papers
- Shallow ReLU neural networks and finite elements [1.3597551064547502]
We show that piecewise linear functions on a convex polytope mesh can be represented by two-hidden-layer ReLU neural networks in a weak sense.
The numbers of neurons in the two hidden layers required for this weak representation are given accurately in terms of the numbers of polytopes and hyperplanes involved in the mesh.
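As a complement to these mesh-based counts, the one-dimensional case already shows how a continuous piecewise linear function can be written exactly with a single hidden ReLU layer. The sketch below is only a minimal illustration of that idea (not the paper's two-hidden-layer polytope construction), and all names in it are ad hoc.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def shallow_relu_from_cpwl(breakpoints, slopes, value_at_0):
    """Build a 1-D continuous piecewise linear function as an explicit
    one-hidden-layer ReLU network plus an affine 'skip' term.

    breakpoints : sorted knot locations b_1 < ... < b_k
    slopes      : length k + 1, slope of each piece from left to right
    value_at_0  : the function value at x = 0
    """
    breakpoints = np.asarray(breakpoints, dtype=float)
    slopes = np.asarray(slopes, dtype=float)
    # Outer weight of each hidden unit is the slope change at its breakpoint.
    a = np.diff(slopes)
    # Choose the constant so that f(0) = value_at_0.
    c = value_at_0 - np.sum(a * relu(-breakpoints))

    def f(x):
        x = np.asarray(x, dtype=float)
        hidden = relu(x[..., None] - breakpoints)   # ReLU(x - b_i)
        return slopes[0] * x + hidden @ a + c       # affine part + hidden layer

    return f

# Example: a "hat" function peaking at x = 1 with support [0, 2].
hat = shallow_relu_from_cpwl(breakpoints=[0.0, 1.0, 2.0],
                             slopes=[0.0, 1.0, -1.0, 0.0],
                             value_at_0=0.0)
xs = np.linspace(-1.0, 3.0, 9)
print(np.round(hat(xs), 3))   # 0 outside [0, 2], peak value 1 at x = 1
```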
arXiv Detail & Related papers (2024-03-09T06:12:06Z)
- Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
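The one-dimensional version of the first step is easy to make concrete: a few ReLU units already approximate the indicator of an interval by a trapezoid. The sketch below only illustrates that idea; it is not the paper's three-layer construction over simplicial complexes, and its names and tolerance are chosen for the example.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def soft_indicator(a, b, eps):
    """ReLU combination equal to 1 on [a + eps, b - eps], 0 outside [a, b],
    and linear in between -- a trapezoidal approximation of the indicator."""
    def f(x):
        x = np.asarray(x, dtype=float)
        return (relu(x - a) - relu(x - a - eps)
                - relu(x - b + eps) + relu(x - b)) / eps
    return f

ind = soft_indicator(a=0.0, b=1.0, eps=0.1)
print(np.round(ind(np.array([-0.5, 0.05, 0.5, 0.95, 1.5])), 2))
# -> [0.  0.5 1.  0.5 0. ]
```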
arXiv Detail & Related papers (2023-05-25T14:17:15Z)
- Functional dimension of feedforward ReLU neural networks [0.0]
We show that functional dimension is inhomogeneous across the parameter space of ReLU neural network functions.
We also study the quotient space and fibers of the realization map from parameter space to function space.
arXiv Detail & Related papers (2022-09-08T21:30:16Z)
- On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
- Exploring Linear Feature Disentanglement For Neural Networks [63.20827189693117]
Non-linear activation functions, e.g., Sigmoid, ReLU, and Tanh, have achieved great success in neural networks (NNs).
Due to the complex non-linear characteristics of samples, the objective of those activation functions is to project samples from their original feature space to a linearly separable feature space.
This phenomenon ignites our interest in exploring whether all features need to be transformed by all non-linear functions in current typical NNs.
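As a toy illustration of the "linearly separable feature space" claim, a single fixed ReLU layer already makes the classic XOR problem linearly separable. The weights below are hand-picked for this example and are not taken from the paper.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# XOR: not linearly separable in the original 2-D feature space.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# Hand-picked hidden layer: h = ReLU(X @ W + b) with two units.
W = np.array([[1.0, 1.0],
              [1.0, 1.0]])
b = np.array([0.0, -1.0])
H = relu(X @ W + b)          # features: h1 = ReLU(x1+x2), h2 = ReLU(x1+x2-1)

# In feature space the linear rule h1 - 2*h2 > 0.5 classifies XOR correctly.
scores = H @ np.array([1.0, -2.0])
print(H)
print((scores > 0.5).astype(int), y)   # predictions match the labels
```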
arXiv Detail & Related papers (2022-03-22T13:09:17Z)
- The Role of Linear Layers in Nonlinear Interpolating Networks [13.25706838589123]
Our framework considers a family of networks of varying depth that all have the same capacity but different implicitly defined representation costs.
The representation cost of a function induced by a neural network architecture is the minimum sum of squared weights needed for the network to represent the function.
Our results show that adding linear layers to a ReLU network yields a representation cost that reflects a complex interplay between the alignment and sparsity of ReLU units.
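Spelled out, the representation cost described above can be written as a constrained minimum over parameters. The display below is only a schematic transcription of that sentence; the realization-map notation $h_\theta$ is chosen here rather than taken from the paper.

```latex
% Schematic definition of the representation cost: h_theta denotes the function
% realized by the parameter vector theta under the given architecture
% (notation assumed here).
R(f) \;=\; \inf \Big\{ \textstyle\sum_{j} \theta_j^{2} \;:\; h_{\theta} = f \Big\}.
```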
arXiv Detail & Related papers (2022-02-02T02:33:24Z)
- Deep neural network approximation of analytic functions [91.3755431537592]
We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z)
- On the Banach spaces associated with multi-layer ReLU networks: Function representation, approximation theory and gradient descent dynamics [8.160343645537106]
We develop Banach spaces for ReLU neural networks of finite depth $L$ and infinite width.
The spaces contain all finite fully connected $L$-layer networks and their $L^2$-limiting objects under bounds on the natural path-norm.
Under this norm, the unit ball in the space for $L$-layer networks has low Rademacher complexity and thus favorable properties.
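One common definition of the path-norm of a finite fully connected $L$-layer ReLU network sums the absolute weight products over all input-to-output paths. The display below is that standard form, offered only as an assumed reading of "natural path-norm" here (with biases absorbed into the input), not as a quotation from the paper.

```latex
% A standard path-norm for a fully connected L-layer ReLU network with weight
% matrices W^{(1)}, ..., W^{(L)}; the sum runs over all paths from an input
% coordinate i_0 to an output coordinate i_L (assumed definition).
\|\theta\|_{\mathrm{path}}
  \;=\; \sum_{i_0, i_1, \dots, i_L} \; \prod_{\ell=1}^{L} \bigl| W^{(\ell)}_{i_\ell\, i_{\ell-1}} \bigr| .
```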
arXiv Detail & Related papers (2020-07-30T17:47:05Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- No one-hidden-layer neural network can represent multivariable functions [0.0]
In a function approximation with a neural network, an input dataset is mapped to an output index by optimizing the parameters of each hidden-layer unit.
We present constraints on the parameters and its second derivative by constructing a continuum version of a one-hidden-layer neural network with the rectified linear unit (ReLU) activation function.
arXiv Detail & Related papers (2020-06-19T06:46:54Z)
- Convex Geometry and Duality of Over-parameterized Neural Networks [70.15611146583068]
We develop a convex analytic approach to analyze finite width two-layer ReLU networks.
We show that an optimal solution to the regularized training problem can be characterized as extreme points of a convex set.
In higher dimensions, we show that the training problem can be cast as a finite dimensional convex problem with infinitely many constraints.
arXiv Detail & Related papers (2020-02-25T23:05:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.