A global universality of two-layer neural networks with ReLU activations
- URL: http://arxiv.org/abs/2011.10225v1
- Date: Fri, 20 Nov 2020 05:39:10 GMT
- Title: A global universality of two-layer neural networks with ReLU activations
- Authors: Naoya Hatano, Masahiro Ikeda, Isao Ishikawa, and Yoshihiro Sawano
- Abstract summary: We investigate the universality of neural networks, which concerns the density of the set of two-layer neural networks in function spaces.
We consider global convergence by introducing a suitable norm, so that our results are uniform over any compact set.
- Score: 5.51579574006659
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the present study, we investigate the universality of neural networks, which concerns the density of the set of two-layer neural networks in function spaces. Many works handle convergence over compact sets. In the present paper, we consider global convergence by introducing a suitable norm, so that our results are uniform over any compact set.
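For orientation, here is a minimal sketch of the objects in question, in standard notation; the symbol $\mathcal{N}_n$ is ours, and the particular weighted norm constructed in the paper is not reproduced here. The set of two-layer ReLU networks with $n$ hidden units on $\mathbb{R}^d$ is
\[
\mathcal{N}_n := \Big\{\, x \mapsto \sum_{j=1}^{n} a_j\,\mathrm{ReLU}(\langle w_j, x\rangle + b_j) \;:\; a_j, b_j \in \mathbb{R},\ w_j \in \mathbb{R}^d \,\Big\},
\]
and universality in a normed function space $(X, \|\cdot\|_X)$ means that for every $f \in X$ and every $\varepsilon > 0$ there exist $n$ and $g \in \mathcal{N}_n$ with $\|f - g\|_X < \varepsilon$. Classical results take $\|\cdot\|_X$ to be the sup norm on a fixed compact set; the point of the paper is to choose the norm so that a single approximation statement holds globally rather than only on compact sets.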
Related papers
- Spectral complexity of deep neural networks [2.099922236065961]
We use the angular power spectrum of the limiting field to characterize the complexity of the network architecture.
On this basis, we classify neural networks as low-disorder, sparse, or high-disorder.
We show how this classification highlights a number of distinct features for standard activation functions, and in particular, sparsity properties of ReLU networks.
arXiv Detail & Related papers (2024-05-15T17:55:05Z) - Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
arXiv Detail & Related papers (2023-05-25T14:17:15Z) - Hybrid Zonotopes Exactly Represent ReLU Neural Networks [0.7100520098029437]
We show that hybrid zonotopes offer an equivalent representation of feed-forward fully connected neural networks with ReLU activation functions.
Our approach demonstrates that the number of binary variables equals the total number of neurons in the network and hence grows linearly with the size of the network.
arXiv Detail & Related papers (2023-04-05T21:39:00Z) - Exploring the Approximation Capabilities of Multiplicative Neural Networks for Smooth Functions [9.936974568429173]
We consider two classes of target functions: generalized bandlimited functions and Sobolev-type balls.
Our results demonstrate that multiplicative neural networks can approximate these functions with significantly fewer layers and neurons.
These findings suggest that multiplicative gates can outperform standard feed-forward layers and have potential for improving neural network design.
arXiv Detail & Related papers (2023-01-11T17:57:33Z) - Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z) - On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z) - On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z) - The Sample Complexity of One-Hidden-Layer Neural Networks [57.6421258363243]
We study a class of scalar-valued one-hidden-layer networks with inputs bounded in Euclidean norm.
We prove that controlling the spectral norm of the hidden layer weight matrix is insufficient to get uniform convergence guarantees.
We analyze two important settings where a mere spectral norm control turns out to be sufficient.
arXiv Detail & Related papers (2022-02-13T07:12:02Z) - LocalDrop: A Hybrid Regularization for Deep Neural Networks [98.30782118441158]
We propose LocalDrop, a new approach to the regularization of neural networks based on the local Rademacher complexity.
A new regularization function for both fully-connected networks (FCNs) and convolutional neural networks (CNNs) has been developed based on the proposed upper bound of the local Rademacher complexity.
arXiv Detail & Related papers (2021-03-01T03:10:11Z) - Universal Approximation Power of Deep Residual Neural Networks via Nonlinear Control Theory [9.210074587720172]
We explain the universal approximation capabilities of deep residual neural networks through geometric nonlinear control.
Inspired by recent work establishing links between residual networks and control systems, we provide a general sufficient condition for a residual network to have the power of universal approximation.
arXiv Detail & Related papers (2020-07-12T14:53:30Z) - A Note on the Global Convergence of Multilayer Neural Networks in the Mean Field Regime [9.89901717499058]
We introduce a rigorous framework to describe the mean field limit of gradient-based learning dynamics of multilayer neural networks.
We prove a global convergence guarantee for multilayer networks of any depth.
arXiv Detail & Related papers (2020-06-16T17:50:34Z)