Universal Solutions of Feedforward ReLU Networks for Interpolations
- URL: http://arxiv.org/abs/2208.07498v1
- Date: Tue, 16 Aug 2022 02:15:03 GMT
- Title: Universal Solutions of Feedforward ReLU Networks for Interpolations
- Authors: Changcun Huang
- Abstract summary: This paper provides a theoretical framework for the solutions of feedforward ReLU networks for interpolations.
For three-layer networks, we classify different kinds of solutions and model them in a normalized form; the search for solutions is investigated along three dimensions, including data, networks, and training.
For deep-layer networks, we present a general result called the sparse-matrix principle, which can describe some basic behavior of deep layers.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper provides a theoretical framework for the solutions of feedforward
ReLU networks for interpolations, in terms of what is called an interpolation
matrix, which summarizes, extends, and generalizes our three preceding works,
with the expectation that the solutions found in engineering practice could be
included in this framework and finally understood. For three-layer networks, we
classify different kinds of solutions and model them in a normalized form; the
search for solutions is investigated along three dimensions, including data,
networks, and training; the mechanism of overparameterization solutions is
interpreted. For deep-layer networks, we present a general result called the
sparse-matrix principle, which can describe some basic behavior of deep layers
and explain the phenomenon of the sparse-activation mode that appears in
engineering applications associated with brain science; an advantage of deep
layers over shallower ones is manifested through this principle. As
applications, a general solution of deep neural networks for classification is
constructed by that principle, and we also use the principle to study the
data-disentangling property of encoders. Analogous to the three-layer case, the
solutions of deep layers are also explored along several dimensions. The
mechanism of multi-output neural networks is explained from the perspective of
interpolation matrices.
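For concreteness, the following is a minimal sketch of the kind of interpolation solution the abstract refers to: a one-hidden-layer ReLU network that exactly fits a finite set of one-dimensional data points. This is the standard piecewise-linear construction, not the paper's interpolation-matrix derivation; the function names and the closed-form weight choice are illustrative assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def fit_interpolating_relu_net(x, y):
    """Construct a one-hidden-layer ReLU net that exactly interpolates
    1-D data (x_i, y_i): one hidden unit per knot, output weights chosen
    so the segment slopes match the data."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    slopes = np.diff(y) / np.diff(x)                      # slope on each segment
    c = np.concatenate(([slopes[0]], np.diff(slopes)))    # slope changes at knots
    # hidden units h_i(t) = relu(t - x_i); output f(t) = y_1 + sum_i c_i h_i(t)
    return x[:-1], c, y[0]

def predict(knots, c, bias, t):
    h = relu(np.subtract.outer(t, knots))                 # shape (len(t), n_hidden)
    return bias + h @ c

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.sort(rng.uniform(-1.0, 1.0, size=8))
    y = rng.normal(size=8)
    knots, c, bias = fit_interpolating_relu_net(x, y)
    print(np.allclose(predict(knots, c, bias, x), y))     # True: exact interpolation
```

Each hidden unit contributes a ReLU ramp starting at a data point, so the network realizes the piecewise-linear interpolant of the samples; solutions of this exact-fitting type are what the paper's framework classifies and normalizes.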
Related papers
- On Generalization Bounds for Neural Networks with Low Rank Layers [4.2954245208408866]
We apply Maurer's chain rule for Gaussian complexity to analyze how low-rank layers in deep networks can prevent the accumulation of rank and dimensionality factors.
We compare our results to prior generalization bounds for deep networks, highlighting how deep networks with low-rank layers can achieve better generalization than those with full-rank layers.
arXiv Detail & Related papers (2024-11-20T22:20:47Z) - On the Principles of ReLU Networks with One Hidden Layer [0.0]
It remains unclear how to interpret the mechanism of the solutions obtained by the back-propagation algorithm.
It is shown, both theoretically and experimentally, that the training solutions for one-dimensional inputs can be completely understood.
arXiv Detail & Related papers (2024-11-11T05:51:11Z) - Conditional computation in neural networks: principles and research trends [48.14569369912931]
This article summarizes principles and ideas from the emerging area of applying conditional computation methods to the design of neural networks.
In particular, we focus on neural networks that can dynamically activate or de-activate parts of their computational graph conditionally on their input.
arXiv Detail & Related papers (2024-03-12T11:56:38Z) - Defining Neural Network Architecture through Polytope Structures of Dataset [53.512432492636236]
This paper defines upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question.
We develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks.
It is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
arXiv Detail & Related papers (2024-02-04T08:57:42Z) - Variation Spaces for Multi-Output Neural Networks: Insights on Multi-Task Learning and Network Compression [28.851519959657466]
This paper introduces a novel theoretical framework for the analysis of vector-valued neural networks.
A key contribution of this work is the development of a representer theorem for the vector-valued variation spaces.
This observation reveals that the norm associated with these vector-valued variation spaces encourages the learning of features that are useful for multiple tasks.
arXiv Detail & Related papers (2023-05-25T23:32:10Z) - Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
arXiv Detail & Related papers (2023-05-25T14:17:15Z) - Generalization and Estimation Error Bounds for Model-based Neural
Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks.
We derive practical design rules that allow to construct model-based networks with guaranteed high generalization.
arXiv Detail & Related papers (2023-04-19T16:39:44Z) - Approximation Power of Deep Neural Networks: an explanatory mathematical
survey [0.0]
The goal of this survey is to present an explanatory review of the approximation properties of deep neural networks.
We aim at understanding how and why deep neural networks outperform other classical linear and nonlinear approximation methods.
arXiv Detail & Related papers (2022-07-19T18:47:44Z) - Rank Diminishing in Deep Neural Networks [71.03777954670323]
The rank of neural networks measures the information flowing across layers.
It is an instance of a key structural condition that applies across broad domains of machine learning.
For neural networks, however, the intrinsic mechanism that yields low-rank structures remains vague and unclear.
arXiv Detail & Related papers (2022-06-13T12:03:32Z) - Theoretical Exploration of Solutions of Feedforward ReLU networks [0.0]
This paper aims to interpret the mechanism of feedforward ReLU networks by exploring their solutions for piecewise linear functions through basic rules.
We explain three typical network architectures: the subnetwork of last three layers of convolutional networks, multi-layer feedforward networks, and the decoder of autoencoders.
arXiv Detail & Related papers (2022-01-24T01:51:52Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)