The Geometric Structure of Fully-Connected ReLU Layers
- URL: http://arxiv.org/abs/2310.03482v2
- Date: Wed, 8 Nov 2023 14:48:36 GMT
- Title: The Geometric Structure of Fully-Connected ReLU Layers
- Authors: Jonatan Vallin, Karl Larsson, Mats G. Larson
- Abstract summary: We formalize and interpret the geometric structure of $d$-dimensional fully connected ReLU layers in neural networks.
We provide results on the geometric complexity of the decision boundary generated by such networks, and prove that, modulo an affine transformation, such a network can only generate $d$ different decision boundaries.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We formalize and interpret the geometric structure of $d$-dimensional fully
connected ReLU layers in neural networks. The parameters of a ReLU layer induce
a natural partition of the input domain, such that the ReLU layer can be
significantly simplified in each sector of the partition. This leads to a
geometric interpretation of a ReLU layer as a projection onto a polyhedral cone
followed by an affine transformation, in line with the description in
[doi:10.48550/arXiv.1905.08922] for convolutional networks with ReLU
activations. Further, this structure facilitates simplified expressions for
preimages of the intersection between partition sectors and hyperplanes, which
is useful when describing decision boundaries in a classification setting. We
investigate this in detail for a feed-forward network with one hidden
ReLU-layer, where we provide results on the geometric complexity of the
decision boundary generated by such networks, as well as proving that, modulo an
affine transformation, such a network can only generate $d$ different decision
boundaries. Finally, the effect of adding more layers to the network is
discussed.
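As a concrete illustration of the sector structure described above, here is a minimal NumPy sketch (not code from the paper; the square layer size and random weights are assumptions for the example). It checks that once the activation pattern of an input is fixed, the ReLU layer acts as the affine map $x \mapsto DWx + Db$, where $D$ is the 0/1 diagonal matrix of the pattern:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                   # layer dimension (square layer, as in the paper's d-dimensional setting)
W = rng.standard_normal((d, d))
b = rng.standard_normal(d)

def relu_layer(x):
    return np.maximum(W @ x + b, 0.0)

def activation_pattern(x):
    # 0/1 pattern of the pre-activations; it indexes the sector of the partition containing x.
    return (W @ x + b > 0).astype(float)

x = rng.standard_normal(d)
D = np.diag(activation_pattern(x))      # diagonal matrix of the pattern in this sector

# Within the sector, the ReLU layer reduces to the affine map x -> D W x + D b.
assert np.allclose(relu_layer(x), D @ W @ x + D @ b)
print("pattern:", activation_pattern(x))
```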
Related papers
- Defining Neural Network Architecture through Polytope Structures of Dataset [53.512432492636236]
This paper defines upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question.
We develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks.
It is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
arXiv Detail & Related papers (2024-02-04T08:57:42Z) - Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
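As a reading aid, the indicator-approximation idea can be illustrated with a standard ReLU "trapezoid bump" sketch (ours, not necessarily the construction used in the paper; the box domain and the margin eps are assumptions for the example):

```python
import numpy as np

def coord_bump(x, a, b, eps):
    # One-hidden-layer ReLU trapezoid: equals 1 on [a, b] and 0 outside [a - eps, b + eps].
    r = lambda z: np.maximum(z, 0.0)
    return (r(x - a + eps) - r(x - a) - r(x - b) + r(x - b - eps)) / eps

def box_indicator(x, a, b, eps=1e-2):
    # Approximate indicator of the box [a, b]^d evaluated at the point x (shape (d,)).
    bumps = coord_bump(x, a, b, eps)                     # one bump per coordinate
    return np.maximum(bumps.sum() - (len(x) - 1), 0.0)   # ReLU "AND" gate over the coordinates

print(box_indicator(np.array([0.3, 0.7]), 0.0, 1.0))  # ~1.0: inside [0, 1]^2
print(box_indicator(np.array([0.3, 1.5]), 0.0, 1.0))  # 0.0: outside the box
```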
arXiv Detail & Related papers (2023-05-25T14:17:15Z) - On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z) - The Role of Linear Layers in Nonlinear Interpolating Networks [13.25706838589123]
Our framework considers a family of networks of varying depth that all have the same capacity but different implicitly defined representation costs.
The representation cost of a function induced by a neural network architecture is the minimum sum of squared weights needed for the network to represent the function.
Our results show that adding linear layers to a ReLU network yields a representation cost that reflects a complex interplay between the alignment and sparsity of ReLU units.
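In symbols (our notation, not the paper's), the representation cost of a function $f$ under a fixed architecture with weights $\theta$ can be written as
$$ R(f) \;=\; \min_{\theta} \;\|\theta\|_2^2 \quad \text{subject to} \quad f_\theta = f, $$
where $f_\theta$ denotes the function computed by the network with weights $\theta$.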
arXiv Detail & Related papers (2022-02-02T02:33:24Z) - Traversing the Local Polytopes of ReLU Neural Networks: A Unified Approach for Network Verification [6.71092092685492]
Neural networks (NNs) with ReLU activation functions have found success in a wide range of applications.
Previous works on examining robustness and improving interpretability have partially exploited the piecewise-linear form of ReLU NNs.
In this paper, we explore the unique topological structure that ReLU NNs create in the input space, identifying the adjacency among the partitioned local polytopes.
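A minimal sketch of the idea (a toy example of ours, not the paper's traversal algorithm; the one-hidden-layer network and random weights are assumptions): each input is labelled by its hidden-layer activation pattern, which indexes its local polytope, and two polytopes are generically adjacent when their patterns differ in exactly one neuron.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((6, 2))          # one hidden layer with 6 neurons on a 2-D input
b = rng.standard_normal(6)

def pattern(x):
    # Activation pattern of the hidden layer; it labels the local polytope containing x.
    return tuple(int(v) for v in (W @ x + b > 0))

def adjacent(p, q):
    # Generically, two local polytopes share a facet iff their patterns differ in exactly one neuron.
    return sum(pi != qi for pi, qi in zip(p, q)) == 1

x, y = np.array([0.0, 0.0]), np.array([0.05, 0.0])
print(pattern(x), pattern(y), adjacent(pattern(x), pattern(y)))
```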
arXiv Detail & Related papers (2021-11-17T06:12:39Z) - Gradient representations in ReLU networks as similarity functions [0.0]
We investigate how the tangent space of the network can be exploited to refine the decision in case of ReLU (Rectified Linear Unit) activations.
arXiv Detail & Related papers (2021-10-26T11:29:10Z) - Clustering-Based Interpretation of Deep ReLU Network [17.234442722611803]
We recognize that the non-linear behavior of the ReLU function gives rise to a natural clustering.
We propose a method to increase the level of interpretability of a fully connected feedforward ReLU neural network.
arXiv Detail & Related papers (2021-10-13T09:24:11Z) - ResNet-LDDMM: Advancing the LDDMM Framework Using Deep Residual Networks [86.37110868126548]
In this work, we make use of deep residual neural networks to solve the non-stationary ODE (flow equation) based on an Euler discretization scheme.
We illustrate these ideas on diverse registration problems of 3D shapes under complex topology-preserving transformations.
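A minimal sketch of the residual/Euler correspondence (ours, not the ResNet-LDDMM implementation; the small ReLU MLP velocity fields and step size are assumptions): each residual block performs one Euler step $x_{k+1} = x_k + h\, v_k(x_k)$ of the flow ODE, so network depth plays the role of time.

```python
import numpy as np

rng = np.random.default_rng(2)
d, steps, h = 3, 10, 0.1                 # state dimension, number of residual blocks, step size

# One small ReLU MLP per time step plays the role of the (non-stationary) velocity field.
params = [(0.1 * rng.standard_normal((8, d)),
           0.1 * rng.standard_normal(8),
           0.1 * rng.standard_normal((d, 8))) for _ in range(steps)]

def velocity(x, p):
    W1, b1, W2 = p
    return W2 @ np.maximum(W1 @ x + b1, 0.0)

def flow(x):
    # Euler steps x_{k+1} = x_k + h * v_k(x_k): each step is one residual block.
    for p in params:
        x = x + h * velocity(x, p)
    return x

print(flow(np.array([1.0, 0.0, -1.0])))
```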
arXiv Detail & Related papers (2021-02-16T04:07:13Z) - Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z) - On transversality of bent hyperplane arrangements and the topological expressiveness of ReLU neural networks [0.0]
We investigate how the architecture of F impacts the geometry and topology of its possible decision regions for binary classification tasks.
We use this obstruction to prove that a decision region of a generic ReLU network $F: \mathbb{R}^n \to \mathbb{R}$ with a single hidden layer of dimension $(n+1)$ can have no more than one bounded connected component.
arXiv Detail & Related papers (2020-08-20T16:06:39Z) - Hierarchical Verification for Adversarial Robustness [89.30150585592648]
We introduce a new framework for the exact point-wise $\ell_p$ robustness verification problem.
LayerCert exploits the layer-wise geometric structure of deep feed-forward networks with rectified linear activations (ReLU networks).
We show that LayerCert provably reduces the number and size of the convex programs that one needs to solve compared to GeoCert.
arXiv Detail & Related papers (2020-07-23T07:03:05Z)