The Geometric Structure of Fully-Connected ReLU Layers
- URL: http://arxiv.org/abs/2310.03482v2
- Date: Wed, 8 Nov 2023 14:48:36 GMT
- Title: The Geometric Structure of Fully-Connected ReLU Layers
- Authors: Jonatan Vallin, Karl Larsson, Mats G. Larson
- Abstract summary: We formalize and interpret the geometric structure of $d$-dimensional fully connected ReLU layers in neural networks.
We provide results on the geometric complexity of the decision boundary generated by such networks, and prove that, modulo an affine transformation, such a network can only generate $d$ different decision boundaries.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We formalize and interpret the geometric structure of $d$-dimensional fully
connected ReLU layers in neural networks. The parameters of a ReLU layer induce
a natural partition of the input domain, such that the ReLU layer can be
significantly simplified in each sector of the partition. This leads to a
geometric interpretation of a ReLU layer as a projection onto a polyhedral cone
followed by an affine transformation, in line with the description in
[doi:10.48550/arXiv.1905.08922] for convolutional networks with ReLU
activations. Further, this structure facilitates simplified expressions for
preimages of the intersection between partition sectors and hyperplanes, which
is useful when describing decision boundaries in a classification setting. We
investigate this in detail for a feed-forward network with one hidden
ReLU-layer, where we provide results on the geometric complexity of the
decision boundary generated by such networks, as well as proving that, modulo an
affine transformation, such a network can only generate $d$ different decision
boundaries. Finally, the effect of adding more layers to the network is
discussed.
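As a concrete illustration of the sector structure described above, here is a minimal NumPy sketch (not code from the paper; the square layer size and random weights are assumptions for the example). It checks that once the activation pattern of an input is fixed, the ReLU layer acts as the affine map $x \mapsto DWx + Db$, where $D$ is the 0/1 diagonal matrix of the pattern:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                   # layer dimension (square layer, as in the paper's d-dimensional setting)
W = rng.standard_normal((d, d))
b = rng.standard_normal(d)

def relu_layer(x):
    return np.maximum(W @ x + b, 0.0)

def activation_pattern(x):
    # 0/1 pattern of the pre-activations; it indexes the sector of the partition containing x.
    return (W @ x + b > 0).astype(float)

x = rng.standard_normal(d)
D = np.diag(activation_pattern(x))      # diagonal matrix of the pattern in this sector

# Within the sector, the ReLU layer reduces to the affine map x -> D W x + D b.
assert np.allclose(relu_layer(x), D @ W @ x + D @ b)
print("pattern:", activation_pattern(x))
```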
Related papers
- Defining Neural Network Architecture through Polytope Structures of Dataset [53.512432492636236]
This paper defines upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question.
We develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks.
It is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
arXiv Detail & Related papers (2024-02-04T08:57:42Z) - Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
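As a reading aid, the indicator-approximation idea can be illustrated with a standard ReLU "trapezoid bump" sketch (ours, not necessarily the construction used in the paper; the box domain and the margin eps are assumptions for the example):

```python
import numpy as np

def coord_bump(x, a, b, eps):
    # One-hidden-layer ReLU trapezoid: equals 1 on [a, b] and 0 outside [a - eps, b + eps].
    r = lambda z: np.maximum(z, 0.0)
    return (r(x - a + eps) - r(x - a) - r(x - b) + r(x - b - eps)) / eps

def box_indicator(x, a, b, eps=1e-2):
    # Approximate indicator of the box [a, b]^d evaluated at the point x (shape (d,)).
    bumps = coord_bump(x, a, b, eps)                     # one bump per coordinate
    return np.maximum(bumps.sum() - (len(x) - 1), 0.0)   # ReLU "AND" gate over the coordinates

print(box_indicator(np.array([0.3, 0.7]), 0.0, 1.0))  # ~1.0: inside [0, 1]^2
print(box_indicator(np.array([0.3, 1.5]), 0.0, 1.0))  # 0.0: outside the box
```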
arXiv Detail & Related papers (2023-05-25T14:17:15Z) - On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z) - The Role of Linear Layers in Nonlinear Interpolating Networks [13.25706838589123]
Our framework considers a family of networks of varying depth that all have the same capacity but different implicitly defined representation costs.
The representation cost of a function induced by a neural network architecture is the minimum sum of squared weights needed for the network to represent the function.
Our results show that adding linear layers to a ReLU network yields a representation cost that reflects a complex interplay between the alignment and sparsity of ReLU units.
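In symbols (our notation, not the paper's), the representation cost of a function $f$ under a fixed architecture with weights $\theta$ can be written as
$$ R(f) \;=\; \min_{\theta} \;\|\theta\|_2^2 \quad \text{subject to} \quad f_\theta = f, $$
where $f_\theta$ denotes the function computed by the network with weights $\theta$.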
arXiv Detail & Related papers (2022-02-02T02:33:24Z) - Traversing the Local Polytopes of ReLU Neural Networks: A Unified Approach for Network Verification [6.71092092685492]
Neural networks (NNs) with ReLU activation functions have found success in a wide range of applications.
Previous works on examining robustness and improving interpretability have partially exploited the piecewise-linear form of ReLU NNs.
In this paper, we explore the unique topological structure that ReLU NNs create in the input space, identifying the adjacency among the partitioned local polytopes.
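A minimal sketch of the idea (a toy example of ours, not the paper's traversal algorithm; the one-hidden-layer network and random weights are assumptions): each input is labelled by its hidden-layer activation pattern, which indexes its local polytope, and two polytopes are generically adjacent when their patterns differ in exactly one neuron.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((6, 2))          # one hidden layer with 6 neurons on a 2-D input
b = rng.standard_normal(6)

def pattern(x):
    # Activation pattern of the hidden layer; it labels the local polytope containing x.
    return tuple(int(v) for v in (W @ x + b > 0))

def adjacent(p, q):
    # Generically, two local polytopes share a facet iff their patterns differ in exactly one neuron.
    return sum(pi != qi for pi, qi in zip(p, q)) == 1

x, y = np.array([0.0, 0.0]), np.array([0.05, 0.0])
print(pattern(x), pattern(y), adjacent(pattern(x), pattern(y)))
```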
arXiv Detail & Related papers (2021-11-17T06:12:39Z) - Gradient representations in ReLU networks as similarity functions [0.0]
We investigate how the tangent space of the network can be exploited to refine the decision in case of ReLU (Rectified Linear Unit) activations.
arXiv Detail & Related papers (2021-10-26T11:29:10Z) - Clustering-Based Interpretation of Deep ReLU Network [17.234442722611803]
We recognize that the non-linear behavior of the ReLU function gives rise to a natural clustering.
We propose a method to increase the level of interpretability of a fully connected feedforward ReLU neural network.
arXiv Detail & Related papers (2021-10-13T09:24:11Z) - ResNet-LDDMM: Advancing the LDDMM Framework Using Deep Residual Networks [86.37110868126548]
In this work, we make use of deep residual neural networks to solve the non-stationary ODE (flow equation) based on an Euler discretization scheme.
We illustrate these ideas on diverse registration problems of 3D shapes under complex topology-preserving transformations.
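A minimal sketch of the residual/Euler correspondence (ours, not the ResNet-LDDMM implementation; the small ReLU MLP velocity fields and step size are assumptions): each residual block performs one Euler step $x_{k+1} = x_k + h\, v_k(x_k)$ of the flow ODE, so network depth plays the role of time.

```python
import numpy as np

rng = np.random.default_rng(2)
d, steps, h = 3, 10, 0.1                 # state dimension, number of residual blocks, step size

# One small ReLU MLP per time step plays the role of the (non-stationary) velocity field.
params = [(0.1 * rng.standard_normal((8, d)),
           0.1 * rng.standard_normal(8),
           0.1 * rng.standard_normal((d, 8))) for _ in range(steps)]

def velocity(x, p):
    W1, b1, W2 = p
    return W2 @ np.maximum(W1 @ x + b1, 0.0)

def flow(x):
    # Euler steps x_{k+1} = x_k + h * v_k(x_k): each step is one residual block.
    for p in params:
        x = x + h * velocity(x, p)
    return x

print(flow(np.array([1.0, 0.0, -1.0])))
```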
arXiv Detail & Related papers (2021-02-16T04:07:13Z) - Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z) - On transversality of bent hyperplane arrangements and the topological expressiveness of ReLU neural networks [0.0]
We investigate how the architecture of F impacts the geometry and topology of its possible decision regions for binary classification tasks.
We use this obstruction to prove that a decision region of a generic ReLU network $F: \mathbb{R}^n \to \mathbb{R}$ with a single hidden layer of dimension $(n+1)$ can have no more than one bounded connected component.
arXiv Detail & Related papers (2020-08-20T16:06:39Z) - Hierarchical Verification for Adversarial Robustness [89.30150585592648]
We introduce a new framework for the exact point-wise $\ell_p$ robustness verification problem.
LayerCert exploits the layer-wise geometric structure of deep feed-forward networks with rectified linear activations (ReLU networks).
We show that LayerCert provably reduces the number and size of the convex programs that one needs to solve compared to GeoCert.
arXiv Detail & Related papers (2020-07-23T07:03:05Z)