Deep ReLU Networks Have Surprisingly Simple Polytopes
- URL: http://arxiv.org/abs/2305.09145v1
- Date: Tue, 16 May 2023 03:51:34 GMT
- Title: Deep ReLU Networks Have Surprisingly Simple Polytopes
- Authors: Feng-Lei Fan, Wei Huang, Xiangru Zhong, Lecheng Ruan, Tieyong Zeng,
Huan Xiong, Fei Wang
- Abstract summary: A ReLU network is a piecewise linear function over polytopes.
We study the shapes of polytopes via the number of simplices obtained by triangulating the polytope.
By characterizing the shapes of polytopes, the number of simplices can be leveraged for other problems.
- Score: 22.969361799006734
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A ReLU network is a piecewise linear function over polytopes. Figuring out
the properties of such polytopes is of fundamental importance for the research
and development of neural networks. So far, theoretical and empirical studies
of polytopes have stayed at the level of counting their number, which is far
from a complete characterization. To upgrade the
characterization to a new level, here we propose to study the shapes of
polytopes via the number of simplices obtained by triangulating the polytope.
Then, by computing and analyzing the histogram of simplices across polytopes,
we find that a ReLU network has relatively simple polytopes under both
initialization and gradient descent, although these polytopes theoretically can
be rather diverse and complicated. This finding can be appreciated as a novel
implicit bias. Next, we use a nontrivial combinatorial derivation to
theoretically explain why adding depth does not create a more complicated
polytope by bounding the average number of faces of polytopes with a function
of the dimensionality. Our results concretely reveal what kind of simple
functions a network learns and its space partition property. Also, by
characterizing the shape of polytopes, the number of simplices can be leveraged
for other problems, e.g., serving as a generic functional complexity
measure to explain the power of popular shortcut networks such as ResNet and
analyzing the impact of different regularization strategies on a network's
space partition.
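The counting pipeline can be made concrete. Below is a minimal sketch, not the authors' code, on a toy one-hidden-layer ReLU net in the plane, assuming numpy/scipy; the network size, bounding box, seed, and helper names are all illustrative. Each activation pattern carves out a polytope as an intersection of halfspaces, and a convex polygon with $k$ vertices triangulates into $k - 2$ simplices, so the script prints the paper's histogram of simplex counts across polytopes for this toy case.
```python
# A minimal sketch (not the authors' code): enumerate the activation regions of
# a toy one-hidden-layer ReLU net inside a bounding box, recover each region's
# polytope, and count the simplices (triangles) of a fan triangulation.
import numpy as np
from collections import Counter
from scipy.optimize import linprog
from scipy.spatial import ConvexHull, HalfspaceIntersection

rng = np.random.default_rng(0)
W, b = rng.normal(size=(8, 2)), rng.normal(size=8)  # 8 hidden ReLU neurons on R^2
BOX = np.array([[ 1.0,  0.0, -3.0],   # rows are [A | b] with A x + b <= 0:
                [-1.0,  0.0, -3.0],   # x <= 3, -x <= 3, y <= 3, -y <= 3
                [ 0.0,  1.0, -3.0],
                [ 0.0, -1.0, -3.0]])

def chebyshev_center(halfspaces):
    """Strictly interior point of {x : A x + b <= 0}, or None if (near-)empty."""
    A, bvec = halfspaces[:, :-1], halfspaces[:, -1]
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    # maximize the inscribed radius r subject to A x + r * ||A_i|| <= -b
    res = linprog(c=[0.0, 0.0, -1.0], A_ub=np.hstack([A, norms]), b_ub=-bvec,
                  bounds=[(None, None), (None, None), (0, None)])
    return res.x[:2] if res.success and res.x[2] > 1e-9 else None

# Collect the activation patterns that occur in the box (sampling may miss
# very small regions; this is only a sketch).
samples = rng.uniform(-3, 3, size=(20000, 2))
patterns = {tuple(p) for p in (samples @ W.T + b > 0).astype(int)}

simplex_counts = []
for s in patterns:
    sign = 1 - 2 * np.array(s)       # active neuron (s=1): W x + b >= 0
    hs = np.vstack([np.hstack([sign[:, None] * W, (sign * b)[:, None]]), BOX])
    pt = chebyshev_center(hs)
    if pt is None:                   # empty or degenerate region
        continue
    verts = HalfspaceIntersection(hs, pt).intersections
    k = len(ConvexHull(verts).vertices)     # vertices of the region polygon
    simplex_counts.append(k - 2)            # fan triangulation: k - 2 simplices

print("histogram of #simplices per polytope:", Counter(simplex_counts))
```
For deeper networks the regions are still intersections of halfspaces, but the hyperplane normals depend on earlier layers' activation patterns, so the enumeration is more involved; the per-polytope simplex count and the histogram are computed the same way.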
Related papers
- Defining Neural Network Architecture through Polytope Structures of Dataset [53.512432492636236]
This paper defines upper and lower bounds for neural network widths, which are informed by the polytope structure of the dataset in question.
We develop an algorithm to investigate a converse situation where the polytope structure of a dataset can be inferred from its corresponding trained neural networks.
It is established that popular datasets such as MNIST, Fashion-MNIST, and CIFAR10 can be efficiently encapsulated using no more than two polytopes with a small number of faces.
arXiv Detail & Related papers (2024-02-04T08:57:42Z) - PolyGNN: Polyhedron-based Graph Neural Network for 3D Building Reconstruction from Point Clouds [22.18061879431175]
PolyGNN is a graph neural network for building reconstruction from point clouds.
It learns to assemble primitives obtained by polyhedral decomposition.
We conduct a transferability analysis across cities and on real-world point clouds.
arXiv Detail & Related papers (2023-07-17T16:52:25Z) - Neural Polytopes [0.0]
We find that simple neural networks with ReLU activation generate polytopes as an approximation of a unit sphere in various dimensions.
For a variety of activation functions, a generalization of polytopes is obtained, which we call neural polytopes.
arXiv Detail & Related papers (2023-07-03T03:00:22Z) - Polyhedral Complex Extraction from ReLU Networks using Edge Subdivision [0.0]
A neural network built from piecewise affine blocks, such as fully-connected layers and ReLU activations, computes a piecewise affine function whose linear regions form a polyhedral complex.
This complex has been previously studied to characterize theoretical properties of neural networks.
We propose to subdivide the regions via intersections with hyperplanes induced by each neuron.
arXiv Detail & Related papers (2023-06-12T16:17:04Z) - Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
arXiv Detail & Related papers (2023-05-25T14:17:15Z) - Lower Bounds on the Depth of Integral ReLU Neural Networks via Lattice
Polytopes [3.0079490585515343]
We show that $\lceil\log_2(n)\rceil$ hidden layers are indeed necessary to compute the maximum of $n$ numbers (a matching upper-bound construction is sketched after this list).
Our results are based on the known duality between neural networks and Newton polytopes via tropical geometry.
arXiv Detail & Related papers (2023-02-24T10:14:53Z) - Towards General-Purpose Representation Learning of Polygonal Geometries [62.34832826705641]
We develop a general-purpose polygon encoding model, which can encode a polygonal geometry into an embedding space.
We conduct experiments on two tasks: 1) shape classification based on MNIST; 2) spatial relation prediction based on two new datasets - DBSR-46K and DBSR-cplx46K.
Our results show that NUFTspec and ResNet1D outperform multiple existing baselines with significant margins.
arXiv Detail & Related papers (2022-09-29T15:59:23Z) - PolyNet: Polynomial Neural Network for 3D Shape Recognition with
PolyShape Representation [51.147664305955495]
3D shape representation and its processing have substantial effects on 3D shape recognition.
We propose a deep neural network-based method (PolyNet) and a specific polygon representation (PolyShape).
Our experiments demonstrate the strength and the advantages of PolyNet on both 3D shape classification and retrieval tasks.
arXiv Detail & Related papers (2021-10-15T06:45:59Z) - Automatic Polyp Segmentation via Multi-scale Subtraction Network [100.94922587360871]
In clinical practice, precise polyp segmentation provides important information in the early detection of colorectal cancer.
Most existing methods are based on a U-shape structure and use element-wise addition or concatenation to fuse different-level features progressively in the decoder.
We propose a multi-scale subtraction network (MSNet) to segment polyps from colonoscopy images.
arXiv Detail & Related papers (2021-08-11T07:54:07Z) - Towards Lower Bounds on the Depth of ReLU Neural Networks [7.355977594790584]
We investigate whether the class of exactly representable functions strictly increases by adding more layers.
We settle an old conjecture about piecewise linear functions by Wang and Sun (2005) in the affirmative.
We present upper bounds on the sizes of neural networks required to represent functions with logarithmic depth.
arXiv Detail & Related papers (2021-05-31T09:49:14Z) - A simple geometric proof for the benefit of depth in ReLU networks [57.815699322370826]
We present a simple proof for the benefit of depth in multi-layer feedforward networks with rectified activation ("depth separation").
We present a concrete neural network with linear depth (in $m$) and small constant width ($\leq 4$) that classifies the problem with zero error.
arXiv Detail & Related papers (2021-01-18T15:40:27Z)
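As referenced in the Lattice Polytopes entry above, the $\lceil\log_2(n)\rceil$ depth bound is tight. Below is a minimal sketch, not taken from that paper, of the matching upper-bound construction: the identities max(a, b) = a + ReLU(b - a) and a = ReLU(a) - ReLU(-a) let one ReLU hidden layer halve a list by pairwise maxima, so a balanced tournament computes the maximum of $n$ numbers in $\lceil\log_2(n)\rceil$ hidden layers (names and sizes are illustrative).
```python
# A minimal sketch (not from the cited paper) of the matching upper bound:
# max(a, b) = a + ReLU(b - a), and the pass-through a = ReLU(a) - ReLU(-a),
# give one ReLU hidden layer per round of pairwise maxima, so a balanced
# tournament computes max of n numbers in ceil(log2 n) hidden layers.
import math
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def pairwise_max_layer(v):
    """One ReLU hidden layer: halve the list by taking pairwise maxima."""
    if len(v) % 2:                       # pad odd lengths; the max is unchanged
        v = np.append(v, v[-1])
    a, b = v[0::2], v[1::2]
    # relu(a) - relu(-a) routes 'a' through ReLUs; + relu(b - a) gives max(a, b)
    return relu(a) - relu(-a) + relu(b - a)

def relu_max(x):
    v = np.asarray(x, dtype=float)
    depth = 0
    while len(v) > 1:
        v = pairwise_max_layer(v)
        depth += 1
    assert depth == math.ceil(math.log2(len(x)))  # matches the paper's bound
    return v[0]

x = np.random.default_rng(1).normal(size=11)
print(relu_max(x), np.max(x))  # agree (up to rounding); ceil(log2 11) = 4 layers
```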