The Evolution of the Interplay Between Input Distributions and Linear
Regions in Networks
- URL: http://arxiv.org/abs/2310.18725v2
- Date: Tue, 7 Nov 2023 04:44:14 GMT
- Title: The Evolution of the Interplay Between Input Distributions and Linear
Regions in Networks
- Authors: Xuan Qi, Yi Wei
- Abstract summary: We count the number of linear convex regions in deep neural networks based on ReLU.
In particular, we prove that for any one-dimensional input, there exists a minimum threshold for the number of neurons required to express it.
We also unveil the iterative refinement process of decision boundaries in ReLU networks during training.
- Score: 20.97553518108504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is commonly recognized that the expressiveness of deep neural networks is
contingent upon a range of factors, encompassing their depth, width, and other
relevant considerations. Currently, the practical performance of the majority
of deep neural networks remains uncertain. For ReLU (Rectified Linear Unit)
networks with piecewise linear activations, the number of linear convex regions
serves as a natural metric to gauge the network's expressivity. In this paper,
we count the number of linear convex regions in deep neural networks based on
ReLU. In particular, we prove that for any one-dimensional input, there exists
a minimum threshold for the number of neurons required to express it. We also
empirically observe that for the same network, intricate inputs hinder its
capacity to express linear regions. Furthermore, we unveil the iterative
refinement process of decision boundaries in ReLU networks during training. We
aspire for our research to serve as an inspiration for network optimization
endeavors and to aid in the exploration and analysis of the behaviors exhibited
by deep networks.
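To make the region-counting idea in the abstract concrete, the following is a minimal sketch for a one-dimensional input. It is not the authors' method: it estimates the number of regions by evaluating a ReLU network on a dense grid and counting how often the pattern of active neurons changes between neighbouring grid points. The function name `count_linear_regions_1d`, the interval, and the grid size are all assumptions chosen for illustration, and regions narrower than the grid spacing will be missed.

```python
import numpy as np


def count_linear_regions_1d(weights, biases, lo=-1.0, hi=1.0, n_points=10001):
    """Estimate the number of activation regions of a ReLU MLP on [lo, hi].

    weights/biases describe the hidden (ReLU) layers only; a final linear
    output layer does not change the partition, so it is left out.
    The first weight matrix must have shape (1, width) for a 1-D input.
    """
    x = np.linspace(lo, hi, n_points).reshape(-1, 1)
    h = x
    patterns = []
    for W, b in zip(weights, biases):
        pre = h @ W + b               # pre-activations of this layer
        patterns.append(pre > 0)      # which neurons are active at each x
        h = np.maximum(pre, 0.0)      # ReLU
    codes = np.concatenate(patterns, axis=1)
    # On a line, every change of activation pattern starts a new interval.
    changes = np.any(codes[1:] != codes[:-1], axis=1)
    return int(changes.sum()) + 1


# Hand-built check: one hidden layer with three neurons whose kinks sit at
# x = -0.5, 0 and 0.5 splits [-1, 1] into four intervals.
W1 = np.array([[1.0, 1.0, 1.0]])
b1 = np.array([0.5, 0.0, -0.5])
print(count_linear_regions_1d([W1], [b1]))  # 4

# Random two-hidden-layer network (hypothetical sizes, fixed seed).
rng = np.random.default_rng(0)
Ws = [rng.normal(size=(1, 8)), rng.normal(size=(8, 8))]
bs = [rng.normal(size=8), rng.normal(size=8)]
print(count_linear_regions_1d(Ws, bs))
```

The count is taken over activation patterns, which is the usual working definition of a linear region; two neighbouring intervals can occasionally share the same affine function, so the number of distinct linear pieces of the network output may be slightly smaller.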
Related papers
- Riemannian Residual Neural Networks [58.925132597945634]
We show how to extend the residual neural network (ResNet)
ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
arXiv Detail & Related papers (2023-10-16T02:12:32Z)
- Understanding Deep Neural Networks via Linear Separability of Hidden
Layers [68.23950220548417]
We first propose Minkowski difference based linear separability measures (MD-LSMs) to evaluate the linear separability degree of two points sets.
We demonstrate that there is a synchronicity between the linear separability degree of hidden layer outputs and the network training performance.
arXiv Detail & Related papers (2023-07-26T05:29:29Z)
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- Sparsity-depth Tradeoff in Infinitely Wide Deep Neural Networks [22.083873334272027]
We observe that sparser networks outperform the non-sparse networks at shallow depths on a variety of datasets.
We extend the existing theory on the generalization error of kernel-ridge regression.
arXiv Detail & Related papers (2023-05-17T20:09:35Z)
- When Deep Learning Meets Polyhedral Theory: A Survey [6.899761345257773]
In the past decade, deep learning became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural networks.
Meanwhile, the structure of neural networks converged back to simpler representations based on piecewise constant and piecewise linear functions.
arXiv Detail & Related papers (2023-04-29T11:46:53Z)
- Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z)
- Traversing the Local Polytopes of ReLU Neural Networks: A Unified
Approach for Network Verification [6.71092092685492]
Neural networks (NNs) with ReLU activation functions have found success in a wide range of applications.
Previous works to examine robustness and to improve interpretability partially exploited the piecewise linear function form of ReLU NNs.
In this paper, we explore the unique topological structure that ReLU NNs create in the input space, identifying the adjacency among the partitioned local polytopes.
arXiv Detail & Related papers (2021-11-17T06:12:39Z)
- DISCO Verification: Division of Input Space into COnvex polytopes for
neural network verification [0.0]
The impressive results of modern neural networks partly come from their nonlinear behaviour.
We propose a method to simplify the verification problem by partitioning it into multiple linear subproblems.
We also present the impact of a technique aiming at reducing the number of linear regions during training.
arXiv Detail & Related papers (2021-05-17T12:40:51Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective to represent a network into a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Bounding The Number of Linear Regions in Local Area for Neural Networks
with ReLU Activations [6.4817648240626005]
We present the first method to estimate the upper bound of the number of linear regions in any sphere in the input space of a given ReLU neural network.
Our experiments showed that, while training a neural network, the boundaries of the linear regions tend to move away from the training data points.
arXiv Detail & Related papers (2020-07-14T04:06:00Z)
- Piecewise linear activations substantially shape the loss surfaces of
neural networks [95.73230376153872]
This paper presents how piecewise linear activation functions substantially shape the loss surfaces of neural networks.
We first prove that the loss surfaces of many neural networks have infinitely many spurious local minima, which are defined as local minima with higher empirical risks than the global minima.
For one-hidden-layer networks, we prove that all local minima in a cell constitute an equivalence class; they are concentrated in a valley; and they are all global minima in the cell.
arXiv Detail & Related papers (2020-03-27T04:59:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.