A Lower Bound for the Number of Linear Regions of Ternary ReLU Regression Neural Networks
- URL: http://arxiv.org/abs/2507.16079v1
- Date: Mon, 21 Jul 2025 21:29:33 GMT
- Title: A Lower Bound for the Number of Linear Regions of Ternary ReLU Regression Neural Networks
- Authors: Yuta Nakahara, Manabu Kobayashi, Toshiyasu Matsushima
- Abstract summary: We theoretically analyze the expressivity of ternary NNs from the perspective of the number of linear regions. We show that it suffices to either square the width or double the depth of ternary NNs to achieve a lower bound on the maximum number of linear regions comparable to that of general ReLU regression NNs.
- Score: 2.048226951354646
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the advancement of deep learning, reducing computational complexity and memory consumption has become a critical challenge, and ternary neural networks (NNs) that restrict parameters to $\{-1, 0, +1\}$ have attracted attention as a promising approach. While ternary NNs demonstrate excellent performance in practical applications such as image recognition and natural language processing, their theoretical understanding remains insufficient. In this paper, we theoretically analyze the expressivity of ternary NNs from the perspective of the number of linear regions. Specifically, we evaluate the number of linear regions of ternary regression NNs with Rectified Linear Unit (ReLU) for activation functions and prove that the number of linear regions increases polynomially with respect to network width and exponentially with respect to depth, similar to standard NNs. Moreover, we show that it suffices to either square the width or double the depth of ternary NNs to achieve a lower bound on the maximum number of linear regions comparable to that of general ReLU regression NNs. This provides a theoretical explanation, in some sense, for the practical success of ternary NNs.
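The abstract's central quantity, the number of linear regions, can be probed empirically. Below is a minimal sketch (not the paper's construction or proof technique): it draws a small fully connected ReLU network with weights and biases restricted to {-1, 0, +1} and counts how many linear regions a 1-D input segment crosses, by counting distinct hidden-unit activation patterns along that segment. The layer widths, sampling range, and random seed are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def ternary_relu_net(widths, rng):
    """Draw ternary weights/biases for a fully connected ReLU network."""
    layers = []
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        W = rng.choice([-1, 0, 1], size=(n_out, n_in))
        b = rng.choice([-1, 0, 1], size=n_out)
        layers.append((W, b))
    return layers

def activation_pattern(layers, x):
    """Return the on/off pattern of every hidden ReLU at input x."""
    pattern = []
    h = x
    for W, b in layers[:-1]:          # hidden layers only
        pre = W @ h + b
        pattern.append(pre > 0)
        h = np.maximum(pre, 0)
    return tuple(np.concatenate(pattern))

def count_regions_on_segment(layers, lo=-5.0, hi=5.0, samples=20001):
    """Count maximal runs of identical activation patterns along a 1-D segment.
    Each run corresponds to (at most) one linear region crossed by the segment."""
    xs = np.linspace(lo, hi, samples)
    patterns = [activation_pattern(layers, np.array([x])) for x in xs]
    return 1 + sum(p != q for p, q in zip(patterns, patterns[1:]))

# Input dimension 1, two hidden layers of width 8, scalar output (illustrative sizes).
net = ternary_relu_net([1, 8, 8, 1], rng)
print("linear regions crossed by the segment:", count_regions_on_segment(net))
```

Counting activation patterns along a line only lower-bounds the number of regions in the full input space, but it is enough to observe the qualitative behavior the abstract describes: the count grows with width and, much faster, with depth.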
Related papers
- Neural Network Verification with Branch-and-Bound for General Nonlinearities [63.39918329535165]
Branch-and-bound (BaB) is among the most effective techniques for neural network (NN) verification. We develop a general framework, named GenBaB, to conduct BaB on general nonlinearities to verify NNs with general architectures. Our framework also allows the verification of general nonlinear graphs and enables verification applications beyond simple NNs.
arXiv Detail & Related papers (2024-05-31T17:51:07Z) - Neural Networks for Singular Perturbations [0.0]
We prove expressivity rate bounds for solution sets of a model class of singularly perturbed, elliptic two-point boundary value problems.
We establish expression rate bounds in Sobolev norms in terms of the NN size.
arXiv Detail & Related papers (2024-01-12T16:02:18Z) - The Evolution of the Interplay Between Input Distributions and Linear
Regions in Networks [20.97553518108504]
We count the number of linear convex regions in deep neural networks based on ReLU.
In particular, we prove that for any one-dimensional input, there exists a minimum threshold for the number of neurons required to express it.
We also unveil the iterative refinement process of decision boundaries in ReLU networks during training.
arXiv Detail & Related papers (2023-10-28T15:04:53Z) - Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent
Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a
Polynomial Net Study [55.12108376616355]
The study of the NTK has been devoted to typical neural network architectures, but it is incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
arXiv Detail & Related papers (2022-09-16T06:36:06Z) - On the Number of Regions of Piecewise Linear Neural Networks [16.78532039510369]
Many feedforward neural networks (NNs) generate continuous and piecewise-linear (CPWL) mappings.
The number of these so-called linear regions offers a natural metric to characterize the expressiveness of CPWL NNs.
We introduce a complementary framework to estimate the average number of linear regions produced by a CPWL NN.
arXiv Detail & Related papers (2022-06-17T08:17:28Z) - On Feature Learning in Neural Networks with Global Convergence
Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z) - A General Computational Framework to Measure the Expressiveness of
Complex Networks Using a Tighter Upper Bound of Linear Regions [13.030269373463168]
The upper bound on the number of regions partitioned by a rectifier network, rather than the number itself, is a more practical measure of the expressiveness of a DNN.
We propose a new and tighter upper bound on the number of regions for any network structure.
Our experiments show that our upper bound is tighter than existing ones, and explain why skip connections and residual structures can improve network performance.
arXiv Detail & Related papers (2020-12-08T14:01:20Z) - Generalization bound of globally optimal non-convex neural network
training: Transportation map estimation by infinite dimensional Langevin
dynamics [50.83356836818667]
We introduce a new theoretical framework to analyze deep learning optimization with connection to its generalization error.
Existing frameworks such as mean field theory and neural tangent kernel theory for neural network optimization analysis typically require taking the limit of infinite network width to show global convergence.
arXiv Detail & Related papers (2020-07-11T18:19:50Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.