Improved Bounds on Neural Complexity for Representing Piecewise Linear
Functions
- URL: http://arxiv.org/abs/2210.07236v1
- Date: Thu, 13 Oct 2022 17:58:41 GMT
- Title: Improved Bounds on Neural Complexity for Representing Piecewise Linear
Functions
- Authors: Kuan-Lin Chen, Harinath Garudadri, Bhaskar D. Rao
- Abstract summary: A deep neural network using rectified linear units represents a continuous piecewise linear (CPWL) function.
Recent results estimated that the number of neurons needed to exactly represent any CPWL function grows exponentially with the number of pieces or exponentially in terms of the factorial of the number of distinct linear components.
In this paper, we propose much tighter bounds and establish a polynomial-time algorithm that finds a network satisfying these bounds for any given CPWL function.
- Score: 23.96475985648844
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A deep neural network using rectified linear units represents a continuous
piecewise linear (CPWL) function and vice versa. Recent results in the
literature estimated that the number of neurons needed to exactly represent any
CPWL function grows exponentially with the number of pieces or exponentially in
terms of the factorial of the number of distinct linear components. Moreover,
such growth is amplified linearly with the input dimension. These existing
results seem to indicate that the cost of representing a CPWL function is
expensive. In this paper, we propose much tighter bounds and establish a
polynomial time algorithm to find a network satisfying these bounds for any
given CPWL function. We prove that the number of hidden neurons required to
exactly represent any CPWL function is at most a quadratic function of the
number of pieces. In contrast to all previous results, this upper bound is
invariant to the input dimension. Besides the number of pieces, we also study
the number of distinct linear components in CPWL functions. When such a number
is also given, we prove that the quadratic complexity turns into bilinear,
which implies a lower neural complexity because the number of distinct linear
components is never greater than the minimum number of pieces in a CPWL
function. When the number of pieces is unknown, we prove that, in terms of the
number of distinct linear components, the neural complexity of any CPWL
function grows at most polynomially for low-dimensional inputs and at most
factorially in the worst case, which is significantly better than existing
results in the literature.
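As a minimal, hedged illustration of the kind of exact representation discussed above (not the paper's construction; the function names are hypothetical), the sketch below encodes the two-piece CPWL function $\max(a_1^\top x + b_1,\, a_2^\top x + b_2)$ with three hidden ReLU neurons, using the identities $\max(u, v) = u + \mathrm{ReLU}(v - u)$ and $u = \mathrm{ReLU}(u) - \mathrm{ReLU}(-u)$.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def two_piece_cpwl_relu(x, a1, b1, a2, b2):
    """Exactly represent f(x) = max(a1.x + b1, a2.x + b2) with 3 ReLU neurons.

    Uses max(u, v) = u + relu(v - u) together with u = relu(u) - relu(-u),
    so the whole function is one hidden ReLU layer plus a linear output.
    """
    u = a1 @ x + b1
    h1 = relu(u)                          # positive part of the first piece
    h2 = relu(-u)                         # negative part of the first piece
    h3 = relu((a2 - a1) @ x + (b2 - b1))  # hinge switching to the second piece
    return h1 - h2 + h3                   # equals max(a1.x + b1, a2.x + b2)

rng = np.random.default_rng(0)
a1, a2 = rng.normal(size=3), rng.normal(size=3)
b1, b2 = rng.normal(), rng.normal()
x = rng.normal(size=3)
assert np.isclose(two_piece_cpwl_relu(x, a1, b1, a2, b2),
                  max(a1 @ x + b1, a2 @ x + b2))
```

This toy case only shows the mechanism by which piecewise maxima become exactly representable by ReLU units; the paper's bounds concern general CPWL functions with arbitrarily many pieces.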
Related papers
- Decomposition Polyhedra of Piecewise Linear Functions [4.594829845106234]
We contribute to the frequently studied question of how to decompose a continuous piecewise linear (CPWL) function into a difference of two convex CPWL functions.
Every CPWL function admits infinitely many such decompositions.
We apply our framework to find decompositions with as few linear pieces as possible (see the one-dimensional sketch after this list).
arXiv Detail & Related papers (2024-10-07T10:48:36Z)
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
- Polynomial Width is Sufficient for Set Representation with High-dimensional Features [69.65698500919869]
DeepSets is the most widely used neural network architecture for set representation.
We present two set-element embedding layers: (a) linear + power activation (LP) and (b) linear + exponential activations (LE).
arXiv Detail & Related papers (2023-07-08T16:00:59Z)
- Deep neural network approximation of composite functions without the curse of dimensionality [0.0]
In this article, we identify a class of high-dimensional continuous functions that can be approximated by deep neural networks (DNNs) without the curse of dimensionality.
The functions in our class can be expressed as compositions of a potentially unbounded number of special functions, which include products, maxima, and certain parallelized Lipschitz continuous functions.
arXiv Detail & Related papers (2023-04-12T12:08:59Z)
- Expressivity of Shallow and Deep Neural Networks for Polynomial Approximation [0.0]
We establish an exponential lower bound on the complexity of any shallow network approximating the product function over a general compact domain.
We also demonstrate that this lower bound does not apply to normalized Lipschitz monomials over the unit cube.
arXiv Detail & Related papers (2023-03-06T23:01:53Z)
- Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks.
Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z)
- Bilinear Classes: A Structural Framework for Provable Generalization in RL [119.42509700822484]
Bilinear Classes is a new structural framework which permits generalization in reinforcement learning.
The framework incorporates nearly all existing models in which a polynomial sample complexity is achievable.
Our main result provides an RL algorithm with polynomial sample complexity for Bilinear Classes.
arXiv Detail & Related papers (2021-03-19T16:34:20Z)
- Neural network approaches to point lattice decoding [6.025026882312586]
A Voronoi-reduced basis is introduced to restrict the space of solutions to a binary set.
We count the number of affine pieces in the CPWL decoding function to characterize the complexity of the decoding problem.
arXiv Detail & Related papers (2020-12-13T10:53:34Z)
- On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the problem.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z)
- Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L^1$ and $L^2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)
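For the decomposition question raised in the "Decomposition Polyhedra of Piecewise Linear Functions" entry above, here is a minimal one-dimensional sketch (not that paper's polyhedral framework; the parameterization and function names are assumptions) that splits a 1-D CPWL function into a difference of two convex CPWL functions by routing positive slope changes into one summand and negative slope changes into the other.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def dc_decompose_1d(c, s0, breakpoints, slopes):
    """Split a 1-D CPWL function into a difference of two convex CPWL functions.

    The input is parameterized as f(x) = c + s0*x + sum_i d_i * relu(x - t_i),
    where s0 is the leftmost slope, slopes[i] is the slope to the right of
    breakpoints[i], and d_i = slopes[i] - slopes[i-1] is the slope change.
    Collecting positive slope changes into g and negative ones into h yields
    f = g - h with g, h convex (affine plus nonnegative sums of hinges).
    """
    d = np.diff(np.concatenate(([s0], slopes)))  # slope change at each breakpoint

    def g(x):
        return c + s0 * x + sum(di * relu(x - ti)
                                for di, ti in zip(d, breakpoints) if di > 0)

    def h(x):
        return sum(-di * relu(x - ti)
                   for di, ti in zip(d, breakpoints) if di < 0)

    return g, h

# Example: f(x) = |x| - relu(x - 1) has breakpoints at 0 and 1.
g, h = dc_decompose_1d(c=0.0, s0=-1.0, breakpoints=[0.0, 1.0], slopes=[1.0, 0.0])
xs = np.linspace(-2.0, 3.0, 11)
assert np.allclose(g(xs) - h(xs), np.abs(xs) - relu(xs - 1.0))
```

The number of pieces in g and h here simply follows the sign pattern of the slope changes; minimizing the number of pieces, especially in higher dimensions, is what the cited paper's polyhedral framework addresses.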