On Size-Independent Sample Complexity of ReLU Networks
- URL: http://arxiv.org/abs/2306.01992v3
- Date: Sun, 4 Feb 2024 19:12:16 GMT
- Title: On Size-Independent Sample Complexity of ReLU Networks
- Authors: Mark Sellke
- Abstract summary: We study the sample complexity of learning ReLU neural networks from the point of view of generalization.
We estimate the Rademacher complexity of the associated function class.
- Score: 9.15749739027059
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the sample complexity of learning ReLU neural networks from the
point of view of generalization. Given norm constraints on the weight matrices,
a common approach is to estimate the Rademacher complexity of the associated
function class. Previously Golowich-Rakhlin-Shamir (2020) obtained a bound
independent of the network size (scaling with a product of Frobenius norms)
except for a factor of the square-root depth. We give a refinement which often
has no explicit depth-dependence at all.
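For orientation, the two bound shapes at issue can be written schematically as follows. The notation (depth $L$, weight matrices $W_j$ with $\|W_j\|_F \le M_j$, inputs bounded by $B$, sample size $n$) is assumed here for illustration; exact constants and conditions are in the respective papers.
```latex
% Function class: depth-L ReLU networks with Frobenius-norm constraints.
\[
  f(x) = W_L\,\sigma\!\big(W_{L-1}\cdots\sigma(W_1 x)\big),
  \qquad \|W_j\|_F \le M_j, \quad \|x\|_2 \le B .
\]
% Golowich--Rakhlin--Shamir: Rademacher complexity scaling (up to constants)
\[
  \mathcal{R}_n(\mathcal{F}) \;\lesssim\; \frac{B\,\sqrt{L}\,\prod_{j=1}^{L} M_j}{\sqrt{n}} .
\]
% The refinement in this paper removes the explicit sqrt(L) factor in many regimes.
```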
Related papers
- A Margin-based Multiclass Generalization Bound via Geometric Complexity [6.554326244334867]
We investigate margin-based multiclass generalization bounds for neural networks.
We derive a new upper bound on the generalization error which scales with the margin-normalized geometric complexity of the network.
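As a rough template (assumed here for illustration, not the paper's exact statement), margin-based multiclass bounds of this kind have the following shape, where $\gamma > 0$ is the margin and $C_\gamma(f)$ is a placeholder for the margin-normalized complexity term (here, the geometric complexity):
```latex
% Generic margin-bound template; C_gamma(f) is a placeholder complexity measure.
\[
  \Pr\!\big[\operatorname*{argmax}_{y} f(x)_y \ne y\big]
  \;\le\;
  \frac{1}{n}\sum_{i=1}^{n}
    \mathbf{1}\!\big[ f(x_i)_{y_i} \le \gamma + \max_{y \ne y_i} f(x_i)_y \big]
  \;+\; \widetilde{O}\!\left( \frac{C_\gamma(f)}{\sqrt{n}} \right)
  \;+\; O\!\left( \sqrt{\tfrac{\log(1/\delta)}{n}} \right),
\]
% holding with probability at least 1 - delta over the n training samples.
```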
arXiv Detail & Related papers (2024-05-28T21:08:58Z) - Generalization of Scaled Deep ResNets in the Mean-Field Regime [55.77054255101667]
We investigate scaled ResNet in the limit of infinitely deep and wide neural networks.
Our results offer new insights into the generalization ability of deep ResNet beyond the lazy training regime.
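A minimal sketch of what a scaled residual block looks like, assuming the common $\alpha/L$ branch scaling used when studying infinite-depth limits (the paper's exact parameterization may differ):
```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def scaled_resnet_forward(x, weights, alpha=1.0):
    """Forward pass of a toy scaled ResNet.

    Each residual branch is downweighted by alpha / L, one common convention
    for studying the infinite-depth limit; the paper's exact scaling may differ.
    """
    L = len(weights)
    h = x
    for W_in, W_out in weights:              # one residual block per layer
        h = h + (alpha / L) * W_out @ relu(W_in @ h)
    return h

# Usage: depth-8 network with width 16 on a random input.
rng = np.random.default_rng(0)
d = 16
weights = [(rng.standard_normal((d, d)) / np.sqrt(d),
            rng.standard_normal((d, d)) / np.sqrt(d)) for _ in range(8)]
print(scaled_resnet_forward(rng.standard_normal(d), weights).shape)
```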
arXiv Detail & Related papers (2024-03-14T21:48:00Z) - Depth Separation in Norm-Bounded Infinite-Width Neural Networks [55.21840159087921]
We study depth separation in infinite-width neural networks, where complexity is controlled by the overall squared $\ell_2$-norm of the weights.
We show that there are functions that are learnable with sample complexity polynomial in the input dimension by norm-controlled depth-3 ReLU networks, yet are not learnable with sub-exponential sample complexity by norm-controlled depth-2 ReLU networks.
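For context, the generic conversion between a norm-based Rademacher bound and a sample-complexity statement (a standard uniform-convergence fact, not specific to this paper) is:
```latex
% Generic norm-to-samples conversion via uniform convergence (illustrative):
\[
  \mathcal{R}_n(\mathcal{F}) \;\approx\; \frac{C}{\sqrt{n}}
  \quad\Longrightarrow\quad
  n(\varepsilon) \;\approx\; \frac{C^{2}}{\varepsilon^{2}}
  \text{ samples suffice for excess error } \varepsilon .
\]
```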
arXiv Detail & Related papers (2024-02-13T21:26:38Z) - Polynomial-Time Solutions for ReLU Network Training: A Complexity Classification via Max-Cut and Zonotopes [70.52097560486683]
We prove that the hardness of approximation of ReLU networks not only mirrors the complexity of the Max-Cut problem but also, in certain special cases, exactly corresponds to it.
In particular, when $\epsilon \leq \sqrt{84/83} - 1 \approx 0.006$, we show that it is NP-hard to find an approximate global optimizer of the ReLU network objective with relative error $\epsilon$ with respect to the objective value.
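For reference, the threshold quoted above evaluates to:
```latex
\[
  \sqrt{\tfrac{84}{83}} - 1 \;=\; \sqrt{1.01204\ldots\,} - 1 \;\approx\; 1.00601 - 1 \;\approx\; 0.006 .
\]
```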
arXiv Detail & Related papers (2023-11-18T04:41:07Z) - Generalization and Estimation Error Bounds for Model-based Neural Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks.
We derive practical design rules that make it possible to construct model-based networks with guaranteed high generalization.
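A minimal sketch of one canonical model-based network for sparse recovery, an unrolled ISTA iteration with soft-thresholding; the specific architecture and the fixed (rather than learned) step sizes and thresholds here are assumptions for illustration, not details taken from the paper.
```python
import numpy as np

def soft_threshold(z, theta):
    """Soft-thresholding: the proximal operator of the l1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - theta, 0.0)

def unrolled_ista(y, A, n_layers=5, step=None, theta=0.1):
    """Recover a sparse x with y ~ A x by unrolling ISTA iterations as layers.

    In a learned (model-based) network the per-layer step sizes and thresholds
    would be trainable parameters; here they are fixed for illustration.
    """
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / ||A||_2^2 keeps the iteration stable
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):
        x = soft_threshold(x + step * A.T @ (y - A @ x), step * theta)
    return x

# Usage: recover a 5-sparse vector in dimension 100 from 40 noiseless measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[rng.choice(100, size=5, replace=False)] = 1.0
x_hat = unrolled_ista(A @ x_true, A, n_layers=200)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```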
arXiv Detail & Related papers (2023-04-19T16:39:44Z) - Expressivity of Shallow and Deep Neural Networks for Polynomial Approximation [0.0]
We establish an exponential lower bound on the complexity of any shallow network approximating the product function over a general compact domain.
We also demonstrate that this lower bound does not apply to normalized Lipschitz monomials over the unit cube.
arXiv Detail & Related papers (2023-03-06T23:01:53Z) - Norm-based Generalization Bounds for Compositionally Sparse Neural Networks [11.987589603961622]
We prove generalization bounds for multilayered sparse ReLU neural networks, including convolutional neural networks.
Taken together, these results suggest that compositional sparsity of the underlying target function is critical to the success of deep neural networks.
arXiv Detail & Related papers (2023-01-28T00:06:22Z) - On Rademacher Complexity-based Generalization Bounds for Deep Learning [18.601449856300984]
We show that the Rademacher complexity-based approach can yield non-vacuous generalization bounds for Convolutional Neural Networks (CNNs).
Our results show that the Rademacher complexity does not depend on the network length for CNNs with certain activation functions such as ReLU, Leaky ReLU, Parametric Rectified Linear Unit, Sigmoid, and Tanh.
arXiv Detail & Related papers (2022-08-08T17:24:04Z) - Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks [75.33431791218302]
We study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape.
We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases.
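For concreteness, a path-norm-style regularizer in its simplest (two-layer) form is sketched below; the paper works with a deeper parallel ReLU architecture, so this is only meant to illustrate the kind of quantity a path regularizer measures.
```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def two_layer_relu(x, W, v):
    """f(x) = sum_j v_j * relu(w_j . x) for a two-layer ReLU network."""
    return v @ relu(W @ x)

def l1_path_norm(W, v):
    """l1 path norm: sum over all input->hidden->output paths of |weight| products.

    This is the simplest (two-layer) instance of a path-based regularizer; the
    paper's parallel, deeper architecture generalizes this idea.
    """
    return float(np.sum(np.abs(v)[:, None] * np.abs(W)))

# Usage
rng = np.random.default_rng(2)
W = rng.standard_normal((8, 3))   # 8 hidden units, 3 inputs
v = rng.standard_normal(8)
print(two_layer_relu(rng.standard_normal(3), W, v), l1_path_norm(W, v))
```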
arXiv Detail & Related papers (2021-10-18T18:00:36Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can make two suitably separated classes of data linearly separable with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
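A small numerical illustration of the claim is sketched below: data that are not linearly separable in input space (two concentric rings) become linearly separable after a random two-layer ReLU embedding. The width, bias range, and dataset are arbitrary choices for the demo, not the paper's setting, and scikit-learn is used only to fit the linear classifier on top.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

# Two classes that are not linearly separable in input space: concentric rings.
n = 500
theta = rng.uniform(0.0, 2.0 * np.pi, 2 * n)
radius = np.concatenate([np.full(n, 1.0), np.full(n, 2.0)]) + 0.05 * rng.standard_normal(2 * n)
X = np.stack([radius * np.cos(theta), radius * np.sin(theta)], axis=1)
y = np.concatenate([np.zeros(n), np.ones(n)])

# Random two-layer ReLU features: Gaussian weights, uniformly distributed biases.
width = 500
W = rng.standard_normal((width, 2))
b = rng.uniform(-3.0, 3.0, width)
features = np.maximum(X @ W.T + b, 0.0)

# A linear classifier on top of the random features should now separate the classes.
print("accuracy in input space:     ", LogisticRegression().fit(X, y).score(X, y))
print("accuracy on random features: ", LogisticRegression(max_iter=5000).fit(features, y).score(features, y))
```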
arXiv Detail & Related papers (2021-07-31T10:25:26Z)