Nonparametric Classification on Low Dimensional Manifolds using
Overparameterized Convolutional Residual Networks
- URL: http://arxiv.org/abs/2307.01649v2
- Date: Sun, 18 Feb 2024 03:29:20 GMT
- Title: Nonparametric Classification on Low Dimensional Manifolds using
Overparameterized Convolutional Residual Networks
- Authors: Kaiqi Zhang, Zixuan Zhang, Minshuo Chen, Yuma Takeda, Mengdi Wang, Tuo
Zhao, Yu-Xiang Wang
- Abstract summary: We study the performance of ConvResNeXts, trained with weight decay from the perspective of nonparametric classification.
Our analysis allows for infinitely many building blocks in ConvResNeXts, and shows that weight decay implicitly enforces sparsity on these blocks.
- Score: 82.03459331544737
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional residual neural networks (ConvResNets), though
overparameterized, can achieve remarkable prediction performance in practice,
which cannot be well explained by conventional wisdom. To bridge this gap, we
study the performance of ConvResNeXts, which cover ConvResNets as a special
case, trained with weight decay from the perspective of nonparametric
classification. Our analysis allows for infinitely many building blocks in
ConvResNeXts, and shows that weight decay implicitly enforces sparsity on these
blocks. Specifically, we consider a smooth target function supported on a
low-dimensional manifold, then prove that ConvResNeXts can adapt to the
function smoothness and low-dimensional structures and efficiently learn the
function without suffering from the curse of dimensionality. Our findings
partially justify the advantage of overparameterized ConvResNeXts over
conventional machine learning models.
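To make the setting concrete, the following PyTorch-style sketch shows a ConvResNeXt-type residual block in the spirit of the architecture analyzed here: each block adds to the identity the sum of several parallel convolutional building blocks, and the weight-decay penalty enters through the optimizer. Widths, depths, and hyperparameters are illustrative assumptions, not the paper's exact construction.

import torch
import torch.nn as nn

class ConvResNeXtBlock(nn.Module):
    # Residual block x -> x + sum_m f_m(x) with several parallel convolutional
    # building blocks f_m; all sizes are purely illustrative.
    def __init__(self, channels=16, num_branches=8, hidden=4):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, hidden, kernel_size=1),
                nn.ReLU(),
                nn.Conv2d(hidden, hidden, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv2d(hidden, channels, kernel_size=1),
            )
            for _ in range(num_branches)
        ])

    def forward(self, x):
        return x + sum(branch(x) for branch in self.branches)

model = nn.Sequential(*[ConvResNeXtBlock() for _ in range(4)])
# Weight decay (the ell_2 penalty studied here) is applied by the optimizer;
# the analysis shows it implicitly drives many parallel branches toward zero.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)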
Related papers
- Convolutional Neural Network Compression via Dynamic Parameter Rank
Pruning [4.7027290803102675]
We propose an efficient training method for CNN compression via dynamic parameter rank pruning.
Our experiments show that the proposed method can yield substantial storage savings while maintaining or even enhancing classification performance.
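The paper's dynamic, training-time procedure is not reproduced in this summary; as a generic illustration of parameter rank pruning, the sketch below reshapes a convolutional kernel into a matrix and truncates its SVD to the smallest rank retaining a chosen fraction of the spectral energy (function name and threshold are assumptions).

import numpy as np

def truncate_conv_rank(kernel, energy=0.95):
    # kernel: (c_out, c_in, kh, kw). Keep the smallest rank whose squared
    # singular values retain `energy` of the total, and return the two factors.
    c_out, c_in, kh, kw = kernel.shape
    W = kernel.reshape(c_out, c_in * kh * kw)
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    r = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), energy)) + 1
    return U[:, :r] * s[:r], Vt[:r]          # W is approximated by the product of the factors

A, B = truncate_conv_rank(np.random.randn(64, 32, 3, 3))
print(A.shape, B.shape)   # storage drops from 64*32*9 parameters to 64*r + r*32*9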
arXiv Detail & Related papers (2024-01-15T23:52:35Z) - Scalable Neural Network Kernels [22.299704296356836]
We introduce scalable neural network kernels (SNNKs), capable of approximating regular feedforward layers (FFLs).
We also introduce the neural network bundling process that applies SNNKs to compactify deep neural network architectures.
Our mechanism provides up to 5x reduction in the number of trainable parameters, while maintaining competitive accuracy.
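The SNNK construction itself is not spelled out in this summary; the sketch below only illustrates the classical random-feature mechanism that such kernel approximations build on, namely Rahimi-Recht random Fourier features whose dot products approximate a Gaussian kernel. All dimensions are assumptions, and this is not the paper's bundling procedure.

import numpy as np

rng = np.random.default_rng(0)

def rff(x, Omega, b):
    # Random Fourier features: rff(x) @ rff(y) approximates exp(-||x - y||^2 / 2).
    m = Omega.shape[0]
    return np.sqrt(2.0 / m) * np.cos(x @ Omega.T + b)

d, m = 16, 4096
Omega = rng.standard_normal((m, d))            # frequencies for a unit-bandwidth Gaussian kernel
b = rng.uniform(0.0, 2.0 * np.pi, size=m)

x, y = rng.standard_normal(d), rng.standard_normal(d)
print(rff(x, Omega, b) @ rff(y, Omega, b),      # random-feature estimate
      np.exp(-np.linalg.norm(x - y) ** 2 / 2))  # exact kernel value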
arXiv Detail & Related papers (2023-10-20T02:12:56Z) - Towards Practical Control of Singular Values of Convolutional Layers [65.25070864775793]
Convolutional neural networks (CNNs) are easy to train, but their essential properties, such as generalization error and adversarial robustness, are hard to control.
Recent research demonstrated that singular values of convolutional layers significantly affect such elusive properties.
We offer a principled approach to alleviating constraints of the prior art at the expense of an insignificant reduction in layer expressivity.
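For background on the quantity being controlled, the sketch below computes the singular values of a convolutional layer with circular padding via the standard FFT construction (Sedghi et al.): one small transfer matrix per frequency, whose singular values are pooled. The paper's own control scheme is not reproduced; shapes are assumptions.

import numpy as np

def conv_singular_values(kernel, n):
    # kernel: (c_out, c_in, kh, kw); n: spatial size, assuming circular padding.
    # The layer's singular values are the union, over the n*n frequencies, of the
    # singular values of the per-frequency (c_out x c_in) transfer matrices.
    transforms = np.fft.fft2(kernel, s=(n, n), axes=(2, 3))
    per_freq = transforms.transpose(2, 3, 0, 1)          # (n, n, c_out, c_in)
    return np.linalg.svd(per_freq, compute_uv=False).ravel()

svals = conv_singular_values(np.random.randn(8, 3, 3, 3), n=32)
print(svals.max(), svals.min())   # clipping or normalizing these controls the layer's spectrum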
arXiv Detail & Related papers (2022-11-24T19:09:44Z) - Benefits of Overparameterized Convolutional Residual Networks: Function
Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness.
Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z) - Efficient Micro-Structured Weight Unification and Pruning for Neural
Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially on resource-limited devices.
Previous unstructured or structured weight pruning methods can hardly deliver true inference acceleration.
We propose a generalized weight unification framework at a hardware compatible micro-structured level to achieve high amount of compression and acceleration.
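The hardware-compatible unification scheme itself is not detailed in this summary; as a generic sketch of micro-structured pruning, the code below zeroes out the lowest-norm fixed-size tiles of a weight matrix (tile size and keep ratio are assumptions).

import numpy as np

def block_prune(W, block=4, keep_ratio=0.5):
    # Zero out the lowest-norm (block x block) tiles of W, keeping `keep_ratio` of them.
    rows, cols = W.shape[0] // block, W.shape[1] // block
    tiles = W[:rows * block, :cols * block].reshape(rows, block, cols, block)
    norms = np.linalg.norm(tiles, axis=(1, 3))            # one score per tile
    mask = (norms >= np.quantile(norms, 1.0 - keep_ratio))[:, None, :, None]
    return (tiles * mask).reshape(rows * block, cols * block)

W = np.random.randn(64, 64)
print(np.mean(block_prune(W) == 0.0))   # roughly half the entries are zeroed, tile by tile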
arXiv Detail & Related papers (2021-06-15T17:22:59Z) - Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks offer improved accuracy and a significant reduction in memory consumption, but they can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
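To make the object concrete, here is a minimal sketch of an implicit layer whose output is a fixed point z = tanh(Wz + Ux + b), solved by simple iteration; well-posedness is obtained here by crudely rescaling W so the map is a Euclidean contraction, whereas the paper's non-Euclidean contraction conditions are more refined. All sizes are assumptions.

import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid = 8, 32
W = rng.standard_normal((d_hid, d_hid))
W *= 0.9 / np.linalg.norm(W, 2)          # crude well-posedness: spectral norm of W below 1
U = rng.standard_normal((d_hid, d_in))
b = np.zeros(d_hid)

def implicit_layer(x, tol=1e-8, max_iter=500):
    # Output is the fixed point z* of z = tanh(W z + U x + b), found by iteration;
    # tanh is 1-Lipschitz, so the contraction on W guarantees convergence.
    z = np.zeros(d_hid)
    for _ in range(max_iter):
        z_new = np.tanh(W @ z + U @ x + b)
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return z

print(implicit_layer(rng.standard_normal(d_in))[:4])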
arXiv Detail & Related papers (2021-06-06T18:05:02Z) - A Deeper Look into Convolutions via Pruning [9.89901717499058]
Modern architectures contain a very small number of fully-connected layers, often at the end, after multiple layers of convolutions.
Although this strategy already reduces the number of parameters, most of the convolutions can be eliminated as well, without suffering any loss in recognition performance.
In this work, we use the matrix characteristics based on eigenvalues in addition to the classical weight-based importance assignment approach for pruning to shed light on the internal mechanisms of a widely used family of CNNs.
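The paper's exact eigenvalue-based criterion is not reproduced here; the sketch below only contrasts the classical weight-magnitude score with a simple spectral score (the largest singular value of each flattened filter), the kind of matrix characteristic such analyses consider. Shapes are assumptions.

import numpy as np

def filter_scores(kernel):
    # kernel: (c_out, c_in, kh, kw). Return per-filter L1 scores and the
    # dominant singular value of each filter's flattened (c_in, kh*kw) matrix.
    c_out = kernel.shape[0]
    l1 = np.abs(kernel).reshape(c_out, -1).sum(axis=1)
    mats = kernel.reshape(c_out, kernel.shape[1], -1)
    spectral = np.linalg.svd(mats, compute_uv=False)[:, 0]
    return l1, spectral

l1, spec = filter_scores(np.random.randn(64, 32, 3, 3))
print(np.argsort(spec)[:5])   # filters with the smallest spectral score are pruning candidates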
arXiv Detail & Related papers (2021-02-04T18:55:03Z) - SlimConv: Reducing Channel Redundancy in Convolutional Neural Networks
by Weights Flipping [43.37989928043927]
We design a novel Slim Convolution (SlimConv) module to boost the performance of CNNs by reducing channel redundancies.
SlimConv consists of three main steps: Reconstruct, Transform, and Fuse, through which the features are split and reorganized in a more efficient way.
We validate the effectiveness of SlimConv through comprehensive experiments on the ImageNet, MS COCO2014, Pascal VOC2012 segmentation, and Pascal VOC2007 detection datasets.
arXiv Detail & Related papers (2020-03-16T23:23:10Z) - Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely low computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in the original full-precision networks to high-dimensional quantization features.
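For context on what a QNN quantizes, the sketch below shows XNOR-Net-style weight binarization, approximating each filter by a 1-bit sign pattern times a per-filter scale; the widening and feature-projection technique proposed in this paper is not reproduced here.

import numpy as np

def binarize_weights(kernel):
    # Per-output-channel binarization: W_c is approximated by alpha_c * sign(W_c),
    # with alpha_c = mean(|W_c|), so the weights need only one bit each plus one scale.
    c_out = kernel.shape[0]
    flat = kernel.reshape(c_out, -1)
    alpha = np.abs(flat).mean(axis=1)
    binary = np.sign(flat).reshape(kernel.shape)
    return alpha, binary

alpha, binW = binarize_weights(np.random.randn(16, 8, 3, 3))
approx = alpha[:, None, None, None] * binW   # dequantized approximation of the original weights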
arXiv Detail & Related papers (2020-02-03T04:11:13Z)