ThriftyNets : Convolutional Neural Networks with Tiny Parameter Budget
- URL: http://arxiv.org/abs/2007.10106v1
- Date: Mon, 20 Jul 2020 13:50:51 GMT
- Title: ThriftyNets : Convolutional Neural Networks with Tiny Parameter Budget
- Authors: Guillaume Coiffier, Ghouthi Boukli Hacene, Vincent Gripon
- Abstract summary: We propose a new convolutional neural network architecture, called ThriftyNet.
In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization.
We achieve competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with less than 40K parameters in total, and 74.3% on CIFAR-100 with less than 600K parameters.
- Score: 4.438259529250529
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Typical deep convolutional architectures present an increasing
number of feature maps as we go deeper in the network, whereas the spatial
resolution of the inputs is decreased through downsampling operations. This
means that most of the parameters lie in the final layers, while a large
portion of the computations is performed by a small fraction of the total
parameters in the first layers. In an effort to use every parameter of a
network to its fullest, we propose a new convolutional neural network
architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer
is defined and used recursively, leading to a maximal parameter
factorization. In addition, normalization, non-linearities, downsampling
operations and shortcut connections ensure sufficient expressivity of the
model. ThriftyNet achieves competitive performance on a tiny parameter
budget, exceeding 91% accuracy on CIFAR-10 with less than 40K parameters in
total, and 74.3% on CIFAR-100 with less than 600K parameters.
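As a concrete illustration of the recursion described in the abstract, the following is a minimal PyTorch sketch of a ThriftyNet-style model: a single shared convolution applied repeatedly, interleaved with per-iteration normalization, a non-linearity, a residual shortcut and periodic downsampling. The channel count, iteration count and downsampling schedule below are illustrative assumptions, not the exact configuration reported in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ThriftyBlock(nn.Module):
    """Toy ThriftyNet-style model: one convolution defined once and reused.

    The channel count, iteration count and downsampling positions are
    illustrative assumptions, not the paper's exact hyperparameters.
    """

    def __init__(self, channels=64, n_iter=12, n_classes=10, downsample_every=4):
        super().__init__()
        self.embed = nn.Conv2d(3, channels, kernel_size=1)        # lift RGB input to `channels`
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)   # the single, shared convolution
        # one BatchNorm per iteration keeps statistics separate while adding few parameters
        self.norms = nn.ModuleList([nn.BatchNorm2d(channels) for _ in range(n_iter)])
        self.head = nn.Linear(channels, n_classes)
        self.n_iter = n_iter
        self.downsample_every = downsample_every

    def forward(self, x):
        h = self.embed(x)
        for t in range(self.n_iter):
            # same convolution weights at every step, plus a residual shortcut
            h = h + F.relu(self.norms[t](self.conv(h)))
            if (t + 1) % self.downsample_every == 0:
                h = F.max_pool2d(h, 2)   # spatial downsampling between groups of iterations
        h = F.adaptive_avg_pool2d(h, 1).flatten(1)
        return self.head(h)

if __name__ == "__main__":
    model = ThriftyBlock()
    n_params = sum(p.numel() for p in model.parameters())
    print(f"total parameters: {n_params}")          # dominated by the single 3x3, 64-channel conv
    print(model(torch.randn(2, 3, 32, 32)).shape)   # -> torch.Size([2, 10])
```

Because the same 3x3 kernel bank is contracted at every iteration, nearly all of the parameter budget sits in that one convolution, which is the "maximal parameter factorization" the abstract refers to.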
Related papers
- Towards Generalized Entropic Sparsification for Convolutional Neural Networks [0.0]
Convolutional neural networks (CNNs) are reported to be overparametrized.
Here, we introduce a layer-by-layer, data-driven pruning method based on a mathematical idea aiming at a computationally scalable entropic relaxation of the pruning problem.
The sparse subnetwork is found from the pre-trained (full) CNN using the network entropy minimization as a sparsity constraint.
arXiv Detail & Related papers (2024-04-06T21:33:39Z)
- Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that rankings of networks in benchmarks can be easily changed by training the networks better.
arXiv Detail & Related papers (2024-02-27T11:52:49Z)
- Compressing neural network by tensor network with exponentially fewer variational parameters [4.373746415510521]
Neural networks (NNs) designed for challenging machine learning tasks contain massive numbers of variational parameters.
We propose a general compression scheme that significantly reduces the variational parameters of an NN by encoding them into a deep automatically-differentiable tensor network (ADTN).
Our work suggests tensor networks (TNs) as an exceptionally efficient mathematical structure for representing the variational parameters of NNs (a toy factorization sketch is given after this list).
arXiv Detail & Related papers (2023-05-10T11:24:27Z)
- Frequency Regularization: Restricting Information Redundancy of Convolutional Neural Networks [6.387263468033964]
Convolutional neural networks have demonstrated impressive results in many computer vision tasks.
The increasing size of these networks raises concerns about the information overload resulting from the large number of network parameters.
We propose Frequency Regularization to restrict the non-zero elements of the network parameters in the frequency domain (a toy frequency-masking sketch is given after this list).
arXiv Detail & Related papers (2023-04-17T03:32:29Z)
- A Directed-Evolution Method for Sparsification and Compression of Neural Networks with Application to Object Identification and Segmentation and considerations of optimal quantization using small number of bits [0.0]
This work introduces the Directed-Evolution method for the sparsification of neural networks.
The relevance of each parameter to the network's accuracy is assessed directly.
The parameters that produce the least effect on accuracy when tentatively zeroed are then permanently zeroed (a toy pruning sketch is given after this list).
arXiv Detail & Related papers (2022-06-12T23:49:08Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- NeuralScale: Efficient Scaling of Neurons for Resource-Constrained Deep Neural Networks [16.518667634574026]
We search for the neuron (filter) configuration of a fixed network architecture that maximizes accuracy.
We parameterize the change of the neuron (filter) number of each layer with respect to the change in parameters, allowing us to efficiently scale an architecture across arbitrary sizes.
arXiv Detail & Related papers (2020-06-23T08:14:02Z)
- Neural Parameter Allocation Search [57.190693718951316]
Training neural networks requires increasing amounts of memory.
Existing methods assume networks have many identical layers and utilize hand-crafted sharing strategies that fail to generalize.
We introduce Neural Parameter Allocation Search (NPAS), a novel task where the goal is to train a neural network given an arbitrary, fixed parameter budget.
NPAS covers both low-budget regimes, which produce compact networks, and a novel high-budget regime, where additional capacity can be added to boost performance without increasing inference FLOPs.
arXiv Detail & Related papers (2020-06-18T15:01:00Z)
- When Residual Learning Meets Dense Aggregation: Rethinking the Aggregation of Deep Neural Networks [57.0502745301132]
We propose Micro-Dense Nets, a novel architecture with global residual learning and local micro-dense aggregations.
Our micro-dense block can be integrated with neural architecture search based models to boost their performance.
arXiv Detail & Related papers (2020-04-19T08:34:52Z)
- Highly Efficient Salient Object Detection with 100K Parameters [137.74898755102387]
We propose a flexible convolutional module, namely generalized OctConv (gOctConv), to efficiently utilize both in-stage and cross-stages multi-scale features.
We build an extremely lightweight model, namely CSNet, which achieves performance comparable to that of large models with only about 0.2% of their parameters (100K) on popular salient object detection benchmarks.
arXiv Detail & Related papers (2020-03-12T07:00:46Z)
- Filter Sketch for Network Pruning [184.41079868885265]
We propose a novel network pruning approach that preserves the information of pre-trained network weights (filters).
Our approach, referred to as FilterSketch, encodes the second-order information of the pre-trained weights.
Experiments on CIFAR-10 show that FilterSketch reduces 63.3% of FLOPs and prunes 59.9% of network parameters with negligible accuracy cost.
arXiv Detail & Related papers (2020-01-23T13:57:08Z)
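For the tensor-network compression entry above, the following is a toy PyTorch sketch of the general idea of encoding a layer's weights in a contraction of small cores rather than storing them densely. The two-core tensor-train layout, the 256-unit layer size and the bond dimension `rank` are assumptions made for illustration; this is not the ADTN construction of the cited paper.

```python
import torch
import torch.nn as nn

class TTLinear(nn.Module):
    """Toy two-core tensor-train factorization of a 256x256 weight matrix.

    A generic illustration of encoding weights in a tensor network, not the
    ADTN scheme of the cited paper; shapes and the bond dimension (rank)
    are arbitrary assumptions.
    """

    def __init__(self, rank=8):
        super().__init__()
        # a dense 256x256 matrix would need 65,536 parameters;
        # the two cores below need 16*16*rank + rank*16*16 = 512*rank.
        self.core1 = nn.Parameter(torch.randn(16, 16, rank) * 0.1)
        self.core2 = nn.Parameter(torch.randn(rank, 16, 16) * 0.1)
        self.bias = nn.Parameter(torch.zeros(256))

    def weight(self):
        # contract the bond index r, then fold (i,k) into rows and (j,l) into columns
        w4 = torch.einsum('ijr,rkl->ikjl', self.core1, self.core2)
        return w4.reshape(256, 256)

    def forward(self, x):
        return x @ self.weight().t() + self.bias

if __name__ == "__main__":
    layer = TTLinear(rank=8)
    n = sum(p.numel() for p in layer.parameters())
    print(f"factorized parameters: {n} (a dense layer would use {256 * 256 + 256})")
    print(layer(torch.randn(4, 256)).shape)  # -> torch.Size([4, 256])
```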
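For the Frequency Regularization entry above, the sketch below illustrates, under simplifying assumptions, what restricting the non-zero elements of a parameter tensor in the frequency domain can look like: only a small low-frequency block of the 2D spectrum is kept and the spatial weights are rebuilt with an inverse FFT. The transform choice, tensor shape and kept block size are arbitrary here, and the cited paper's actual scheme may differ.

```python
import torch

def frequency_restricted(weight, keep_rows=8, keep_cols=16):
    """Toy frequency-domain restriction of a weight tensor.

    Keeps a small low-frequency block of the 2D spectrum non-zero and
    reconstructs spatial weights with an inverse FFT; a generic illustration,
    not the cited method, with an arbitrarily chosen block size.
    """
    w2d = weight.reshape(weight.shape[0], -1)          # (out_ch, in_ch * k * k)
    spectrum = torch.fft.rfft2(w2d)                    # complex 2D spectrum
    mask = torch.zeros_like(spectrum)
    mask[:keep_rows, :keep_cols] = 1                   # retain only a low-frequency block
    kept = int(mask.real.sum().item())                 # number of non-zero coefficients
    restricted = torch.fft.irfft2(spectrum * mask, s=w2d.shape)
    return restricted.reshape(weight.shape), kept

if __name__ == "__main__":
    conv_weight = torch.randn(64, 64, 3, 3)            # a typical 3x3 conv kernel bank
    _, kept = frequency_restricted(conv_weight)
    print(f"non-zero frequency coefficients: {kept} of {conv_weight.numel()} dense weights")
```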
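For the Directed-Evolution entry above, this last sketch captures the stated idea of tentatively zeroing parameters and permanently zeroing those whose removal affects accuracy the least. Working at whole-filter granularity, scoring on a single random batch, and the helper name `prune_least_sensitive_filters` are all simplifying assumptions for illustration, not the cited paper's full algorithm.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def prune_least_sensitive_filters(model, conv, x_val, y_val, n_prune=4):
    """Tentatively zero each output filter of `conv`, measure the loss
    increase on one validation batch, then permanently zero the `n_prune`
    filters whose removal hurts the least (a toy illustration)."""
    with torch.no_grad():
        base_loss = F.cross_entropy(model(x_val), y_val).item()
        scores = []
        for f in range(conv.weight.shape[0]):
            saved = conv.weight[f].clone()
            conv.weight[f].zero_()                      # tentative zeroing
            loss = F.cross_entropy(model(x_val), y_val).item()
            scores.append(loss - base_loss)             # sensitivity of this filter
            conv.weight[f].copy_(saved)                 # restore
        # permanently zero the filters with the smallest loss increase
        for f in sorted(range(len(scores)), key=scores.__getitem__)[:n_prune]:
            conv.weight[f].zero_()
    return scores

if __name__ == "__main__":
    # hypothetical tiny model and a random "validation" batch, for demonstration only
    model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
    x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
    prune_least_sensitive_filters(model, model[0], x, y)
```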