Finding the Optimal Network Depth in Classification Tasks
- URL: http://arxiv.org/abs/2004.08172v1
- Date: Fri, 17 Apr 2020 11:08:45 GMT
- Title: Finding the Optimal Network Depth in Classification Tasks
- Authors: Bartosz Wójcik, Maciej Wołczyk, Klaudia Bałazy, Jacek Tabor
- Abstract summary: We develop a fast end-to-end method for training lightweight neural networks using multiple classifier heads.
By allowing the model to determine the importance of each head, we are able to detect and remove unneeded components of the network.
- Score: 10.248235276871258
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a fast end-to-end method for training lightweight neural networks
using multiple classifier heads. By allowing the model to determine the
importance of each head and rewarding the choice of a single shallow
classifier, we are able to detect and remove unneeded components of the
network. This operation, which can be seen as finding the optimal depth of the
model, significantly reduces the number of parameters and accelerates inference
across different hardware processing units, which is not the case for many
standard pruning methods. We show the performance of our method on multiple
network architectures and datasets, analyze its optimization properties, and
conduct ablation studies.
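The mechanism is simple enough to sketch. Below is a minimal, hedged PyTorch illustration assuming a plain MLP backbone with one classifier head per block; the softmax head-importance weights and the linear depth penalty are illustrative assumptions, not the authors' exact objective.

```python
# Minimal sketch: depth selection via weighted classifier heads.
# The softmax importance weights and linear depth penalty are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadNet(nn.Module):
    def __init__(self, in_dim=784, hidden=256, n_blocks=6, n_classes=10):
        super().__init__()
        self.stem = nn.Linear(in_dim, hidden)
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
            for _ in range(n_blocks))
        # One classifier head attached after every block.
        self.heads = nn.ModuleList(
            nn.Linear(hidden, n_classes) for _ in range(n_blocks))
        # Learnable logits whose softmax gives each head's importance.
        self.head_logits = nn.Parameter(torch.zeros(n_blocks))

    def forward(self, x):
        h = F.relu(self.stem(x))
        outputs = []
        for block, head in zip(self.blocks, self.heads):
            h = block(h)
            outputs.append(head(h))
        return outputs

def depth_loss(model, outputs, y, depth_penalty=0.01):
    w = F.softmax(model.head_logits, dim=0)  # head importance weights
    ce = torch.stack([F.cross_entropy(o, y) for o in outputs])
    depths = torch.arange(1, len(outputs) + 1, dtype=ce.dtype)
    # Weighted task loss plus a reward for concentrating on a shallow head.
    return (w * ce).sum() + depth_penalty * (w * depths).sum()
```

In this reading, after training one would keep only the blocks up to the dominant head (roughly `model.head_logits.argmax() + 1`) and discard the rest, which is what shrinks the parameter count and speeds up inference on any hardware.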
Related papers
- Optimizing Sensor Network Design for Multiple Coverage [0.9668407688201359]
We introduce a new objective function for the greedy (next-best-view) algorithm to design efficient and robust sensor networks.
We also introduce a Deep Learning model to accelerate the algorithm for near real-time computations.
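As a loose illustration of the greedy (next-best-view) loop, here is a hedged sketch of multiple-coverage sensor selection; the marginal-gain objective and the names (`covers`, `budget`, `k`) are generic stand-ins, not the paper's new objective function.

```python
# Hedged sketch: greedy sensor selection for multiple (k-fold) coverage.
# The marginal-gain objective here is a generic stand-in, not the paper's.
def greedy_k_coverage(candidates, targets, covers, k=2, budget=5):
    """covers(sensor, target) -> bool; select up to `budget` sensors."""
    chosen = []
    counts = {t: 0 for t in targets}
    for _ in range(budget):
        remaining = [s for s in candidates if s not in chosen]
        if not remaining:
            break
        # Marginal gain: newly covered (target, fold) pairs, capped at k folds.
        gain = lambda s: sum(1 for t in targets
                             if covers(s, t) and counts[t] < k)
        best = max(remaining, key=gain)
        if gain(best) == 0:
            break  # no candidate adds coverage; stop early
        chosen.append(best)
        for t in targets:
            if covers(best, t):
                counts[t] += 1
    return chosen
```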
arXiv Detail & Related papers (2024-05-15T05:13:20Z)
- Fast and Scalable Network Slicing by Integrating Deep Learning with Lagrangian Methods [8.72339110741777]
Network slicing is a key technique in 5G and beyond for efficiently supporting diverse services.
Deep learning models suffer from limited generalization and adaptability to dynamic slicing configurations.
We propose a novel framework that integrates constrained optimization methods and deep learning models.
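The constrained-optimization side of such a hybrid can be hinted at with a generic Lagrangian-relaxation loop; the rate-allocation setup and subgradient price update below are standard textbook choices, not the paper's framework.

```python
# Hedged sketch: Lagrangian dual ascent for a capacity-constrained
# rate allocation (a generic stand-in for a slicing subproblem).
def allocate(weights, capacity, steps=200, lr=0.05):
    lam = 1.0  # dual variable: price on the shared capacity
    for _ in range(steps):
        # Primal step: maximize w_i*log(x_i) - lam*x_i  =>  x_i = w_i / lam
        x = [w / lam for w in weights]
        # Dual subgradient step on the constraint sum(x) <= capacity
        lam = max(1e-6, lam + lr * (sum(x) - capacity))
    return x

rates = allocate(weights=[1.0, 2.0, 3.0], capacity=10.0)
# sum(rates) approaches the capacity as lam converges
```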
arXiv Detail & Related papers (2024-01-22T07:19:16Z)
- Sensitivity-Aware Mixed-Precision Quantization and Width Optimization of Deep Neural Networks Through Cluster-Based Tree-Structured Parzen Estimation [4.748931281307333]
We introduce an innovative search mechanism for automatically selecting the best bit-width and layer-width for individual neural network layers.
This leads to a marked enhancement in deep neural network efficiency.
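A hedged sketch of such a search, using Optuna's TPE sampler to pick per-layer bit-widths and widths: the `proxy_score` below is a hypothetical placeholder for measuring a quantized model, and the paper's cluster-based refinement is not reproduced.

```python
# Hedged sketch: TPE search over per-layer bit-widths and layer widths.
import optuna

N_LAYERS = 4

def proxy_score(bits, widths):
    # Placeholder objective: prefer cheaper configs, penalize very low bits.
    return -sum(b * w for b, w in zip(bits, widths)) \
           - 100 * sum(1 for b in bits if b < 4)

def objective(trial):
    bits = [trial.suggest_categorical(f"bits_{i}", [2, 4, 8])
            for i in range(N_LAYERS)]
    widths = [trial.suggest_categorical(f"width_{i}", [64, 128, 256])
              for i in range(N_LAYERS)]
    return proxy_score(bits, widths)

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=50)
print(study.best_params)
```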
arXiv Detail & Related papers (2023-08-12T00:16:51Z)
- Dynamic Neural Network for Multi-Task Learning Searching across Diverse Network Topologies [14.574399133024594]
We present a new MTL framework that searches for optimized structures for multiple tasks with diverse graph topologies.
We design a restricted DAG-based central network with read-in/read-out layers to build topologically diverse task-adaptive structures.
arXiv Detail & Related papers (2023-03-13T05:01:50Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
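The first step, turning checkpoints into a dataset of parameter vectors, can be sketched with PyTorch's parameter flattening; `make_model` and `train_step` are hypothetical callables, and the snapshot schedule is an assumption.

```python
# Hedged sketch: build a dataset of flattened checkpoint parameter vectors.
import torch
from torch.nn.utils import parameters_to_vector

def collect_checkpoints(make_model, train_step, n_runs=8, n_steps=100):
    vectors = []
    for _ in range(n_runs):
        model = make_model()
        opt = torch.optim.SGD(model.parameters(), lr=0.1)
        for step in range(n_steps):
            train_step(model, opt)
            if step % 10 == 0:  # snapshot every 10 steps
                vectors.append(
                    parameters_to_vector(model.parameters()).detach())
    return torch.stack(vectors)  # (num_checkpoints, num_parameters)
```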
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
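As a loose illustration of Fisher-style embeddings, the sketch below ranks unlabeled points by the norm of their last-layer gradient embeddings; this norm heuristic is a simplification, not BAIT's actual Fisher-based selection objective, and `model.features` / `model.classifier` are assumed attributes.

```python
# Hedged sketch: rank pool points by last-layer gradient-embedding norm.
import torch
import torch.nn.functional as F

def select_batch(model, pool_x, k=32):
    model.eval()
    with torch.no_grad():
        h = model.features(pool_x)            # penultimate activations (assumed API)
        p = F.softmax(model.classifier(h), dim=1)
        onehot = F.one_hot(p.argmax(dim=1), p.size(1)).float()
        # The norm of (p - onehot) outer h factorizes into a product of norms.
        scores = (p - onehot).norm(dim=1) * h.norm(dim=1)
    return scores.topk(k).indices             # indices to query for labels
```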
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Manifold Regularized Dynamic Network Pruning [102.24146031250034]
This paper proposes a new paradigm that dynamically removes redundant filters by embedding the manifold information of all instances into the space of pruned networks.
The effectiveness of the proposed method is verified on several benchmarks, which shows better performance in terms of both accuracy and computational cost.
arXiv Detail & Related papers (2021-03-10T03:59:03Z)
- Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
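The "line" case can be sketched as a layer that holds two endpoint weight sets and trains at random interpolation points; this is a minimal assumed version for a single linear layer, omitting any endpoint-diversity regularization.

```python
# Hedged sketch: a linear layer parameterizing a line between two weight sets.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LineLinear(nn.Module):
    def __init__(self, in_f, out_f):
        super().__init__()
        self.w0 = nn.Parameter(torch.randn(out_f, in_f) * 0.02)
        self.w1 = nn.Parameter(torch.randn(out_f, in_f) * 0.02)
        self.b0 = nn.Parameter(torch.zeros(out_f))
        self.b1 = nn.Parameter(torch.zeros(out_f))

    def forward(self, x, t):
        # Weights at point t on the line between the two endpoints.
        w = (1 - t) * self.w0 + t * self.w1
        b = (1 - t) * self.b0 + t * self.b1
        return F.linear(x, w, b)

# Each step samples t ~ U(0, 1) so the whole line gets trained, e.g.:
# t = torch.rand(()).item(); loss = criterion(layer(x, t), y)
```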
arXiv Detail & Related papers (2021-02-20T23:26:58Z)
- Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian Processes [12.798516310559375]
We show that with significantly lower precision in the last layers we achieve a minimal loss of accuracy with appreciable memory savings.
We test our findings on the CIFAR10 and ImageNet datasets using the VGG, ResNet and GoogLeNet architectures.
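The core observation can be illustrated with simple per-layer fake quantization that gives later layers fewer bits; the symmetric uniform quantizer and the example bit assignment are illustrative assumptions, not the paper's Gaussian-process scheme.

```python
# Hedged sketch: symmetric uniform fake quantization with per-layer bits.
import torch

def fake_quantize(w, bits):
    # Symmetric uniform quantizer over [-max|w|, max|w|].
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

def quantize_model(model, bits_per_param):
    # e.g. bits_per_param = [8, 8, 6, 4]: lower precision toward the end.
    with torch.no_grad():
        for p, bits in zip(model.parameters(), bits_per_param):
            p.copy_(fake_quantize(p, bits))
```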
arXiv Detail & Related papers (2020-07-15T15:16:18Z)
- DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a 75.1% top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
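A minimal assumed version of such a performance predictor is a two-layer GCN over an architecture's adjacency matrix, mean-pooled into a scalar score; the layer sizes and the symmetric normalization are generic choices, not the paper's exact model.

```python
# Hedged sketch: a tiny GCN that regresses a sub-network's performance
# from its architecture graph (node features x, adjacency adj).
import torch
import torch.nn as nn

class GCNPredictor(nn.Module):
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)
        self.lin2 = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x, adj):
        # Symmetrically normalized adjacency with self-loops:
        # A_hat = D^{-1/2} (A + I) D^{-1/2}
        a = adj + torch.eye(adj.size(0))
        d = a.sum(dim=1).pow(-0.5)
        a_hat = d.unsqueeze(1) * a * d.unsqueeze(0)
        h = torch.relu(self.lin1(a_hat @ x))
        h = torch.relu(self.lin2(a_hat @ h))
        return self.out(h.mean(dim=0))  # graph-level performance estimate
```

Trained against measured accuracies of sampled sub-networks, a predictor like this can then rank unseen candidates, which is consistent with the rank-correlation result described above.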
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.