Designing Network Design Spaces
- URL: http://arxiv.org/abs/2003.13678v1
- Date: Mon, 30 Mar 2020 17:57:47 GMT
- Title: Designing Network Design Spaces
- Authors: Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He,
Piotr Dollár
- Abstract summary: Instead of focusing on designing individual network instances, we design network design spaces that parametrize populations of networks.
Using our methodology we explore the structure aspect of network design and arrive at a low-dimensional design space consisting of simple, regular networks.
We analyze the RegNet design space and arrive at interesting findings that do not match the current practice of network design.
- Score: 33.616649851247416
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we present a new network design paradigm. Our goal is to help
advance the understanding of network design and discover design principles that
generalize across settings. Instead of focusing on designing individual network
instances, we design network design spaces that parametrize populations of
networks. The overall process is analogous to classic manual design of
networks, but elevated to the design space level. Using our methodology we
explore the structure aspect of network design and arrive at a low-dimensional
design space consisting of simple, regular networks that we call RegNet. The
core insight of the RegNet parametrization is surprisingly simple: widths and
depths of good networks can be explained by a quantized linear function. We
analyze the RegNet design space and arrive at interesting findings that do not
match the current practice of network design. The RegNet design space provides
simple and fast networks that work well across a wide range of flop regimes.
Under comparable training settings and flops, the RegNet models outperform the
popular EfficientNet models while being up to 5x faster on GPUs.
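The abstract's core claim, that the widths and depths of good networks follow a quantized linear function, can be made concrete with a short sketch. The code below follows the parametrization described in the paper: an intercept w_0, slope w_a, and multiplier w_m generate a continuous width per block, which is snapped to the nearest power of w_m and rounded to a multiple of 8. The specific parameter values in the example are illustrative, not taken from a published RegNet model.

```python
import numpy as np

def regnet_widths(w_0, w_a, w_m, depth, q=8):
    """Per-block widths from the quantized linear rule u_j = w_0 + w_a * j."""
    u = w_0 + w_a * np.arange(depth)             # continuous width per block
    s = np.round(np.log(u / w_0) / np.log(w_m))  # quantized stage exponent
    w = w_0 * np.power(w_m, s)                   # piecewise-constant widths
    return (np.round(w / q) * q).astype(int).tolist()  # multiples of q

# Illustrative parameters (not a model from the paper):
print(regnet_widths(w_0=48, w_a=36.44, w_m=2.49, depth=13))
```

Blocks that share a quantized width form a stage, so this rule yields the simple, regular stage structure the abstract refers to.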
Related papers
- GeNet: A Multimodal LLM-Based Co-Pilot for Network Topology and Configuration [21.224554993149184]
GeNet is a novel framework that leverages a large language model (LLM) to streamline network design.
It uses visual and textual modalities to interpret and update network topologies and device configurations based on user intents.
arXiv Detail & Related papers (2024-07-11T07:51:57Z)
- Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can easily change when the networks are trained with better, architecture-aware hyperparameters.
arXiv Detail & Related papers (2024-02-27T11:52:49Z)
- Designing Network Design Strategies Through Gradient Path Analysis [12.90962626557934]
This paper proposes a new network design strategy, i.e., to design the network architecture based on gradient path analysis.
We propose gradient path design strategies at the layer, stage, and network levels.
arXiv Detail & Related papers (2022-11-09T10:51:57Z)
- Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks [50.684661759340145]
Firefly neural architecture descent is a general framework for progressively and dynamically growing neural networks.
We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures.
In particular, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
arXiv Detail & Related papers (2021-02-17T04:47:18Z)
- Scaling Wide Residual Networks for Panoptic Segmentation [29.303735643858026]
Wide Residual Networks (Wide-ResNets) are a shallow but wide variant of Residual Networks (ResNets).
We revisit its architecture design for the recent challenging panoptic segmentation task, which aims to unify semantic segmentation and instance segmentation.
We demonstrate that such a simple scaling scheme, coupled with grid search, identifies several SWideRNets that significantly advance the state of the art on panoptic segmentation datasets in both the fast and strong model regimes.
arXiv Detail & Related papers (2020-11-23T19:14:11Z)
- Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of routing every input through the same fixed path, DG-Net aggregates features dynamically at each node, giving the network greater representational capacity.
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
- Orthogonalized SGD and Nested Architectures for Anytime Neural Networks [30.598394152055338]
Orthogonalized SGD dynamically re-balances task-specific gradients when training a multitask network.
Experiments demonstrate that training with Orthogonalized SGD significantly improves the accuracy of anytime networks (a sketch of the gradient-orthogonalization idea appears after this list).
arXiv Detail & Related papers (2020-08-15T03:06:34Z)
- The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e., adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
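As referenced in the Orthogonalized SGD entry above, a minimal sketch of one way to re-balance task-specific gradients is Gram-Schmidt orthogonalization over flattened per-task gradients. This is an illustrative assumption about the mechanism, not the paper's exact formulation; the function name, priority ordering, and learning rate are hypothetical.

```python
import numpy as np

def orthogonalized_step(task_grads, lr=0.1):
    """Project each lower-priority gradient onto the orthogonal complement
    of the higher-priority ones, then sum into a single SGD update.
    `task_grads`: flat gradient vectors, highest priority first."""
    basis = []                               # accepted orthogonal components
    combined = np.zeros_like(task_grads[0], dtype=float)
    for g in task_grads:
        r = g.astype(float)
        for b in basis:                      # strip conflicting directions
            r = r - (r @ b) / (b @ b + 1e-12) * b
        basis.append(r)
        combined += r
    return -lr * combined                    # parameter update

# Toy usage: the second task's gradient conflicts with the first along x,
# so its x-component is removed before the gradients are combined.
g1 = np.array([1.0, 0.0])
g2 = np.array([-1.0, 1.0])
print(orthogonalized_step([g1, g2]))         # -> [-0.1, -0.1]
```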