Designing Network Design Spaces
- URL: http://arxiv.org/abs/2003.13678v1
- Date: Mon, 30 Mar 2020 17:57:47 GMT
- Title: Designing Network Design Spaces
- Authors: Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He,
Piotr Dollár
- Abstract summary: Instead of focusing on designing individual network instances, we design network design spaces that parametrize populations of networks.
Using our methodology we explore the structure aspect of network design and arrive at a low-dimensional design space consisting of simple, regular networks.
We analyze the RegNet design space and arrive at interesting findings that do not match the current practice of network design.
- Score: 33.616649851247416
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we present a new network design paradigm. Our goal is to help
advance the understanding of network design and discover design principles that
generalize across settings. Instead of focusing on designing individual network
instances, we design network design spaces that parametrize populations of
networks. The overall process is analogous to classic manual design of
networks, but elevated to the design space level. Using our methodology we
explore the structure aspect of network design and arrive at a low-dimensional
design space consisting of simple, regular networks that we call RegNet. The
core insight of the RegNet parametrization is surprisingly simple: widths and
depths of good networks can be explained by a quantized linear function. We
analyze the RegNet design space and arrive at interesting findings that do not
match the current practice of network design. The RegNet design space provides
simple and fast networks that work well across a wide range of flop regimes.
Under comparable training settings and flops, the RegNet models outperform the
popular EfficientNet models while being up to 5x faster on GPUs.
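The abstract's core claim, that the widths and depths of good networks follow a quantized linear function, can be made concrete with a short sketch. The code below follows the parametrization described in the paper: an intercept w_0, slope w_a, and multiplier w_m generate a continuous width per block, which is snapped to the nearest power of w_m and rounded to a multiple of 8. The specific parameter values in the example are illustrative, not taken from a published RegNet model.

```python
import numpy as np

def regnet_widths(w_0, w_a, w_m, depth, q=8):
    """Per-block widths from the quantized linear rule u_j = w_0 + w_a * j."""
    u = w_0 + w_a * np.arange(depth)             # continuous width per block
    s = np.round(np.log(u / w_0) / np.log(w_m))  # quantized stage exponent
    w = w_0 * np.power(w_m, s)                   # piecewise-constant widths
    return (np.round(w / q) * q).astype(int).tolist()  # multiples of q

# Illustrative parameters (not a model from the paper):
print(regnet_widths(w_0=48, w_a=36.44, w_m=2.49, depth=13))
```

Blocks that share a quantized width form a stage, so this rule yields the simple, regular stage structure the abstract refers to.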
Related papers
- GeNet: A Multimodal LLM-Based Co-Pilot for Network Topology and Configuration [21.224554993149184]
GeNet is a novel framework that leverages a large language model (LLM) to streamline network design.
It uses visual and textual modalities to interpret and update network topologies and device configurations based on user intents.
arXiv Detail & Related papers (2024-07-11T07:51:57Z)
- Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can easily change when the networks are trained with better, architecture-aware hyperparameters.
arXiv Detail & Related papers (2024-02-27T11:52:49Z)
- Designing Network Design Strategies Through Gradient Path Analysis [12.90962626557934]
This paper proposes a new network design strategy, i.e., to design the network architecture based on gradient path analysis.
We propose gradient path design strategies at the layer, stage, and network levels.
arXiv Detail & Related papers (2022-11-09T10:51:57Z)
- Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks [50.684661759340145]
Firefly neural architecture descent is a general framework for progressively and dynamically growing neural networks.
We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures.
In particular, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
arXiv Detail & Related papers (2021-02-17T04:47:18Z)
- Scaling Wide Residual Networks for Panoptic Segmentation [29.303735643858026]
Wide Residual Networks (Wide-ResNets) are a shallow but wide variant of Residual Networks (ResNets).
We revisit its architecture design for the recent challenging panoptic segmentation task, which aims to unify semantic segmentation and instance segmentation.
We demonstrate that such a simple scaling scheme, coupled with grid search, identifies several SWideRNets that significantly advance the state of the art on panoptic segmentation datasets in both the fast and strong model regimes.
arXiv Detail & Related papers (2020-11-23T19:14:11Z)
- Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of routing every input through the same fixed path, DG-Net aggregates features dynamically at each node, giving the network greater representational capacity.
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
- Orthogonalized SGD and Nested Architectures for Anytime Neural Networks [30.598394152055338]
Orthogonalized SGD dynamically re-balances task-specific gradients when training a multitask network.
Experiments demonstrate that training with Orthogonalized SGD significantly improves the accuracy of anytime networks (a sketch of the gradient-orthogonalization idea appears after this list).
arXiv Detail & Related papers (2020-08-15T03:06:34Z)
- The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e., adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)
- Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio [101.84651388520584]
This paper presents a new framework named network adjustment, which considers network accuracy as a function of FLOPs.
Experiments on standard image classification datasets and a wide range of base networks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-06T15:51:00Z)
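As referenced in the Orthogonalized SGD entry above, a minimal sketch of one way to re-balance task-specific gradients is Gram-Schmidt orthogonalization over flattened per-task gradients. This is an illustrative assumption about the mechanism, not the paper's exact formulation; the function name, priority ordering, and learning rate are hypothetical.

```python
import numpy as np

def orthogonalized_step(task_grads, lr=0.1):
    """Project each lower-priority gradient onto the orthogonal complement
    of the higher-priority ones, then sum into a single SGD update.
    `task_grads`: flat gradient vectors, highest priority first."""
    basis = []                               # accepted orthogonal components
    combined = np.zeros_like(task_grads[0], dtype=float)
    for g in task_grads:
        r = g.astype(float)
        for b in basis:                      # strip conflicting directions
            r = r - (r @ b) / (b @ b + 1e-12) * b
        basis.append(r)
        combined += r
    return -lr * combined                    # parameter update

# Toy usage: the second task's gradient conflicts with the first along x,
# so its x-component is removed before the gradients are combined.
g1 = np.array([1.0, 0.0])
g2 = np.array([-1.0, 1.0])
print(orthogonalized_step([g1, g2]))         # -> [-0.1, -0.1]
```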