Designing Network Design Strategies Through Gradient Path Analysis
- URL: http://arxiv.org/abs/2211.04800v1
- Date: Wed, 9 Nov 2022 10:51:57 GMT
- Title: Designing Network Design Strategies Through Gradient Path Analysis
- Authors: Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh
- Abstract summary: This paper proposes a new network design strategy, i.e., to design the network architecture based on gradient path analysis.
We propose gradient path design strategies at the layer level, the stage level, and the network level.
- Score: 12.90962626557934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Designing a high-efficiency and high-quality expressive network architecture
has always been the most important research topic in the field of deep
learning. Most of today's network design strategies focus on how to integrate
features extracted from different layers, and how to design computing units to
effectively extract these features, thereby enhancing the expressiveness of the
network. This paper proposes a new network design strategy, i.e., to design the
network architecture based on gradient path analysis. On the whole, most of
today's mainstream network design strategies are based on the feed-forward
path; that is, the network architecture is designed according to the data path.
In this paper, we hope to enhance the expressive ability of the trained model
by improving the network's learning ability. Since the mechanism driving
network parameter learning is the back-propagation algorithm, we design
network architectures based on the back-propagation path. We propose gradient
path design strategies at the layer level, the stage level, and the network
level, and show through theoretical analysis and experiments that these
strategies are superior and feasible.
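As a concrete illustration of the layer-level idea, the following is a minimal PyTorch sketch (our own illustrative construction, not the paper's exact module): the input channels are split so that one partition follows a short path and the other a deeper stack of convolutions, and the concatenated result is fused, giving different weight groups gradient paths of different lengths.

```python
import torch
import torch.nn as nn

class GradientPathBlock(nn.Module):
    """Split-transform-merge block: the two channel partitions reach the output
    through gradient paths of different lengths (illustrative construction only)."""
    def __init__(self, channels: int, depth: int = 2):
        super().__init__()
        assert channels % 2 == 0
        half = channels // 2
        # Deep path: a chain of 3x3 conv blocks (longer gradient path).
        self.deep_path = nn.Sequential(*[
            nn.Sequential(nn.Conv2d(half, half, 3, padding=1),
                          nn.BatchNorm2d(half), nn.SiLU())
            for _ in range(depth)
        ])
        # 1x1 conv fuses the concatenated short and deep paths.
        self.fuse = nn.Conv2d(channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        short, deep = x.chunk(2, dim=1)      # split channels into two paths
        deep = self.deep_path(deep)          # long gradient path
        return self.fuse(torch.cat([short, deep], dim=1))

x = torch.randn(1, 64, 32, 32)
print(GradientPathBlock(64)(x).shape)        # torch.Size([1, 64, 32, 32])
```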
Related papers
- Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can be easily changed simply by training the networks better.
arXiv Detail & Related papers (2024-02-27T11:52:49Z) - Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
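A minimal sketch of the general idea (assuming a PyTorch gradient hook; the mask below is a made-up example, not the scaling rule derived in the paper): the gradient of a convolutional kernel is rescaled per spatial position, shifting learning emphasis across the kernel without changing the forward computation.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(16, 16, kernel_size=3, padding=1)

# Made-up spatial mask over the 3x3 kernel: emphasize the centre tap.
mask = torch.tensor([[0.5, 0.5, 0.5],
                     [0.5, 2.0, 0.5],
                     [0.5, 0.5, 0.5]])

# The hook only rescales the gradient; the forward pass is unchanged.
conv.weight.register_hook(lambda grad: grad * mask)   # broadcasts over (out, in, 3, 3)

x = torch.randn(2, 16, 8, 8)
conv(x).sum().backward()
print(conv.weight.grad.shape)                          # torch.Size([16, 16, 3, 3])
```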
arXiv Detail & Related papers (2023-03-05T17:57:33Z) - Hysteretic Behavior Simulation Based on Pyramid Neural
Network: Principle, Network Architecture, Case Study and Explanation [0.0]
A surrogate model based on neural networks shows significant potential in balancing efficiency and accuracy.
However, its serial information flow and its reliance on single-level features for prediction adversely affect network performance.
A weighted stacked pyramid neural network architecture is proposed herein.
arXiv Detail & Related papers (2022-04-29T16:42:00Z) - Neural Architecture Search for Speech Emotion Recognition [72.1966266171951]
We propose to apply neural architecture search (NAS) techniques to automatically configure the SER models.
We show that NAS can improve SER performance (54.89% to 56.28%) while maintaining model parameter sizes.
arXiv Detail & Related papers (2022-03-31T10:16:10Z) - SIRe-Networks: Skip Connections over Interlaced Multi-Task Learning and
Residual Connections for Structure Preserving Object Classification [28.02302915971059]
In this paper, we introduce an interlaced multi-task learning strategy, named SIRe, to reduce the vanishing-gradient problem in the object classification task.
The presented methodology directly improves a convolutional neural network (CNN) by enforcing the input image structure preservation through auto-encoders.
To validate the presented methodology, a simple CNN and various implementations of famous networks are extended via the SIRe strategy and extensively tested on the CIFAR100 dataset.
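A hedged sketch of this idea (an assumed structure, not the authors' exact network): a small CNN classifier with an auxiliary decoder that reconstructs the input, so intermediate features must preserve image structure and the reconstruction loss contributes an additional gradient path.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SIReLikeCNN(nn.Module):
    def __init__(self, num_classes=100):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Auxiliary branch: reconstruct the input from intermediate features.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )
        self.classifier = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                        nn.Linear(64, num_classes))

    def forward(self, x):
        z = self.encoder(x)
        return self.classifier(z), self.decoder(z)

model = SIReLikeCNN()
x = torch.randn(4, 3, 32, 32)
logits, recon = model(x)
# Classification loss plus reconstruction loss (structure preservation).
loss = F.cross_entropy(logits, torch.randint(0, 100, (4,))) + F.mse_loss(recon, x)
loss.backward()
```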
arXiv Detail & Related papers (2021-10-06T13:54:49Z) - Analyze and Design Network Architectures by Recursion Formulas [4.085771561472743]
This work attempts to find an effective way to design new network architectures.
It is found that the main differences between network architectures can be reflected in their recursion formulas.
A case study is provided to generate an improved architecture based on ResNet.
Extensive experiments are conducted on CIFAR and ImageNet, demonstrating significant performance improvements.
arXiv Detail & Related papers (2021-08-18T06:53:30Z) - Dynamically Grown Generative Adversarial Networks [111.43128389995341]
We propose a method to dynamically grow a GAN during training, automatically optimizing the network architecture and its parameters together.
The method embeds architecture search techniques as an interleaving step with gradient-based training to periodically seek the optimal architecture-growing strategy for the generator and discriminator.
arXiv Detail & Related papers (2021-06-16T01:25:51Z) - Firefly Neural Architecture Descent: a General Approach for Growing
Neural Networks [50.684661759340145]
Firefly neural architecture descent is a general framework for progressively and dynamically growing neural networks.
We show that firefly descent can flexibly grow networks both wider and deeper, and can be applied to learn accurate but resource-efficient neural architectures.
In particular, it learns networks that are smaller in size but have higher average accuracy than those learned by the state-of-the-art methods.
arXiv Detail & Related papers (2021-02-17T04:47:18Z) - FactorizeNet: Progressive Depth Factorization for Efficient Network
Architecture Exploration Under Quantization Constraints [93.4221402881609]
We introduce a progressive depth factorization strategy for efficient CNN architecture exploration under quantization constraints.
By algorithmically increasing the granularity of depth factorization in a progressive manner, the proposed strategy enables a fine-grained, low-level analysis of layer-wise distributions.
Such a progressive depth factorization strategy also enables efficient identification of the optimal depth-factorized macroarchitecture design.
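An illustrative toy sketch of depth factorization at increasing granularity (assumed form: a 3x3 grouped convolution followed by a 1x1 pointwise convolution, with progressively more groups up to fully depthwise; the paper's layer-wise distribution analysis and quantization constraints are not reproduced):

```python
import torch
import torch.nn as nn

def factorized_conv(in_ch, out_ch, groups):
    """3x3 grouped conv followed by a 1x1 pointwise conv (depth factorization)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=groups),
        nn.Conv2d(in_ch, out_ch, 1),
    )

x = torch.randn(1, 64, 16, 16)
for groups in (1, 2, 4, 8, 64):            # progressively finer factorization
    block = factorized_conv(64, 128, groups)
    params = sum(p.numel() for p in block.parameters())
    print(f"groups={groups:>2}  params={params}")   # cost at each granularity
    assert block(x).shape == (1, 128, 16, 16)
```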
arXiv Detail & Related papers (2020-11-30T07:12:26Z) - Ring Reservoir Neural Networks for Graphs [15.07984894938396]
Reservoir Computing models can play an important role in developing fruitful graph embeddings.
Our core proposal is based on shaping the organization of the hidden neurons to follow a ring topology.
Experimental results on graph classification tasks indicate that ring-reservoir architectures enable particularly effective network configurations.
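A minimal NumPy sketch of a ring-topology reservoir (illustrative; the graph readout and training details of the paper are omitted): the recurrent weight matrix is a scaled cyclic permutation, so each hidden neuron connects only to its successor on the ring.

```python
import numpy as np

def ring_reservoir(inputs, n_hidden=100, ring_weight=0.9, in_scale=0.5, seed=0):
    rng = np.random.default_rng(seed)
    # Recurrent matrix with ring topology: neuron i feeds neuron (i+1) mod n.
    W = np.roll(np.eye(n_hidden), 1, axis=0) * ring_weight
    W_in = rng.uniform(-in_scale, in_scale, size=(n_hidden, inputs.shape[1]))
    h = np.zeros(n_hidden)
    states = []
    for u in inputs:                     # drive the reservoir with the input sequence
        h = np.tanh(W @ h + W_in @ u)
        states.append(h)
    return np.stack(states)              # hidden states, e.g. for a linear readout

states = ring_reservoir(np.random.randn(20, 3))
print(states.shape)                      # (20, 100)
```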
arXiv Detail & Related papers (2020-05-11T17:51:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.