Self-Reorganizing and Rejuvenating CNNs for Increasing Model Capacity Utilization
- URL: http://arxiv.org/abs/2102.06870v1
- Date: Sat, 13 Feb 2021 06:19:45 GMT
- Title: Self-Reorganizing and Rejuvenating CNNs for Increasing Model Capacity Utilization
- Authors: Wissam J. Baddar, Seungju Han, Seonmin Rhee, Jae-Joon Han
- Abstract summary: We propose a biologically inspired method for improving the computational resource utilization of neural networks.
The proposed method utilizes the channel activations of a convolution layer in order to reorganize that layer's parameters.
The rejuvenated parameters learn different features to supplement those learned by the reorganized surviving parameters.
- Score: 8.661269034961679
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose self-reorganizing and rejuvenating convolutional neural networks, a biologically inspired method for improving the computational resource utilization of neural networks. The proposed method utilizes the channel activations of a convolution layer in order to reorganize that layer's parameters. The reorganized parameters are clustered to avoid parameter redundancies. As such, redundant neurons with similar activations are merged, leaving room for the remaining parameters to rejuvenate. The rejuvenated parameters learn different features to supplement those learned by the reorganized surviving parameters. As a result, the network's capacity utilization increases, improving the baseline network performance without any changes to the network structure. The proposed method can be applied to various network architectures during the training stage, or applied to a pre-trained model to improve its performance. Experimental results showed that the proposed method is model-agnostic and can be applied to any backbone architecture, increasing its performance due to the elevated utilization of the network capacity.
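The abstract describes the mechanism only at a high level. Below is a minimal, illustrative PyTorch sketch of what clustering a layer's channel activations and rejuvenating the freed filters could look like. The function name `reorganize_and_rejuvenate`, the use of k-means, the choice of cluster representative, and the Kaiming re-initialization are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans


@torch.no_grad()
def reorganize_and_rejuvenate(conv: nn.Conv2d, activations: torch.Tensor, n_clusters: int):
    """Merge redundant output channels of `conv` and re-initialize the freed ones.

    conv        -- convolution layer to reorganize
    activations -- [N, C_out, H, W] outputs of `conv` on a probe batch
    n_clusters  -- assumed hyper-parameter: number of distinct channel groups to keep
    """
    c_out = conv.out_channels
    # Describe each output channel by its batch-averaged activation map.
    descriptors = activations.mean(dim=0).reshape(c_out, -1).cpu().numpy()

    # Group channels whose activation patterns are similar (k-means is an assumed choice).
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(descriptors)

    weight = conv.weight.data  # [C_out, C_in, kH, kW]
    for k in range(n_clusters):
        members = np.flatnonzero(labels == k).tolist()
        keeper = members[0]
        # Merge the cluster into a single representative filter ...
        weight[keeper] = weight[members].mean(dim=0)
        # ... and rejuvenate the now-redundant filters with a fresh initialization,
        # freeing them to learn features the surviving filters do not cover.
        for idx in members[1:]:
            nn.init.kaiming_normal_(weight[idx])
            if conv.bias is not None:
                conv.bias.data[idx] = 0.0


# Illustrative usage on a probe batch (shapes and cluster count are arbitrary):
conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
probe = torch.randn(32, 3, 32, 32)
with torch.no_grad():
    acts = conv(probe)
reorganize_and_rejuvenate(conv, acts, n_clusters=48)
```

In a full training pipeline, the input channels of the following layer and any batch-normalization statistics would also have to be handled consistently with the merged channels; the abstract does not spell out those details, so they are omitted from this sketch.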
Related papers
- Towards Efficient Deep Spiking Neural Networks Construction with Spiking Activity based Pruning [17.454100169491497]
We propose a structured pruning approach based on the activity levels of convolutional kernels, named the Spiking Channel Activity-based (SCA) network pruning framework.
Inspired by synaptic plasticity mechanisms, our method dynamically adjusts the network's structure by pruning and regenerating convolutional kernels during training, enhancing the model's adaptation to the current target task.
arXiv Detail & Related papers (2024-06-03T07:44:37Z)
- Power-Enhanced Residual Network for Function Approximation and Physics-Informed Inverse Problems [0.0]
This paper introduces a novel neural network structure called the Power-Enhancing residual network.
It improves the network's capabilities for approximating both smooth and non-smooth functions in 2D and 3D settings.
Results emphasize the exceptional accuracy of the proposed Power-Enhancing residual network, particularly for non-smooth functions.
arXiv Detail & Related papers (2023-10-24T10:01:15Z)
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
- A Generic Shared Attention Mechanism for Various Backbone Neural Networks [53.36677373145012]
Self-attention modules (SAMs) produce strongly correlated attention maps across different layers.
Dense-and-Implicit Attention (DIA) shares SAMs across layers and employs a long short-term memory module.
Our simple yet effective DIA can consistently enhance various network backbones.
arXiv Detail & Related papers (2022-10-27T13:24:08Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Clustering-Based Interpretation of Deep ReLU Network [17.234442722611803]
We recognize that the non-linear behavior of the ReLU function gives rise to a natural clustering.
We propose a method to increase the level of interpretability of a fully connected feedforward ReLU neural network.
arXiv Detail & Related papers (2021-10-13T09:24:11Z)
- Local Critic Training for Model-Parallel Learning of Deep Neural Networks [94.69202357137452]
We propose a novel model-parallel learning method, called local critic training.
We show that the proposed approach successfully decouples the update process of the layer groups for both convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
We also show that trained networks by the proposed method can be used for structural optimization.
arXiv Detail & Related papers (2021-02-03T09:30:45Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Neural Parameter Allocation Search [57.190693718951316]
Training neural networks requires increasing amounts of memory.
Existing methods assume networks have many identical layers and utilize hand-crafted sharing strategies that fail to generalize.
We introduce Neural Parameter Allocation Search (NPAS), a novel task where the goal is to train a neural network given an arbitrary, fixed parameter budget.
NPAS covers both low-budget regimes, which produce compact networks, and a novel high-budget regime, where additional capacity can be added to boost performance without increasing inference FLOPs.
arXiv Detail & Related papers (2020-06-18T15:01:00Z) - Lifted Regression/Reconstruction Networks [17.89437720094451]
We propose lifted regression/reconstruction networks (LRRNs).
LRRNs combine lifted neural networks with a guaranteed Lipschitz continuity property for the output layer.
We analyse and numerically demonstrate applications for unsupervised and supervised learning.
arXiv Detail & Related papers (2020-05-07T13:24:46Z)