Self-Reorganizing and Rejuvenating CNNs for Increasing Model Capacity Utilization
- URL: http://arxiv.org/abs/2102.06870v1
- Date: Sat, 13 Feb 2021 06:19:45 GMT
- Title: Self-Reorganizing and Rejuvenating CNNs for Increasing Model Capacity Utilization
- Authors: Wissam J. Baddar, Seungju Han, Seonmin Rhee, Jae-Joon Han
- Abstract summary: We propose a biologically inspired method for improving the computational resource utilization of neural networks.
The proposed method utilizes the channel activations of a convolution layer in order to reorganize that layer's parameters.
The rejuvenated parameters learn different features to supplement those learned by the reorganized surviving parameters.
- Score: 8.661269034961679
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose self-reorganizing and rejuvenating convolutional neural networks, a biologically inspired method for improving the computational resource utilization of neural networks. The proposed method utilizes the channel activations of a convolution layer in order to reorganize that layer's parameters. The reorganized parameters are clustered to avoid parameter redundancies. As such, redundant neurons with similar activations are merged, leaving room for the remaining parameters to rejuvenate. The rejuvenated parameters learn different features to supplement those learned by the reorganized surviving parameters. As a result, the network's capacity utilization increases, improving the baseline network performance without any changes to the network structure. The proposed method can be applied to various network architectures during the training stage, or applied to a pre-trained model to improve its performance. Experimental results showed that the proposed method is model-agnostic and can be applied to any backbone architecture, increasing its performance due to the elevated utilization of the network capacity.
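The abstract describes the mechanism only at a high level. Below is a minimal, illustrative PyTorch sketch of what clustering a layer's channel activations and rejuvenating the freed filters could look like. The function name `reorganize_and_rejuvenate`, the use of k-means, the choice of cluster representative, and the Kaiming re-initialization are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans


@torch.no_grad()
def reorganize_and_rejuvenate(conv: nn.Conv2d, activations: torch.Tensor, n_clusters: int):
    """Merge redundant output channels of `conv` and re-initialize the freed ones.

    conv        -- convolution layer to reorganize
    activations -- [N, C_out, H, W] outputs of `conv` on a probe batch
    n_clusters  -- assumed hyper-parameter: number of distinct channel groups to keep
    """
    c_out = conv.out_channels
    # Describe each output channel by its batch-averaged activation map.
    descriptors = activations.mean(dim=0).reshape(c_out, -1).cpu().numpy()

    # Group channels whose activation patterns are similar (k-means is an assumed choice).
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(descriptors)

    weight = conv.weight.data  # [C_out, C_in, kH, kW]
    for k in range(n_clusters):
        members = np.flatnonzero(labels == k).tolist()
        keeper = members[0]
        # Merge the cluster into a single representative filter ...
        weight[keeper] = weight[members].mean(dim=0)
        # ... and rejuvenate the now-redundant filters with a fresh initialization,
        # freeing them to learn features the surviving filters do not cover.
        for idx in members[1:]:
            nn.init.kaiming_normal_(weight[idx])
            if conv.bias is not None:
                conv.bias.data[idx] = 0.0


# Illustrative usage on a probe batch (shapes and cluster count are arbitrary):
conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
probe = torch.randn(32, 3, 32, 32)
with torch.no_grad():
    acts = conv(probe)
reorganize_and_rejuvenate(conv, acts, n_clusters=48)
```

In a full training pipeline, the input channels of the following layer and any batch-normalization statistics would also have to be handled consistently with the merged channels; the abstract does not spell out those details, so they are omitted from this sketch.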
Related papers
- Towards Efficient Deep Spiking Neural Networks Construction with Spiking Activity based Pruning [17.454100169491497]
We propose a structured pruning approach based on the activity levels of convolutional kernels, named the Spiking Channel Activity-based (SCA) network pruning framework.
Inspired by synaptic plasticity mechanisms, our method dynamically adjusts the network's structure by pruning and regenerating convolutional kernels during training, enhancing the model's adaptation to the current target task.
arXiv Detail & Related papers (2024-06-03T07:44:37Z)
- Power-Enhanced Residual Network for Function Approximation and Physics-Informed Inverse Problems [0.0]
This paper introduces a novel neural network structure called the Power-Enhancing residual network.
It improves the network's capabilities for approximating both smooth and non-smooth functions in 2D and 3D settings.
Results emphasize the exceptional accuracy of the proposed Power-Enhancing residual network, particularly for non-smooth functions.
arXiv Detail & Related papers (2023-10-24T10:01:15Z)
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
- A Generic Shared Attention Mechanism for Various Backbone Neural Networks [53.36677373145012]
Self-attention modules (SAMs) produce strongly correlated attention maps across different layers.
Dense-and-Implicit Attention (DIA) shares SAMs across layers and employs a long short-term memory module.
Our simple yet effective DIA can consistently enhance various network backbones.
arXiv Detail & Related papers (2022-10-27T13:24:08Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Clustering-Based Interpretation of Deep ReLU Network [17.234442722611803]
We recognize that the non-linear behavior of the ReLU function gives rise to a natural clustering.
We propose a method to increase the level of interpretability of a fully connected feedforward ReLU neural network.
arXiv Detail & Related papers (2021-10-13T09:24:11Z)
- Local Critic Training for Model-Parallel Learning of Deep Neural Networks [94.69202357137452]
We propose a novel model-parallel learning method, called local critic training.
We show that the proposed approach successfully decouples the update process of the layer groups for both convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
We also show that trained networks by the proposed method can be used for structural optimization.
arXiv Detail & Related papers (2021-02-03T09:30:45Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Neural Parameter Allocation Search [57.190693718951316]
Training neural networks requires increasing amounts of memory.
Existing methods assume networks have many identical layers and utilize hand-crafted sharing strategies that fail to generalize.
We introduce Neural Parameter Allocation Search (NPAS), a novel task where the goal is to train a neural network given an arbitrary, fixed parameter budget.
NPAS covers both low-budget regimes, which produce compact networks, and a novel high-budget regime, where additional capacity can be added to boost performance without increasing inference FLOPs.
arXiv Detail & Related papers (2020-06-18T15:01:00Z) - Lifted Regression/Reconstruction Networks [17.89437720094451]
We propose lifted regression/reconstruction networks (LRRNs).
LRRNs combine lifted neural networks with a guaranteed Lipschitz continuity property for the output layer.
We analyse and numerically demonstrate applications for unsupervised and supervised learning.
arXiv Detail & Related papers (2020-05-07T13:24:46Z)