RMNet: Equivalently Removing Residual Connection from Networks
- URL: http://arxiv.org/abs/2111.00687v1
- Date: Mon, 1 Nov 2021 04:07:45 GMT
- Title: RMNet: Equivalently Removing Residual Connection from Networks
- Authors: Fanxu Meng, Hao Cheng, Jiaxin Zhuang, Ke Li, Xing Sun
- Abstract summary: We propose to remove the residual connection in a vanilla ResNet equivalently by a reserving and merging (RM) operation on the ResBlock.
As a plug-in method, the RM operation has three main advantages: 1) its implementation makes it naturally friendly to high-ratio network pruning, 2) it breaks the depth limitation of RepVGG, and 3) it yields a network (RMNet) with a better accuracy-speed trade-off than ResNet and RepVGG.
- Score: 15.32653042487324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although residual connections enable training very deep neural networks, they are not friendly for online inference due to their multi-branch topology. This encourages many researchers to design DNNs without residual connections at inference time. For example, RepVGG re-parameterizes a multi-branch topology into a VGG-like (single-branch) model at deployment and shows great performance when the network is relatively shallow. However, RepVGG cannot transform ResNet into VGG equivalently, because re-parameterization can only be applied to linear blocks and the non-linear layers (ReLU) have to be placed outside of the residual connection, which limits representational ability, especially for deeper networks. In this paper, we aim to remedy this problem and propose to remove the residual connection in a vanilla ResNet equivalently by a reserving and merging (RM) operation on the ResBlock. Specifically, the RM operation lets the input feature maps pass through the block while reserving their information, and merges all the information at the end of each block, which removes the residual connection without changing the original output. As a plug-in method, the RM operation has three main advantages: 1) its implementation makes it naturally friendly to high-ratio network pruning; 2) it breaks the depth limitation of RepVGG; 3) it yields a network (RMNet) with a better accuracy-speed trade-off than ResNet and RepVGG. We believe the idea behind the RM operation can inspire future model design in the community. Code is available at:
https://github.com/fxmeng/RMNet.
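To make the reserve-and-merge idea concrete, here is a minimal sketch on a simplified ResBlock (Conv-ReLU-Conv plus identity, with BatchNorm omitted): widening both convolutions with identity (Dirac) filters lets the input travel through the block as extra channels and be added back by the second convolution, so the skip connection can be dropped without changing the output. The helper names (`res_forward`, `rm_weights`, `plain_forward`) are illustrative rather than the authors' API; the official implementation, which also handles BatchNorm and downsampling blocks, is in the repository above.

```python
# Minimal sketch of the reserving-and-merging (RM) idea on a simplified ResBlock
# (Conv-ReLU-Conv plus identity; BatchNorm omitted for brevity). Helper names are
# illustrative, not the authors' API.
import torch
import torch.nn as nn
import torch.nn.functional as F

in_ch, mid_ch = 8, 16
conv1 = nn.Conv2d(in_ch, mid_ch, 3, padding=1, bias=False)
conv2 = nn.Conv2d(mid_ch, in_ch, 3, padding=1, bias=False)

def res_forward(x):
    # Original residual block: y = ReLU(conv2(ReLU(conv1(x))) + x)
    return F.relu(conv2(F.relu(conv1(x))) + x)

def rm_weights():
    # Reserve: widen conv1 with extra output filters that simply copy the input
    # channels (Dirac / identity kernels).
    w1 = torch.zeros(mid_ch + in_ch, in_ch, 3, 3)
    w1[:mid_ch] = conv1.weight.data
    nn.init.dirac_(w1[mid_ch:])
    # Merge: widen conv2 with extra input channels that add the reserved copy of
    # the input back onto the output, replacing the skip connection.
    w2 = torch.zeros(in_ch, mid_ch + in_ch, 3, 3)
    w2[:, :mid_ch] = conv2.weight.data
    nn.init.dirac_(w2[:, mid_ch:])
    return w1, w2

def plain_forward(x, w1, w2):
    # Same mapping as res_forward, but a single branch: Conv-ReLU-Conv-ReLU.
    h = F.relu(F.conv2d(x, w1, padding=1))
    return F.relu(F.conv2d(h, w2, padding=1))

# Equivalence requires a non-negative input (e.g. the output of a preceding ReLU),
# so the intermediate ReLU leaves the reserved channels untouched.
x = F.relu(torch.randn(1, in_ch, 32, 32))
w1, w2 = rm_weights()
print(torch.allclose(res_forward(x), plain_forward(x, w1, w2), atol=1e-5))  # True
```

The equivalence relies on the reserved channels staying non-negative through the intermediate ReLU, which holds because the block's input is itself the output of a preceding ReLU; this is exactly what pure re-parameterization, which needs linear branches, cannot achieve.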
Related papers
- Securing Neural Networks with Knapsack Optimization [12.998637003026273]
In this paper, we focus on ResNets, which serve as the backbone for many Computer Vision tasks.
We aim to reduce their non-linear components, specifically the number of ReLUs.
We devise an algorithm to choose the optimal set of patch sizes through a novel reduction of the problem to the Knapsack Problem.
arXiv Detail & Related papers (2023-04-20T16:40:10Z) - Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal
Action Localization [65.33914980022303]
Temporal action localization (TAL) requires long-form reasoning to predict actions of various durations and complex content.
Most methods can only train on pre-extracted features without optimizing them for the localization problem.
We propose a novel end-to-end method Re2TAL, which rewires pretrained video backbones for reversible TAL.
arXiv Detail & Related papers (2022-11-25T12:17:30Z) - Deep Learning without Shortcuts: Shaping the Kernel with Tailored
Rectifiers [83.74380713308605]
We develop a new type of transformation that is fully compatible with a variant of ReLUs -- Leaky ReLUs.
We show in experiments that our method, which introduces negligible extra computational cost, achieves validation accuracies with deep vanilla networks that are competitive with ResNets.
arXiv Detail & Related papers (2022-03-15T17:49:08Z) - Hidden-Fold Networks: Random Recurrent Residuals Using Sparse Supermasks [1.0814638303152528]
Deep neural networks (DNNs) are heavily over-parametrized; recent research has found that they contain subnetworks which, on their own, reach high accuracy.
This paper proposes blending these lines of research into a highly compressed yet accurate model: Hidden-Fold Networks (HFNs).
It achieves equivalent performance to ResNet50 on CIFAR100 while occupying 38.5x less memory, and similar performance to ResNet34 on ImageNet with a memory size 26.8x smaller.
arXiv Detail & Related papers (2021-11-24T08:24:31Z) - Edge Rewiring Goes Neural: Boosting Network Resilience via Policy
Gradient [62.660451283548724]
ResiNet is a reinforcement learning framework to discover resilient network topologies against various disasters and attacks.
We show that ResiNet achieves a near-optimal resilience gain on multiple graphs while balancing utility, outperforming existing approaches by a large margin.
arXiv Detail & Related papers (2021-10-18T06:14:28Z) - Invertible Residual Network with Regularization for Effective Medical
Image Segmentation [2.76240219662896]
Invertible neural networks have been applied to significantly reduce activation memory footprint when training neural networks with backpropagation.
We propose two versions of the invertible Residual Network, namely Partially Invertible Residual Network (Partially-InvRes) and Fully Invertible Residual Network (Fully-InvRes).
Our results indicate that by using partially/fully invertible networks as the central workhorse in volumetric segmentation, we not only reduce the memory overhead but also achieve segmentation performance comparable to the non-invertible 3D U-Net.
arXiv Detail & Related papers (2021-03-16T13:19:59Z) - RepVGG: Making VGG-style ConvNets Great Again [116.0327370719692]
We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3x3 convolution and ReLU.
RepVGG reaches over 80% top-1 accuracy, which, to the best of our knowledge, is the first time for a plain model. A minimal sketch of this branch-fusion re-parameterization is given after this list.
arXiv Detail & Related papers (2021-01-11T04:46:11Z) - Replay and Synthetic Speech Detection with Res2net Architecture [85.20912636149552]
Existing approaches for replay and synthetic speech detection still lack generalizability to unseen spoofing attacks.
This work proposes to leverage a novel model structure, Res2Net, to improve the anti-spoofing countermeasure's generalizability.
arXiv Detail & Related papers (2020-10-28T14:33:42Z) - Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU and find that the accuracy decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding floating-point networks, but have only 1/4 the memory cost and run 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z) - Pruning CNN's with linear filter ensembles [0.0]
We use pruning to reduce the network size and -- implicitly -- the number of floating point operations (FLOPs).
We develop a novel filter importance norm that is based on the change in the empirical loss caused by the presence or removal of a component from the network architecture.
We evaluate our method on a fully connected network, as well as on the ResNet architecture trained on the CIFAR-10 dataset.
arXiv Detail & Related papers (2020-01-22T16:52:06Z)
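As a companion to the RepVGG entry above, the following is a minimal sketch (not the paper's official code) of the branch fusion it relies on: for a bias-free block without BatchNorm, the parallel 3x3, 1x1, and identity branches are all linear, so their kernels can simply be summed into one 3x3 convolution for inference. The ReLU sits outside the sum, which is precisely the restriction that the RM operation above is designed to sidestep.

```python
# Minimal sketch of RepVGG-style re-parameterization: fuse parallel 3x3, 1x1 and
# identity branches into a single 3x3 kernel (bias and BatchNorm omitted).
import torch
import torch.nn as nn
import torch.nn.functional as F

ch = 8
conv3x3 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
conv1x1 = nn.Conv2d(ch, ch, 1, bias=False)

def multi_branch(x):
    # Training-time block: three linear branches summed, ReLU applied outside.
    return F.relu(conv3x3(x) + conv1x1(x) + x)

def fuse():
    # Convolution is linear, so summing the kernels sums the branch outputs:
    # pad the 1x1 kernel to 3x3 and express the identity branch as a Dirac kernel.
    w = conv3x3.weight.data.clone()
    w += F.pad(conv1x1.weight.data, [1, 1, 1, 1])
    ident = torch.zeros(ch, ch, 3, 3)
    nn.init.dirac_(ident)
    return w + ident

x = torch.randn(1, ch, 32, 32)
w = fuse()
print(torch.allclose(multi_branch(x), F.relu(F.conv2d(x, w, padding=1)), atol=1e-5))  # True
```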
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.