RMNet: Equivalently Removing Residual Connection from Networks
- URL: http://arxiv.org/abs/2111.00687v1
- Date: Mon, 1 Nov 2021 04:07:45 GMT
- Title: RMNet: Equivalently Removing Residual Connection from Networks
- Authors: Fanxu Meng, Hao Cheng, Jiaxin Zhuang, Ke Li, Xing Sun
- Abstract summary: We propose to remove the residual connection in a vanilla ResNet equivalently by a reserving and merging (RM) operation on the ResBlock.
As a plug-in method, the RM operation has three main advantages: 1) its implementation makes it naturally friendly to high-ratio network pruning, 2) it breaks the depth limitation of RepVGG, and 3) it yields a network (RMNet) with a better accuracy-speed trade-off than ResNet and RepVGG.
- Score: 15.32653042487324
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although residual connections enable training very deep neural networks, they are not friendly for online inference due to their multi-branch topology. This encourages many researchers to design DNNs without residual connections at inference time. For example, RepVGG re-parameterizes a multi-branch topology into a VGG-like (single-branch) model at deployment and shows great performance when the network is relatively shallow. However, RepVGG cannot transform ResNet into VGG equivalently, because re-parameterization can only be applied to linear blocks and the non-linear layers (ReLU) have to be placed outside of the residual connection, which limits representational ability, especially for deeper networks. In this paper, we aim to remedy this problem and propose to remove the residual connection in a vanilla ResNet equivalently by a reserving and merging (RM) operation on the ResBlock. Specifically, the RM operation lets the input feature maps pass through the block while reserving their information, and merges all the information at the end of each block, which removes the residual connection without changing the original output. As a plug-in method, the RM operation has three main advantages: 1) its implementation makes it naturally friendly to high-ratio network pruning; 2) it breaks the depth limitation of RepVGG; 3) it yields a network (RMNet) with a better accuracy-speed trade-off than ResNet and RepVGG. We believe the idea behind the RM operation can inspire future model design in the community. Code is available at:
https://github.com/fxmeng/RMNet.
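To make the reserve-and-merge idea concrete, here is a minimal sketch on a simplified ResBlock (Conv-ReLU-Conv plus identity, with BatchNorm omitted): widening both convolutions with identity (Dirac) filters lets the input travel through the block as extra channels and be added back by the second convolution, so the skip connection can be dropped without changing the output. The helper names (`res_forward`, `rm_weights`, `plain_forward`) are illustrative rather than the authors' API; the official implementation, which also handles BatchNorm and downsampling blocks, is in the repository above.

```python
# Minimal sketch of the reserving-and-merging (RM) idea on a simplified ResBlock
# (Conv-ReLU-Conv plus identity; BatchNorm omitted for brevity). Helper names are
# illustrative, not the authors' API.
import torch
import torch.nn as nn
import torch.nn.functional as F

in_ch, mid_ch = 8, 16
conv1 = nn.Conv2d(in_ch, mid_ch, 3, padding=1, bias=False)
conv2 = nn.Conv2d(mid_ch, in_ch, 3, padding=1, bias=False)

def res_forward(x):
    # Original residual block: y = ReLU(conv2(ReLU(conv1(x))) + x)
    return F.relu(conv2(F.relu(conv1(x))) + x)

def rm_weights():
    # Reserve: widen conv1 with extra output filters that simply copy the input
    # channels (Dirac / identity kernels).
    w1 = torch.zeros(mid_ch + in_ch, in_ch, 3, 3)
    w1[:mid_ch] = conv1.weight.data
    nn.init.dirac_(w1[mid_ch:])
    # Merge: widen conv2 with extra input channels that add the reserved copy of
    # the input back onto the output, replacing the skip connection.
    w2 = torch.zeros(in_ch, mid_ch + in_ch, 3, 3)
    w2[:, :mid_ch] = conv2.weight.data
    nn.init.dirac_(w2[:, mid_ch:])
    return w1, w2

def plain_forward(x, w1, w2):
    # Same mapping as res_forward, but a single branch: Conv-ReLU-Conv-ReLU.
    h = F.relu(F.conv2d(x, w1, padding=1))
    return F.relu(F.conv2d(h, w2, padding=1))

# Equivalence requires a non-negative input (e.g. the output of a preceding ReLU),
# so the intermediate ReLU leaves the reserved channels untouched.
x = F.relu(torch.randn(1, in_ch, 32, 32))
w1, w2 = rm_weights()
print(torch.allclose(res_forward(x), plain_forward(x, w1, w2), atol=1e-5))  # True
```

The equivalence relies on the reserved channels staying non-negative through the intermediate ReLU, which holds because the block's input is itself the output of a preceding ReLU; this is exactly what pure re-parameterization, which needs linear branches, cannot achieve.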
Related papers
- Securing Neural Networks with Knapsack Optimization [12.998637003026273]
In this paper, we focus on ResNets, which serve as the backbone for many Computer Vision tasks.
We aim to reduce their non-linear components, specifically the number of ReLUs.
We devise an algorithm to choose the optimal set of patch sizes through a novel reduction of the problem to the Knapsack Problem.
arXiv Detail & Related papers (2023-04-20T16:40:10Z) - Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal
Action Localization [65.33914980022303]
Temporal action localization (TAL) requires long-form reasoning to predict actions of various durations and complex content.
Most methods can only train on pre-extracted features without optimizing them for the localization problem.
We propose a novel end-to-end method Re2TAL, which rewires pretrained video backbones for reversible TAL.
arXiv Detail & Related papers (2022-11-25T12:17:30Z) - Deep Learning without Shortcuts: Shaping the Kernel with Tailored
Rectifiers [83.74380713308605]
We develop a new type of transformation that is fully compatible with a variant of ReLUs -- Leaky ReLUs.
We show in experiments that our method, which introduces negligible extra computational cost, achieves validation accuracies with deep vanilla networks that are competitive with ResNets.
arXiv Detail & Related papers (2022-03-15T17:49:08Z) - Hidden-Fold Networks: Random Recurrent Residuals Using Sparse Supermasks [1.0814638303152528]
Deep neural networks (DNNs) are heavily over-parametrized; recent research has found that they contain subnetworks which, on their own, reach high accuracy.
This paper proposes blending these lines of research into a highly compressed yet accurate model: Hidden-Fold Networks (HFNs).
It achieves equivalent performance to ResNet50 on CIFAR100 while occupying 38.5x less memory, and similar performance to ResNet34 on ImageNet with a memory size 26.8x smaller.
arXiv Detail & Related papers (2021-11-24T08:24:31Z) - Edge Rewiring Goes Neural: Boosting Network Resilience via Policy
Gradient [62.660451283548724]
ResiNet is a reinforcement learning framework to discover resilient network topologies against various disasters and attacks.
We show that ResiNet achieves a near-optimal resilience gain on multiple graphs while balancing utility, outperforming existing approaches by a large margin.
arXiv Detail & Related papers (2021-10-18T06:14:28Z) - Invertible Residual Network with Regularization for Effective Medical
Image Segmentation [2.76240219662896]
Invertible neural networks have been applied to significantly reduce activation memory footprint when training neural networks with backpropagation.
We propose two versions of the invertible Residual Network, namely Partially Invertible Residual Network (Partially-InvRes) and Fully Invertible Residual Network (Fully-InvRes).
Our results indicate that by using partially/fully invertible networks as the central workhorse in volumetric segmentation, we not only reduce the memory overhead but also achieve segmentation performance comparable to the non-invertible 3D U-Net.
arXiv Detail & Related papers (2021-03-16T13:19:59Z) - RepVGG: Making VGG-style ConvNets Great Again [116.0327370719692]
We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3x3 convolution and ReLU.
RepVGG reaches over 80% top-1 accuracy, which, to the best of our knowledge, is the first time for a plain model. A minimal sketch of this branch-fusion re-parameterization is given after this list.
arXiv Detail & Related papers (2021-01-11T04:46:11Z) - Replay and Synthetic Speech Detection with Res2net Architecture [85.20912636149552]
Existing approaches for replay and synthetic speech detection still lack generalizability to unseen spoofing attacks.
This work proposes to leverage a novel model structure, Res2Net, to improve the anti-spoofing countermeasure's generalizability.
arXiv Detail & Related papers (2020-10-28T14:33:42Z) - Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU and find that the accuracy decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding floating-point networks, but have only 1/4 the memory cost and run 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z) - Pruning CNN's with linear filter ensembles [0.0]
We use pruning to reduce the network size and -- implicitly -- the number of floating point operations (FLOPs).
We develop a novel filter importance norm that is based on the change in the empirical loss caused by the presence or removal of a component from the network architecture.
We evaluate our method on a fully connected network, as well as on the ResNet architecture trained on the CIFAR-10 dataset.
arXiv Detail & Related papers (2020-01-22T16:52:06Z)
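As a companion to the RepVGG entry above, the following is a minimal sketch (not the paper's official code) of the branch fusion it relies on: for a bias-free block without BatchNorm, the parallel 3x3, 1x1, and identity branches are all linear, so their kernels can simply be summed into one 3x3 convolution for inference. The ReLU sits outside the sum, which is precisely the restriction that the RM operation above is designed to sidestep.

```python
# Minimal sketch of RepVGG-style re-parameterization: fuse parallel 3x3, 1x1 and
# identity branches into a single 3x3 kernel (bias and BatchNorm omitted).
import torch
import torch.nn as nn
import torch.nn.functional as F

ch = 8
conv3x3 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
conv1x1 = nn.Conv2d(ch, ch, 1, bias=False)

def multi_branch(x):
    # Training-time block: three linear branches summed, ReLU applied outside.
    return F.relu(conv3x3(x) + conv1x1(x) + x)

def fuse():
    # Convolution is linear, so summing the kernels sums the branch outputs:
    # pad the 1x1 kernel to 3x3 and express the identity branch as a Dirac kernel.
    w = conv3x3.weight.data.clone()
    w += F.pad(conv1x1.weight.data, [1, 1, 1, 1])
    ident = torch.zeros(ch, ch, 3, 3)
    nn.init.dirac_(ident)
    return w + ident

x = torch.randn(1, ch, 32, 32)
w = fuse()
print(torch.allclose(multi_branch(x), F.relu(F.conv2d(x, w, padding=1)), atol=1e-5))  # True
```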
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.