m-RevNet: Deep Reversible Neural Networks with Momentum
- URL: http://arxiv.org/abs/2108.05862v2
- Date: Mon, 16 Aug 2021 13:04:04 GMT
- Title: m-RevNet: Deep Reversible Neural Networks with Momentum
- Authors: Duo Li and Shang-Hua Gao
- Abstract summary: We propose a reversible neural network, termed m-RevNet, characterized by inserting a momentum update into residual blocks.
For certain learning scenarios, we analytically and empirically reveal that our m-RevNet succeeds while standard ResNet fails.
- Score: 25.609808975649624
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, the connections between deep residual networks and first-order Ordinary Differential Equations (ODEs) have been established. In this work, we further bridge deep neural architecture design with second-order ODEs and propose a novel reversible neural network, termed m-RevNet, characterized by inserting a momentum update into residual blocks. The reversible property allows us to perform the backward pass without access to the activation values of the forward pass, greatly relieving the storage burden during training. Furthermore, the theoretical foundation based on second-order ODEs grants m-RevNet stronger representational power than vanilla residual networks, which potentially explains its performance gains. For certain learning scenarios, we analytically and empirically reveal that our m-RevNet succeeds while standard ResNet fails. Comprehensive experiments on various image classification and semantic segmentation benchmarks demonstrate the superiority of m-RevNet over ResNet in both memory efficiency and recognition performance.
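The momentum state is what makes such a block invertible in closed form: given the block's output and velocity, the input can be reconstructed exactly. Below is a minimal NumPy sketch of this mechanism, assuming a scalar momentum coefficient `gamma` and a toy residual branch `f`; it follows the generic momentum-residual formulation rather than the authors' released code.

```python
import numpy as np

def f(x, W):
    """Toy residual branch (stands in for the block's conv layers)."""
    return np.tanh(W @ x)

def forward_block(x, v, W, gamma=0.9):
    """Momentum residual update: v' = gamma * v + f(x), then x' = x + v'."""
    v_next = gamma * v + f(x, W)
    x_next = x + v_next
    return x_next, v_next

def inverse_block(x_next, v_next, W, gamma=0.9):
    """Exact inverse: recover (x, v) from (x', v') with no cached activations."""
    x = x_next - v_next              # undo x' = x + v'
    v = (v_next - f(x, W)) / gamma   # undo v' = gamma * v + f(x)
    return x, v

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
x, v = rng.normal(size=8), np.zeros(8)

x1, v1 = forward_block(x, v, W)
x0, v0 = inverse_block(x1, v1, W)
assert np.allclose(x0, x) and np.allclose(v0, v)  # states reconstructed exactly
```

Because every intermediate state can be recomputed from the output, the backward pass can regenerate activations on the fly instead of caching them, which is where the memory savings come from.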
Related papers
- Reversible Decoupling Network for Single Image Reflection Removal [15.763420129991255]
High-level semantic clues tend to be compressed or discarded during layer-by-layer propagation.
We propose a novel architecture called Reversible Decoupling Network (RDNet).
RDNet employs a reversible encoder to secure valuable information while flexibly decoupling transmission- and reflection-relevant features during the forward pass.
arXiv Detail & Related papers (2024-10-10T15:58:27Z)
- Dr$^2$Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning [81.0108753452546]
We propose Dynamic Reversible Dual-Residual Networks, or Dr$^2$Net, to finetune a pretrained model with substantially reduced memory consumption.
Dr$^2$Net contains two types of residual connections, one maintaining the residual structure in the pretrained models, and the other making the network reversible.
We show that Dr$^2$Net can reach comparable performance to conventional finetuning but with significantly less memory usage.
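As an illustration of the reversible half of this design, the standard RevNet-style additive coupling below is invertible by construction; this is a generic sketch, not Dr$^2$Net's exact dual-residual formulation.

```python
import numpy as np

def F(x):  # placeholder residual functions; any maps work here,
    return np.tanh(x)

def G(x):  # since invertibility comes from the coupling, not from F or G
    return 0.5 * np.sin(x)

def couple(x1, x2):
    """RevNet-style additive coupling over two feature halves."""
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def uncouple(y1, y2):
    """Exact inverse, requiring no stored activations."""
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

x1, x2 = np.random.default_rng(1).normal(size=(2, 4))
y1, y2 = couple(x1, x2)
r1, r2 = uncouple(y1, y2)
assert np.allclose(r1, x1) and np.allclose(r2, x2)
```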
arXiv Detail & Related papers (2024-01-08T18:59:31Z)
- Distribution-sensitive Information Retention for Accurate Binary Neural Network [49.971345958676196]
We present a novel Distribution-sensitive Information Retention Network (DIR-Net) to retain the information of the forward activations and backward gradients.
Our DIR-Net consistently outperforms state-of-the-art binarization approaches on both mainstream and compact architectures.
We deploy DIR-Net on real-world resource-limited devices, achieving an 11.1x storage saving and a 5.4x speedup.
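For context on how binary networks of this kind are trained at all, the usual recipe is sign binarization in the forward pass with a straight-through estimator (STE) in the backward pass; the sketch below shows that generic recipe, not DIR-Net's distribution-sensitive scheme.

```python
import numpy as np

def binarize(w):
    """Forward pass: sign binarization with a per-tensor scaling factor."""
    alpha = np.abs(w).mean()      # scaling preserves output magnitude
    return alpha * np.sign(w)

def ste_grad(w, grad_out, clip=1.0):
    """Backward pass: straight-through estimator lets gradients flow
    only where |w| <= clip, since sign() itself has zero gradient."""
    return grad_out * (np.abs(w) <= clip)

w = np.array([-1.7, -0.3, 0.4, 2.1])
print(binarize(w))                   # [-1.125 -1.125  1.125  1.125]
print(ste_grad(w, np.ones_like(w)))  # [0. 1. 1. 0.]
```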
arXiv Detail & Related papers (2021-09-25T10:59:39Z)
- CondenseNet V2: Sparse Feature Reactivation for Deep Networks [87.38447745642479]
Reusing features in deep networks through dense connectivity is an effective way to achieve high computational efficiency.
We propose an alternative approach named sparse feature reactivation (SFR), which aims to actively increase the utility of features for reuse.
Our experiments show that the proposed models achieve promising performance on image classification (ImageNet and CIFAR) and object detection (MS COCO) in terms of both theoretical efficiency and practical speed.
arXiv Detail & Related papers (2021-04-09T14:12:43Z)
- Momentum Residual Neural Networks [22.32840998053339]
We propose to change the forward rule of a ResNet by adding a momentum term.
MomentumNets can be used as a drop-in replacement for any existing ResNet block.
We show that MomentumNets have the same accuracy as ResNets, while having a much smaller memory footprint.
arXiv Detail & Related papers (2021-02-15T22:24:52Z)
- Kernel-Based Smoothness Analysis of Residual Networks [85.20737467304994]
Residual networks (ResNets) stand out among powerful modern architectures.
In this paper, we show another distinction between ResNets and fully-connected networks, namely, a tendency of ResNets to promote smoother interpolations.
arXiv Detail & Related papers (2020-09-21T16:32:04Z)
- Implicit Euler ODE Networks for Single-Image Dehazing [33.34490764631837]
We propose an efficient end-to-end multi-level implicit network (MI-Net) for the single image dehazing problem.
Our method outperforms existing approaches and achieves state-of-the-art performance.
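The "implicit" in the title refers to the implicit (backward) Euler discretization of an ODE, where the next state appears on both sides of the update and must be solved for iteratively; a minimal sketch, assuming a contractive right-hand side so that fixed-point iteration converges.

```python
import numpy as np

def f(x):
    """ODE right-hand side; contractive here so the iteration converges."""
    return -np.tanh(x)

def implicit_euler_step(x, h=0.1, iters=20):
    """Solve x_next = x + h * f(x_next) by fixed-point iteration."""
    x_next = x  # start from the current state as the initial guess
    for _ in range(iters):
        x_next = x + h * f(x_next)
    return x_next

x = np.array([1.0, -2.0, 0.5])
x1 = implicit_euler_step(x)
print(np.abs(x1 - (x + 0.1 * f(x1))).max())  # residual ~0 after convergence
```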
arXiv Detail & Related papers (2020-07-13T15:27:33Z)
- Iterative Network for Image Super-Resolution [69.07361550998318]
Single image super-resolution (SISR) has been greatly revitalized by the recent development of convolutional neural networks (CNNs).
This paper offers a new perspective on conventional SISR algorithms and proposes a substantially different approach based on iterative optimization.
A novel iterative super-resolution network (ISRN) is built on top of this iterative optimization.
arXiv Detail & Related papers (2020-05-20T11:11:47Z)
- Lifted Regression/Reconstruction Networks [17.89437720094451]
We propose lifted regression/reconstruction networks (LRRNs).
LRRNs combine lifted neural networks with a guaranteed Lipschitz continuity property for the output layer.
We analyse and numerically demonstrate applications for unsupervised and supervised learning.
arXiv Detail & Related papers (2020-05-07T13:24:46Z)
- ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions [76.05981545084738]
We propose several ideas for enhancing a binary network to close its accuracy gap to real-valued networks without incurring any additional computational cost.
We first construct a baseline network by modifying and binarizing a compact real-valued network with parameter-free shortcuts.
We show that the proposed ReActNet outperforms the state of the art by a large margin.
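The generalized activation functions in ReActNet are shifted variants of sign and PReLU (RSign and RPReLU) with learnable per-channel shifts; the sketch below uses scalar parameters for brevity and treats the exact parameterization as an assumption rather than the paper's precise definition.

```python
import numpy as np

def rsign(x, alpha=0.0):
    """RSign: binarize around a learnable threshold alpha instead of zero."""
    return np.where(x > alpha, 1.0, -1.0)

def rprelu(x, gamma=0.0, beta=0.25, zeta=0.0):
    """RPReLU: PReLU with a learnable input shift gamma and output shift zeta."""
    shifted = x - gamma
    return np.where(shifted > 0, shifted, beta * shifted) + zeta

x = np.linspace(-2, 2, 5)
print(rsign(x, alpha=0.5))   # threshold moves from 0 to 0.5
print(rprelu(x, gamma=0.5))  # the kink moves to x = 0.5
```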
arXiv Detail & Related papers (2020-03-07T02:12:02Z)