DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
- URL: http://arxiv.org/abs/2403.19588v2
- Date: Wed, 7 Aug 2024 15:11:01 GMT
- Title: DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
- Authors: Donghyun Kim, Byeongho Heo, Dongyoon Han
- Abstract summary: This paper revives Densely Connected Convolutional Networks (DenseNets).
We believe DenseNets' potential was overlooked due to untouched training methods and traditional design elements not fully revealing their capabilities.
We provide empirical analyses that uncover the merits of the concatenation over additive shortcuts, steering a renewed preference towards DenseNet-style designs.
- Score: 30.412909498409192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper revives Densely Connected Convolutional Networks (DenseNets) and reveals the underrated effectiveness over predominant ResNet-style architectures. We believe DenseNets' potential was overlooked due to untouched training methods and traditional design elements not fully revealing their capabilities. Our pilot study shows dense connections through concatenation are strong, demonstrating that DenseNets can be revitalized to compete with modern architectures. We methodically refine suboptimal components - architectural adjustments, block redesign, and improved training recipes towards widening DenseNets and boosting memory efficiency while keeping concatenation shortcuts. Our models, employing simple architectural elements, ultimately surpass Swin Transformer, ConvNeXt, and DeiT-III - key architectures in the residual learning lineage. Furthermore, our models exhibit near state-of-the-art performance on ImageNet-1K, competing with the very recent models and downstream tasks, ADE20k semantic segmentation, and COCO object detection/instance segmentation. Finally, we provide empirical analyses that uncover the merits of the concatenation over additive shortcuts, steering a renewed preference towards DenseNet-style designs. Our code is available at https://github.com/naver-ai/rdnet.
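The contrast the abstract draws between concatenation shortcuts and additive shortcuts can be illustrated with a toy sketch. This is not the authors' RDNet code: it uses plain Python lists in place of feature maps, and the names `additive_block`, `concat_block`, `toy_same_width_layer`, and `make_toy_growth_layer` are hypothetical helpers introduced only for this illustration. It assumes an additive shortcut computes x + f(x) and a concatenation shortcut computes [x, f(x)], as in the original ResNet and DenseNet formulations.

```python
from typing import Callable, List

Vector = List[float]  # stands in for the channel dimension of a feature map


def additive_block(x: Vector, f: Callable[[Vector], Vector]) -> Vector:
    """ResNet-style shortcut: x + f(x). The width never changes."""
    fx = f(x)
    if len(fx) != len(x):
        raise ValueError("additive shortcuts need f(x) to match the width of x")
    return [a + b for a, b in zip(x, fx)]


def concat_block(x: Vector, f: Callable[[Vector], Vector]) -> Vector:
    """DenseNet-style shortcut: [x, f(x)]. The width grows by len(f(x))."""
    return x + f(x)  # list concatenation, earlier features are kept untouched


def toy_same_width_layer(x: Vector) -> Vector:
    """Hypothetical stand-in for a residual branch (same output width)."""
    return [0.5 * v for v in x]


def make_toy_growth_layer(growth_rate: int) -> Callable[[Vector], Vector]:
    """Hypothetical stand-in for a dense layer emitting growth_rate channels."""
    def layer(x: Vector) -> Vector:
        mean = sum(x) / len(x)
        return [mean] * growth_rate
    return layer


x = [1.0, 2.0, 3.0, 4.0]
res, dense = x, x
for _ in range(3):
    res = additive_block(res, toy_same_width_layer)
    dense = concat_block(dense, make_toy_growth_layer(2))

print(len(res), len(dense))  # widths after three blocks: 4 10
print(dense[:4] == x)        # the input features survive unchanged: True
```

In the additive case the input is mixed into the sum, while in the concatenation case every earlier feature remains directly readable by later layers, at the cost of a width that grows with depth. That growth is why the abstract stresses memory efficiency when widening DenseNets.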
Related papers
- ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders [104.05133094625137]
We propose a fully convolutional masked autoencoder framework and a new Global Response Normalization layer.
This co-design of self-supervised learning techniques and architectural improvement results in a new model family called ConvNeXt V2, which significantly improves the performance of pure ConvNets.
arXiv Detail & Related papers (2023-01-02T18:59:31Z) - Receptive Field Refinement for Convolutional Neural Networks Reliably Improves Predictive Performance [1.52292571922932]
We present a new approach to receptive field analysis that can yield these types of theoretical and empirical performance gains.
Our approach is able to improve ImageNet1K performance across a wide range of well-known, state-of-the-art (SOTA) model classes.
arXiv Detail & Related papers (2022-11-26T05:27:44Z) - Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers [83.74380713308605]
We develop a new type of transformation that is fully compatible with a variant of ReLUs -- Leaky ReLUs.
We show in experiments that our method, which introduces negligible extra computational cost, achieves validation accuracies with deep vanilla networks that are competitive with ResNets.
arXiv Detail & Related papers (2022-03-15T17:49:08Z) - Revisiting ResNets: Improved Training and Scaling Strategies [54.0162571976267]
Training and scaling strategies may matter more than architectural changes, and the resulting ResNets match recent state-of-the-art models.
We show that the best performing scaling strategy depends on the training regime.
We design a family of ResNet architectures, ResNet-RS, which are 1.7x - 2.7x faster than EfficientNets on TPUs, while achieving similar accuracies on ImageNet.
arXiv Detail & Related papers (2021-03-13T00:18:19Z) - Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of using the same path of the network, DG-Net aggregates features dynamically in each node, which allows the network to have more representation ability.
arXiv Detail & Related papers (2020-10-02T16:50:26Z) - Structured Convolutions for Efficient Neural Network Design [65.36569572213027]
We tackle model efficiency by exploiting redundancy in the implicit structure of the building blocks of convolutional neural networks.
We show how this decomposition can be applied to 2D and 3D kernels as well as the fully-connected layers.
arXiv Detail & Related papers (2020-08-06T04:38:38Z) - Growing Efficient Deep Networks by Structured Continuous Sparsification [34.7523496790944]
We develop an approach to growing deep network architectures over the course of training.
Our method can start from a small, simple seed architecture and dynamically grow and prune both layers and filters.
We achieve 49.7% inference FLOPs and 47.4% training FLOPs savings compared to a baseline ResNet-50 on ImageNet.
arXiv Detail & Related papers (2020-07-30T10:03:47Z) - BiO-Net: Learning Recurrent Bi-directional Connections for Encoder-Decoder Architecture [82.64881585566825]
We present a novel Bi-directional O-shape network (BiO-Net) that reuses the building blocks in a recurrent manner without introducing any extra parameters.
Our method significantly outperforms the vanilla U-Net as well as other state-of-the-art methods.
arXiv Detail & Related papers (2020-07-01T05:07:49Z) - Rethinking Depthwise Separable Convolutions: How Intra-Kernel Correlations Lead to Improved MobileNets [6.09170287691728]
We introduce blueprint separable convolutions (BSConv) as highly efficient building blocks for CNNs.
They are motivated by quantitative analyses of kernel properties from trained models.
Our approach provides a thorough theoretical derivation, interpretation, and justification for the application of depthwise separable convolutions.
arXiv Detail & Related papers (2020-03-30T15:23:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.