RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality
- URL: http://arxiv.org/abs/2112.11081v1
- Date: Tue, 21 Dec 2021 10:28:17 GMT
- Title: RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality
- Authors: Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Jungong Han, Guiguang Ding
- Abstract summary: We propose a methodology, Locality Injection, to incorporate local priors into an FC layer.
RepMLPNet is the first MLP that seamlessly transfers to Cityscapes semantic segmentation.
- Score: 113.1414517605892
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compared to convolutional layers, fully-connected (FC) layers are better at
modeling long-range dependencies but worse at capturing local patterns,
hence usually less favored for image recognition. In this paper, we propose a
methodology, Locality Injection, to incorporate local priors into an FC layer
via merging the trained parameters of a parallel conv kernel into the FC
kernel. Locality Injection can be viewed as a novel Structural
Re-parameterization method since it equivalently converts the structures via
transforming the parameters. Based on that, we propose a multi-layer-perceptron
(MLP) block named RepMLP Block, which uses three FC layers to extract features,
and a novel architecture named RepMLPNet. The hierarchical design distinguishes
RepMLPNet from the other concurrently proposed vision MLPs. As it produces
feature maps of different levels, it qualifies as a backbone model for
downstream tasks like semantic segmentation. Our results reveal that 1)
Locality Injection is a general methodology for MLP models; 2) RepMLPNet has a
favorable accuracy-efficiency trade-off compared to the other MLPs; 3)
RepMLPNet is the first MLP that seamlessly transfers to Cityscapes semantic
segmentation. The code and models are available at
https://github.com/DingXiaoH/RepMLP.
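The conv-to-FC conversion behind Locality Injection can be sketched in a few lines: feeding channel-wise one-hot basis inputs through the trained conv produces the columns of an equivalent FC kernel, which is then added onto the FC kernel that ran in parallel with it. The snippet below is a minimal single-group sketch assuming stride 1, an odd kernel size, and "same" padding; the function and variable names (conv_to_fc, inject_locality) are illustrative, not those of the released code.

```python
import torch
import torch.nn.functional as F

def conv_to_fc(conv_weight, conv_bias, h, w):
    """Build an FC kernel equivalent to a k x k conv (stride 1, odd k, 'same'
    padding) acting on a flattened (C, h, w) input. Returns an (O*h*w, C*h*w)
    weight and an (O*h*w,) bias. Single-group sketch; RepMLP itself uses
    grouped conv and grouped FC."""
    out_ch, in_ch, k, _ = conv_weight.shape
    # Each basis input has exactly one pixel of one channel set to 1, so the
    # conv responses are the columns of the equivalent FC matrix.
    basis = torch.eye(in_ch * h * w).reshape(in_ch * h * w, in_ch, h, w)
    responses = F.conv2d(basis, conv_weight, bias=None, padding=k // 2)
    fc_weight = responses.reshape(in_ch * h * w, out_ch * h * w).t()
    fc_bias = (conv_bias.repeat_interleave(h * w) if conv_bias is not None
               else torch.zeros(out_ch * h * w))
    return fc_weight, fc_bias

def inject_locality(fc_weight, fc_bias, conv_weight, conv_bias, h, w):
    """Locality Injection: merge a trained parallel conv branch into the FC
    branch so that inference uses a single FC with the local prior baked in."""
    w_conv, b_conv = conv_to_fc(conv_weight, conv_bias, h, w)
    return fc_weight + w_conv, fc_bias + b_conv
```

A quick sanity check is to verify, on a random input x, that x.flatten(1) @ merged_weight.t() + merged_bias matches the sum of the original FC output and F.conv2d(x, conv_weight, conv_bias, padding=k//2).flatten(1).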
Related papers
- Caterpillar: A Pure-MLP Architecture with Shifted-Pillars-Concatenation [68.24659910441736]
The Shifted-Pillars-Concatenation (SPC) module offers superior local modeling power and performance gains.
We build a pure-MLP architecture called Caterpillar by replacing the convolutional layer with the SPC module in a hybrid model of sMLPNet.
Experiments show Caterpillar's excellent performance on both small-scale and ImageNet-1k classification benchmarks.
arXiv Detail & Related papers (2023-05-28T06:19:36Z)
- BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons [37.28828605119602]
This paper studies the problem of designing compact binary architectures for vision multi-layer perceptrons (MLPs)
We find that previous binarization methods perform poorly due to limited capacity of binary samplings.
We propose to improve the performance of the binary mixing and channel mixing (BiMLP) model by enriching the representation ability of binary FC layers.
arXiv Detail & Related papers (2022-12-29T02:43:41Z)
- Sparse MLP for Image Recognition: Is Self-Attention Really Necessary? [65.37917850059017]
We build an attention-free network called sMLPNet.
For 2D image tokens, sMLP applies 1D MLPs along the axial directions, and the parameters are shared among rows or columns.
When scaling up to 66M parameters, sMLPNet achieves 83.4% top-1 accuracy, which is on par with the state-of-the-art Swin Transformer.
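As a rough illustration of the axial 1D mixing described above, the hedged sketch below mixes tokens along the width axis with one linear layer (weights shared by every row and channel) and along the height axis with another (shared by every column and channel). The class and attribute names (AxialMix, mix_w, mix_h) are illustrative, and summing the branches is a simplification of how the actual sMLP block fuses them.

```python
import torch
import torch.nn as nn

class AxialMix(nn.Module):
    """Hedged sketch of axial 1D token mixing with weights shared across
    rows/columns; the branch outputs are simply summed for brevity."""
    def __init__(self, h: int, w: int):
        super().__init__()
        self.mix_w = nn.Linear(w, w)   # mixes the W tokens of each row
        self.mix_h = nn.Linear(h, h)   # mixes the H tokens of each column

    def forward(self, x):              # x: (B, C, H, W)
        row_mixed = self.mix_w(x)                                      # along W
        col_mixed = self.mix_h(x.transpose(-1, -2)).transpose(-1, -2)  # along H
        return x + row_mixed + col_mixed
```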
arXiv Detail & Related papers (2021-09-12T04:05:15Z)
- ConvMLP: Hierarchical Convolutional MLPs for Vision [7.874749885641495]
We propose ConvMLP: a light-weight, stage-wise co-design of convolution layers and MLPs for visual recognition.
We show that ConvMLP can be seamlessly transferred and achieve competitive results with fewer parameters.
arXiv Detail & Related papers (2021-09-09T17:52:57Z)
- Sparse-MLP: A Fully-MLP Architecture with Conditional Computation [7.901786481399378]
Mixture-of-Experts (MoE) with sparse conditional computation has been proved an effective architecture for scaling attention-based models to more parameters with comparable computation cost.
We propose Sparse-MLP, scaling the recent MLP-Mixer model with MoE, to achieve a more efficient architecture.
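A generic top-1 mixture-of-experts layer illustrates what sparse conditional computation means here: each token is routed to a single expert MLP, so per-token compute stays roughly constant while the total parameter count grows with the number of experts. This is a hedged, generic sketch (class name Top1MoE and its parameters are illustrative), not the exact Sparse-MLP layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoE(nn.Module):
    """Generic top-1 MoE over MLP experts: a learned gate picks one expert
    per token, so only that expert's FLOPs are spent on the token."""
    def __init__(self, dim, hidden, num_experts=4):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                       # x: (num_tokens, dim)
        gate = F.softmax(self.gate(x), dim=-1)  # routing probabilities
        weight, idx = gate.max(dim=-1)          # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():                      # run each expert only on its tokens
                out[mask] = weight[mask].unsqueeze(-1) * expert(x[mask])
        return out
```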
arXiv Detail & Related papers (2021-09-05T06:43:08Z)
- Hire-MLP: Vision MLP via Hierarchical Rearrangement [58.33383667626998]
Hire-MLP is a simple yet competitive vision MLP architecture built via hierarchical rearrangement.
The proposed Hire-MLP architecture is built with simple channel-mixing operations, thus enjoys high flexibility and inference speed.
Experiments show that our Hire-MLP achieves state-of-the-art performance on the ImageNet-1K benchmark.
arXiv Detail & Related papers (2021-08-30T16:11:04Z)
- CycleMLP: A MLP-like Architecture for Dense Prediction [26.74203747156439]
CycleMLP is a versatile backbone for visual recognition and dense predictions.
It can cope with various image sizes and achieves computational complexity that is linear in the image size by using local windows.
CycleMLP aims to provide a competitive baseline for MLP models on object detection, instance segmentation, and semantic segmentation.
arXiv Detail & Related papers (2021-07-21T17:23:06Z)
- AS-MLP: An Axial Shifted MLP Architecture for Vision [50.11765148947432]
An Axial Shifted MLP architecture (AS-MLP) is proposed in this paper.
By axially shifting channels of the feature map, AS-MLP is able to obtain the information flow from different directions.
With the proposed AS-MLP architecture, our model obtains 83.3% Top-1 accuracy with 88M parameters and 15.2 GFLOPs on the ImageNet-1K dataset.
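The axial shift can be pictured as splitting the channels into groups and shifting each group along one spatial axis by a different offset, so that a following channel-mixing FC sees features from neighboring positions. The hedged sketch below uses a circular torch.roll for simplicity, which handles borders differently from the paper's shift; the function name and arguments are illustrative.

```python
import torch

def axial_shift(x: torch.Tensor, shift_size: int = 3, axis: int = 2) -> torch.Tensor:
    """Hedged sketch of an axial shift: split channels into groups and shift
    each group along one spatial axis (axis=2 for H, axis=3 for W) by a
    different offset. A circular roll is used here for brevity."""
    offsets = list(range(-(shift_size // 2), shift_size // 2 + 1))  # e.g. [-1, 0, 1]
    groups = torch.chunk(x, len(offsets), dim=1)          # split the channel dim
    shifted = [torch.roll(g, off, dims=axis) for g, off in zip(groups, offsets)]
    return torch.cat(shifted, dim=1)

# After shifting along H and then along W, a plain channel-mixing FC can gather
# information from several spatial directions without an explicit convolution.
```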
arXiv Detail & Related papers (2021-07-18T08:56:34Z)
- RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition [123.59890802196797]
We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition.
We construct convolutional layers inside a RepMLP during training and merge them into the FC for inference.
By inserting RepMLP in traditional CNN, we improve ResNets by 1.8% accuracy on ImageNet, 2.9% for face recognition, and 2.3% mIoU on Cityscapes with lower FLOPs.
arXiv Detail & Related papers (2021-05-05T06:17:40Z)