BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons
- URL: http://arxiv.org/abs/2212.14158v1
- Date: Thu, 29 Dec 2022 02:43:41 GMT
- Title: BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons
- Authors: Yixing Xu, Xinghao Chen, Yunhe Wang
- Abstract summary: This paper studies the problem of designing compact binary architectures for vision multi-layer perceptrons (MLPs).
We find that previous binarization methods perform poorly due to the limited capacity of binary MLPs.
We propose to improve the performance of the binary MLP (BiMLP) model by enriching the representation ability of binary FC layers.
- Score: 37.28828605119602
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the problem of designing compact binary architectures for
vision multi-layer perceptrons (MLPs). We provide extensive analysis on the
difficulty of binarizing vision MLPs and find that previous binarization
methods perform poorly due to the limited capacity of binary MLPs. In contrast
with traditional CNNs that utilize convolutional operations with large kernel
sizes, the fully-connected (FC) layers in MLPs can be treated as convolutional
layers with kernel size $1\times1$. Thus, the representation ability of the FC
layers is limited when they are binarized, which places restrictions on the
capability of spatial mixing and channel mixing on the intermediate features.
To this end, we propose to improve the performance of the binary MLP (BiMLP) model
by enriching the representation ability of binary FC layers. We design a novel
binary block that contains multiple branches to merge a series of outputs from
the same stage, and also a universal shortcut connection that encourages the
information flow from the previous stage. The downsampling layers are also
carefully designed to reduce the computational complexity while maintaining the
classification performance. Experimental results on the benchmark dataset
ImageNet-1k demonstrate the effectiveness of the proposed BiMLP models, which
achieve state-of-the-art accuracy compared to prior binary CNNs. The MindSpore
code is available at
\url{https://gitee.com/mindspore/models/tree/master/research/cv/BiMLP}.
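To make the abstract's two main ideas concrete, the following is a minimal PyTorch-style sketch of (a) an FC layer (equivalently a $1\times1$ convolution) binarized with a sign function, a per-channel scale, and a straight-through estimator, and (b) a block that merges multiple binary branches under an identity shortcut. The binarization scheme follows common XNOR-Net-style practice; the class names, branch count, and normalization placement are illustrative assumptions, not the exact BiMLP design (the official MindSpore code is linked above).

```python
# Hedged sketch: binary FC layer + multi-branch binary block with shortcut.
# Illustrative only; not the official BiMLP implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    """sign() in the forward pass; clipped straight-through gradient in backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Pass gradients only where |x| <= 1 (the usual clipped STE).
        return grad_out * (x.abs() <= 1).to(grad_out.dtype)


class BinaryFC(nn.Module):
    """An FC layer (equivalently a 1x1 convolution) with binary weights and inputs."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)

    def forward(self, x):  # x: (batch, tokens, in_features)
        bw = BinarizeSTE.apply(self.weight)
        bx = BinarizeSTE.apply(x)
        # A per-output-channel scale recovers some of the lost dynamic range.
        alpha = self.weight.abs().mean(dim=1)
        return F.linear(bx, bw) * alpha


class MultiBranchBinaryBlock(nn.Module):
    """Merges two parallel binary FC branches and keeps an identity shortcut."""

    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.branch_a = BinaryFC(dim, dim)
        self.branch_b = BinaryFC(dim, dim)

    def forward(self, x):
        y = self.norm(x)
        # Summing several binary branches enlarges the small set of values a
        # single binary FC layer can express; the shortcut keeps full-precision
        # information flowing from the previous stage.
        return x + self.branch_a(y) + self.branch_b(y)


if __name__ == "__main__":
    block = MultiBranchBinaryBlock(dim=96)
    tokens = torch.randn(2, 49, 96)   # (batch, tokens, channels)
    print(block(tokens).shape)        # torch.Size([2, 49, 96])
```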
Related papers
- SCHEME: Scalable Channel Mixer for Vision Transformers [52.605868919281086]
Vision Transformers have achieved impressive performance in many vision tasks.
Much less research has been devoted to the channel mixer or feature mixing block (FFN or MLP).
We show that the dense connections can be replaced with a diagonal block structure that supports larger expansion ratios (a generic sketch of such a block-diagonal mixer appears after this list).
arXiv Detail & Related papers (2023-12-01T08:22:34Z)
- Caterpillar: A Pure-MLP Architecture with Shifted-Pillars-Concatenation [68.24659910441736]
Shifted-Pillars-Concatenation (SPC) module offers superior local modeling power and performance gains.
We build a pure-MLP architecture called Caterpillar by replacing the convolutional layer with the SPC module in a hybrid model of sMLPNet.
Experiments show Caterpillar's excellent performance on both small-scale and ImageNet-1k classification benchmarks.
arXiv Detail & Related papers (2023-05-28T06:19:36Z)
- UNeXt: MLP-based Rapid Medical Image Segmentation Network [80.16644725886968]
UNet and its latest extensions like TransUNet have been the leading medical image segmentation methods in recent years.
We propose UNeXt, a Convolutional multilayer perceptron (MLP) based network for image segmentation.
We show that we reduce the number of parameters by 72x, decrease the computational complexity by 68x, and improve the inference speed by 10x while also obtaining better segmentation performance.
arXiv Detail & Related papers (2022-03-09T18:58:22Z)
- RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality [113.1414517605892]
We propose a methodology, Locality Injection, to incorporate local priors into an FC layer.
RepMLPNet is the first MLP model that seamlessly transfers to Cityscapes semantic segmentation.
arXiv Detail & Related papers (2021-12-21T10:28:17Z)
- ConvMLP: Hierarchical Convolutional MLPs for Vision [7.874749885641495]
We propose a hierarchical ConvMLP: a light-weight, stage-wise co-design for visual recognition.
We show that ConvMLP can be seamlessly transferred and achieve competitive results with fewer parameters.
arXiv Detail & Related papers (2021-09-09T17:52:57Z)
- Sparse-MLP: A Fully-MLP Architecture with Conditional Computation [7.901786481399378]
Mixture-of-Experts (MoE) with sparse conditional computation has proven to be an effective architecture for scaling attention-based models to more parameters with comparable computation cost.
We propose Sparse-MLP, scaling the recent MLP-Mixer model with MoE to achieve a more efficient architecture.
arXiv Detail & Related papers (2021-09-05T06:43:08Z)
- CycleMLP: A MLP-like Architecture for Dense Prediction [26.74203747156439]
CycleMLP is a versatile backbone for visual recognition and dense predictions.
It can cope with various image sizes and achieves computational complexity linear in image size by using local windows.
CycleMLP aims to provide a competitive baseline on object detection, instance segmentation, and semantic segmentation for MLP-like models.
arXiv Detail & Related papers (2021-07-21T17:23:06Z)
- RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition [123.59890802196797]
We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition.
We construct convolutional layers inside a RepMLP during training and merge them into the FC for inference (a toy demonstration of this merging appears after this list).
By inserting RepMLP in traditional CNN, we improve ResNets by 1.8% accuracy on ImageNet, 2.9% for face recognition, and 2.3% mIoU on Cityscapes with lower FLOPs.
arXiv Detail & Related papers (2021-05-05T06:17:40Z)
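The RepMLP entry above hinges on the fact that, on a fixed-size feature map, a convolution is a linear map and can therefore be merged into an equivalent fully-connected matrix for inference. Below is a hedged toy demonstration of that equivalence; the `conv_to_fc_matrix` helper and all sizes are illustrative assumptions, not the RepMLP implementation.

```python
# Toy demonstration: a conv on a fixed-size feature map equals an FC layer.
# Illustrative only; not the RepMLP code.
import torch
import torch.nn as nn


def conv_to_fc_matrix(conv: nn.Conv2d, c: int, h: int, w: int) -> torch.Tensor:
    """Return the (c*h*w, c*h*w) matrix equivalent to `conv` on c x h x w inputs."""
    eye = torch.eye(c * h * w).reshape(c * h * w, c, h, w)
    with torch.no_grad():
        out = conv(eye)                   # conv applied to each basis vector
    return out.reshape(c * h * w, -1).t() # basis responses become matrix columns


c, h, w = 4, 7, 7
conv = nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False)
fc_weight = conv_to_fc_matrix(conv, c, h, w)

x = torch.randn(1, c, h, w)
y_conv = conv(x).reshape(-1)
y_fc = fc_weight @ x.reshape(-1)
print(torch.allclose(y_conv, y_fc, atol=1e-5))  # True
```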
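The SCHEME entry above mentions replacing dense channel-mixing connections with a diagonal block structure. As a rough illustration of why this frees budget for larger expansion ratios, here is a generic block-diagonal channel mixer built from grouped 1x1 convolutions; the class name, group count, and activation placement are assumptions, not the SCHEME design.

```python
# Generic block-diagonal channel mixer (grouped FC layers). Illustrative only.
import torch
import torch.nn as nn


class BlockDiagonalMixer(nn.Module):
    """Channel mixer whose weight matrices are block-diagonal with `groups` blocks."""

    def __init__(self, dim: int, expansion: int = 4, groups: int = 4):
        super().__init__()
        assert dim % groups == 0
        # A grouped 1x1 convolution realizes a block-diagonal weight matrix:
        # each group of dim/groups channels is mixed by its own dense block.
        self.up = nn.Conv1d(dim, dim * expansion, kernel_size=1, groups=groups)
        self.act = nn.GELU()
        self.down = nn.Conv1d(dim * expansion, dim, kernel_size=1, groups=groups)

    def forward(self, x):  # x: (batch, tokens, dim)
        x = x.transpose(1, 2)                 # -> (batch, dim, tokens)
        x = self.down(self.act(self.up(x)))
        return x.transpose(1, 2)


mixer = BlockDiagonalMixer(dim=64, expansion=8, groups=4)
print(mixer(torch.randn(2, 49, 64)).shape)  # torch.Size([2, 49, 64])
# Parameters per FC scale as dim^2 * expansion / groups, which is why a
# block-diagonal mixer can afford a larger expansion ratio at equal cost.
```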