BiViT: Extremely Compressed Binary Vision Transformer
- URL: http://arxiv.org/abs/2211.07091v2
- Date: Thu, 5 Oct 2023 07:59:22 GMT
- Title: BiViT: Extremely Compressed Binary Vision Transformer
- Authors: Yefei He, Zhenyu Lou, Luoming Zhang, Jing Liu, Weijia Wu, Hong Zhou,
Bohan Zhuang
- Abstract summary: We propose to solve two fundamental challenges to push the horizon of Binary Vision Transformers (BiViT)
We propose Softmax-aware Binarization, which dynamically adapts to the data distribution and reduces the error caused by binarization.
Our method outperforms prior state-of-the-art methods by 19.8% on the TinyImageNet dataset.
- Score: 19.985314022860432
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model binarization can significantly compress model size, reduce energy
consumption, and accelerate inference through efficient bit-wise operations.
Although binarizing convolutional neural networks has been extensively
studied, there is little work exploring the binarization of vision Transformers,
which underpin most recent breakthroughs in visual recognition. To this end, we
propose to solve two fundamental challenges to push the horizon of Binary
Vision Transformers (BiViT). First, traditional binarization methods do not take
the long-tailed distribution of softmax attention into consideration, which brings
large binarization errors into the attention module. To solve this, we propose
Softmax-aware Binarization, which dynamically adapts to the data distribution
and reduces the error caused by binarization. Second, to better preserve the
information of the pretrained model and restore accuracy, we propose a
Cross-layer Binarization scheme that decouples the binarization of
self-attention and multi-layer perceptrons (MLPs), and Parameterized Weight
Scales which introduce learnable scaling factors for weight binarization.
Overall, our method outperforms prior state-of-the-art methods by 19.8% on
the TinyImageNet dataset. On ImageNet, BiViT achieves a competitive 75.6%
Top-1 accuracy with the Swin-S model. Additionally, on COCO object detection, our
method achieves an mAP of 40.8 with a Swin-T backbone under the Cascade Mask R-CNN
framework.
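To make these components concrete, here is a minimal PyTorch-style sketch of (i) weight binarization with learnable scaling factors, in the spirit of Parameterized Weight Scales, and (ii) a threshold-based binarizer for the long-tailed softmax attention, in the spirit of Softmax-aware Binarization. The names, the straight-through estimator, and the thresholding rule are illustrative assumptions and may differ from the paper's actual formulation.

```python
# Hypothetical sketch of the two BiViT components described above; class and
# function names, the STE, and the thresholding rule are illustrative only.
import torch
import torch.nn as nn


class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator for the gradient."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Standard STE: pass gradients only where |x| <= 1.
        return grad_out * (x.abs() <= 1).to(grad_out.dtype)


class BinaryLinear(nn.Module):
    """Linear layer with binary weights and learnable per-output scales,
    i.e. W_b = alpha * sign(W) (the Parameterized Weight Scales idea)."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(0.02 * torch.randn(out_features, in_features))
        self.alpha = nn.Parameter(torch.ones(out_features, 1))  # learnable scales

    def forward(self, x):
        w_bin = self.alpha * BinarizeSTE.apply(self.weight)
        return x @ w_bin.t()


def softmax_aware_binarize(attn, tau=0.01):
    """Binarize softmax attention (long-tailed values in [0, 1]) to {0, beta}.

    Plain sign() would collapse almost all entries to the same value, so the
    attention map is thresholded at a (hypothetical) tau and rescaled so each
    row keeps roughly its original attention mass.
    """
    mask = (attn > tau).to(attn.dtype)
    beta = attn.sum(-1, keepdim=True) / mask.sum(-1, keepdim=True).clamp(min=1)
    return mask * beta


if __name__ == "__main__":
    layer = BinaryLinear(16, 8)
    print(layer(torch.randn(4, 16)).shape)      # torch.Size([4, 8])
    attn = torch.softmax(torch.randn(4, 10), dim=-1)
    print(softmax_aware_binarize(attn).shape)   # torch.Size([4, 10])
```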
Related papers
- BiDense: Binarization for Dense Prediction [62.70804353158387]
BiDense is a generalized binary neural network (BNN) designed for efficient and accurate dense prediction tasks.
BiDense incorporates two key techniques: the Distribution-adaptive Binarizer (DAB) and the Channel-adaptive Full-precision Bypass (CFB)
arXiv Detail & Related papers (2024-11-15T16:46:04Z)
- Neuromorphic Synergy for Video Binarization [54.195375576583864]
Bimodal objects serve as a visual form to embed information that can be easily recognized by vision systems.
Neuromorphic cameras offer new capabilities for alleviating motion blur, but it is non-trivial to first de-blur and then binarize the images in a real-time manner.
We propose an event-based binary reconstruction method that leverages the prior knowledge of the bimodal target's properties to perform inference independently in both event space and image space.
We also develop an efficient integration method to propagate this binary image to high frame rate binary video.
arXiv Detail & Related papers (2024-02-20T01:43:51Z)
- ConvNeXt-ChARM: ConvNeXt-based Transform for Efficient Neural Image Compression [18.05997169440533]
We propose ConvNeXt-ChARM, an efficient ConvNeXt-based transform coding framework, paired with a compute-efficient channel-wise auto-regressive prior.
We show that ConvNeXt-ChARM brings consistent and significant BD-rate (PSNR) reductions, averaging 5.24% and 1.22% over the versatile video coding (VVC) reference encoder (VTM-18.0) and the state-of-the-art learned image compression method SwinT-ChARM, respectively.
arXiv Detail & Related papers (2023-07-12T11:45:54Z)
- BinaryViT: Towards Efficient and Accurate Binary Vision Transformers [4.339315098369913]
Vision Transformers (ViTs) have emerged as the fundamental architecture for most computer vision fields.
As one of the most powerful compression methods, binarization reduces the computation of a neural network by quantizing its weights and activation values to $\pm$1 (see the bit-wise dot-product sketch after this entry).
Existing binarization methods have demonstrated excellent performance on CNNs, but the full binarization of ViTs remains under-studied and suffers a significant performance drop.
arXiv Detail & Related papers (2023-05-24T05:06:59Z)
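As background on why $\pm$1 quantization permits bit-wise inference (not taken from any specific paper above), the following self-contained sketch shows that a dot product of two {-1, +1} vectors reduces to XOR plus popcount on their bit-packed encodings, via dot(a, b) = n - 2 * popcount(a XOR b).

```python
# Illustrative only: +/-1 vectors admit a bit-wise dot product
# dot(a, b) = n - 2 * popcount(pack(a) XOR pack(b)).
import random


def dot(a, b):
    """Ordinary dot product of two +/-1 vectors."""
    return sum(x * y for x, y in zip(a, b))


def pack(v):
    """Encode +1 as bit 1 and -1 as bit 0 in a single Python int."""
    bits = 0
    for i, x in enumerate(v):
        if x == 1:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, b_bits, n):
    """XOR marks positions where the vectors disagree; each disagreement
    contributes -1 instead of +1, hence n - 2 * (number of disagreements)."""
    return n - 2 * bin(a_bits ^ b_bits).count("1")


if __name__ == "__main__":
    n = 256
    a = [random.choice((-1, 1)) for _ in range(n)]
    b = [random.choice((-1, 1)) for _ in range(n)]
    assert dot(a, b) == binary_dot(pack(a), pack(b), n)
    print("bit-wise and arithmetic dot products agree:", dot(a, b))
```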
- Binarized Spectral Compressive Imaging [59.18636040850608]
Existing deep learning models for hyperspectral image (HSI) reconstruction achieve good performance but require powerful hardware with enormous memory and computational resources.
We propose a novel method, Binarized Spectral-Redistribution Network (BiSRNet)
BiSRNet is derived by using the proposed techniques to binarize the base model.
arXiv Detail & Related papers (2023-05-17T15:36:08Z)
- GSB: Group Superposition Binarization for Vision Transformer with Limited Training Samples [46.025105938192624]
Vision Transformer (ViT) has performed remarkably in various computer vision tasks.
ViT usually suffers from serious overfitting problems with a relatively limited number of training samples.
We propose a novel model binarization technique, called Group Superposition Binarization (GSB)
arXiv Detail & Related papers (2023-05-13T14:48:09Z)
- BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance [54.214426436283134]
Deep neural networks, such as the Deep-FSMN, have been widely studied for keyword spotting (KWS) applications.
We present a strong yet efficient binary neural network for KWS, namely BiFSMNv2, pushing it to the real-network accuracy performance.
We highlight that benefiting from the compact architecture and optimized hardware kernel, BiFSMNv2 can achieve an impressive 25.1x speedup and 20.2x storage-saving on edge hardware.
arXiv Detail & Related papers (2022-11-13T18:31:45Z)
- Bimodal Distributed Binarized Neural Networks [3.0778860202909657]
Binarization techniques, however, suffer from non-negligible performance degradation compared to their full-precision counterparts.
We propose a Bi-Modal Distributed binarization method that imposes a bi-modal distribution on the network weights via kurtosis regularization (see the sketch after this entry).
arXiv Detail & Related papers (2022-04-05T06:07:05Z)
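A hedged sketch of such a kurtosis regularizer is given below: an auxiliary loss drives each weight tensor's kurtosis toward a low target, favoring a two-peaked weight distribution that binarizes with smaller error. The target value, weighting, and function names are illustrative assumptions, not the paper's settings.

```python
# Hypothetical sketch of a kurtosis regularizer that encourages bi-modal
# weights; the target value and weighting are illustrative, not the paper's.
import torch


def kurtosis(w, eps=1e-8):
    """Sample kurtosis E[((w - mu) / sigma)^4] of a flattened weight tensor."""
    w = w.flatten()
    mu, sigma = w.mean(), w.std() + eps
    return (((w - mu) / sigma) ** 4).mean()


def kurtosis_regularizer(params, target=1.8):
    """Penalize deviation from a low target kurtosis. A symmetric two-peaked
    distribution has kurtosis well below the Gaussian value of 3 (about 1 for
    two equal spikes), so this loss nudges weights toward +/- modes."""
    return sum((kurtosis(p) - target) ** 2 for p in params)


if __name__ == "__main__":
    weights = [torch.randn(64, 64, requires_grad=True)]
    reg = 0.01 * kurtosis_regularizer(weights)  # would be added to the task loss
    reg.backward()
    print(float(reg))
```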
- Binarized Weight Error Networks With a Transition Regularization Term [4.56877715768796]
This paper proposes a novel binarized weight network (BT) for a resource-efficient neural structure.
The proposed model estimates a binary representation of weights by taking into account the approximation error with an additional term.
A novel regularization term is introduced that is suitable for all threshold-based binary precision networks.
arXiv Detail & Related papers (2021-05-09T10:11:26Z)
- Training Binary Neural Networks with Real-to-Binary Convolutions [52.91164959767517]
We show how to train binary networks to within a few percentage points of their full-precision counterparts.
We show how to build a strong baseline, which already achieves state-of-the-art accuracy.
We show that, when putting all of our improvements together, the proposed model beats the current state of the art by more than 5% top-1 accuracy on ImageNet.
arXiv Detail & Related papers (2020-03-25T17:54:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information and is not responsible for any consequences of its use.