Compacting Binary Neural Networks by Sparse Kernel Selection
- URL: http://arxiv.org/abs/2303.14470v1
- Date: Sat, 25 Mar 2023 13:53:02 GMT
- Title: Compacting Binary Neural Networks by Sparse Kernel Selection
- Authors: Yikai Wang, Wenbing Huang, Yinpeng Dong, Fuchun Sun, Anbang Yao
- Abstract summary: This paper is motivated by a previously revealed phenomenon that the binary kernels in successful BNNs are nearly power-law distributed.
We develop the Permutation Straight-Through Estimator (PSTE) that is able to not only optimize the selection process end-to-end but also maintain the non-repetitive occupancy of selected codewords.
Experiments verify that our method reduces both the model size and bit-wise computational costs, and achieves accuracy improvements compared with state-of-the-art BNNs under comparable budgets.
- Score: 58.84313343190488
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Binary Neural Network (BNN) represents convolution weights with 1-bit values,
which enhances the efficiency of storage and computation. This paper is
motivated by a previously revealed phenomenon that the binary kernels in
successful BNNs are nearly power-law distributed: their values are mostly
clustered into a small number of codewords. This phenomenon encourages us to
compact typical BNNs and obtain further close performance through learning
non-repetitive kernels within a binary kernel subspace. Specifically, we regard
the binarization process as kernel grouping in terms of a binary codebook, and
our task lies in learning to select a smaller subset of codewords from the full
codebook. We then leverage the Gumbel-Sinkhorn technique to approximate the
codeword selection process, and develop the Permutation Straight-Through
Estimator (PSTE) that is able to not only optimize the selection process
end-to-end but also maintain the non-repetitive occupancy of selected
codewords. Experiments verify that our method reduces both the model size and
bit-wise computational costs, and achieves accuracy improvements compared with
state-of-the-art BNNs under comparable budgets.
Related papers
- FOBNN: Fast Oblivious Binarized Neural Network Inference [12.587981899648419]
We develop a fast oblivious binarized neural network inference framework, FOBNN.
Specifically, we customize binarized convolutional neural networks to enhance oblivious inference, design two fast algorithms for binarized convolutions, and optimize network structures experimentally under constrained costs.
arXiv Detail & Related papers (2024-05-06T03:12:36Z) - Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z) - Towards Accurate Binary Neural Networks via Modeling Contextual
Dependencies [52.691032025163175]
Existing Binary Neural Networks (BNNs) operate mainly on local convolutions with binarization function.
We present new designs of binary neural modules, which enables leading binary neural modules by a large margin.
arXiv Detail & Related papers (2022-09-03T11:51:04Z) - AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets [27.022212653067367]
This paper studies the Binary Neural Networks (BNNs) in which weights and activations are both binarized into 1-bit values.
We present a simple yet effective approach called AdaBin to adaptively obtain the optimal binary sets.
Experimental results on benchmark models and datasets demonstrate that the proposed AdaBin is able to achieve state-of-the-art performance.
arXiv Detail & Related papers (2022-08-17T05:43:33Z) - Sub-bit Neural Networks: Learning to Compress and Accelerate Binary
Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and the hardware deployment on FPGA validate the great potentials of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z) - Quantized Neural Networks via {-1, +1} Encoding Decomposition and
Acceleration [83.84684675841167]
We propose a novel encoding scheme using -1, +1 to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z) - FTBNN: Rethinking Non-linearity for 1-bit CNNs and Going Beyond [23.5996182207431]
We show that binarized convolution process owns an increasing linearity towards the target of minimizing such error, which in turn hampers BNN's discriminative ability.
We re-investigate and tune proper non-linear modules to fix that contradiction, leading to a strong baseline which achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-10-19T08:11:48Z) - Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose a use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs)
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.