Soft Threshold Ternary Networks
- URL: http://arxiv.org/abs/2204.01234v1
- Date: Mon, 4 Apr 2022 04:43:42 GMT
- Title: Soft Threshold Ternary Networks
- Authors: Weixiang Xu, Xiangyu He, Tianli Zhao, Qinghao Hu, Peisong Wang and
Jian Cheng
- Abstract summary: In previous ternarized neural networks, a hard threshold Delta is introduced to determine quantization intervals.
We present the Soft Threshold Ternary Networks (STTN), which enables the model to automatically determine quantization intervals.
Our method dramatically outperforms current state-of-the-art methods, narrowing the performance gap between full-precision networks and extreme low-bit networks.
- Score: 36.722958963130665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large neural networks are difficult to deploy on mobile devices because of
intensive computation and storage. To alleviate this, we study ternarization, a
balance between efficiency and accuracy that quantizes both weights and
activations into ternary values. In previous ternarized neural networks, a hard
threshold {\Delta} is introduced to determine quantization intervals. Although
the selection of {\Delta} greatly affects the training results, previous works
estimate {\Delta} via an approximation or treat it as a hyper-parameter, which
is suboptimal. In this paper, we present the Soft Threshold Ternary Networks
(STTN), which enables the model to automatically determine quantization
intervals instead of depending on a hard threshold. Concretely, we replace the
original ternary kernel with the addition of two binary kernels at training
time, where ternary values are determined by the combination of two
corresponding binary values. At inference time, we add up the two binary
kernels to obtain a single ternary kernel. Our method dramatically outperforms
current state-of-the-art methods, narrowing the performance gap between
full-precision networks and extreme low-bit networks. Experiments on ImageNet
with ResNet-18 (66.2% Top-1) achieve a new state of the art.
Update: In this version, we further fine-tune the experimental
hyperparameters and training procedure. The latest STTN shows that ResNet-18
with ternary weights and ternary activations achieves up to 68.2% Top-1
accuracy on ImageNet. Code is available at: github.com/WeixiangXu/STTN.
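For intuition, a minimal NumPy sketch is given below (not the authors' released PyTorch code at github.com/WeixiangXu/STTN). It contrasts the hard-threshold rule criticized in the abstract with the sum-of-two-binary-kernels view that STTN uses at inference time; the 0.7 ratio used to estimate Delta and the 1/2 scaling are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def hard_threshold_ternarize(w, delta_ratio=0.7):
    """Hard-threshold ternarization: Delta is hand-estimated (here as a
    fraction of the mean absolute weight, a TWN-style heuristic) and
    weights inside (-Delta, Delta) are zeroed."""
    delta = delta_ratio * np.mean(np.abs(w))
    return np.sign(w) * (np.abs(w) > delta)   # values in {-1, 0, +1}

def sum_of_binary_kernels(b1, b2):
    """STTN's inference-time view: adding two binary (+1/-1) kernels yields
    a ternary kernel, since (+1)+(+1)=2, (-1)+(-1)=-2 and (+1)+(-1)=0;
    the factor of 2 is folded into a scale here for illustration."""
    return (b1 + b2) / 2                      # values in {-1, 0, +1}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(3, 3))
    print(hard_threshold_ternarize(w))        # hard-threshold baseline

    # Two hypothetical binary kernels standing in for the trained pair.
    b1 = np.sign(rng.normal(size=(3, 3)))
    b2 = np.sign(rng.normal(size=(3, 3)))
    print(sum_of_binary_kernels(b1, b2))      # ternary kernel from two binary ones
```

In the paper itself, the two binary kernels are what get trained (so no Delta needs to be chosen), and summing them recovers a single ternary kernel for deployment; the sketch only illustrates that correspondence.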
Related papers
- BiPer: Binary Neural Networks using a Periodic Function [17.461853355858022]
Quantized neural networks employ reduced precision representations for both weights and activations.
Binary Neural Networks (BNNs) are the extreme quantization case, representing values with just one bit.
In contrast to current BNN approaches, we propose to employ a binary periodic (BiPer) function during binarization.
arXiv Detail & Related papers (2024-04-01T17:52:17Z)
- Speed Limits for Deep Learning [67.69149326107103]
Recent advances in thermodynamics allow bounding the speed at which one can go from the initial weight distribution to the final distribution of the fully trained network.
We provide analytical expressions for these speed limits for linear and linearizable neural networks.
Remarkably, given some plausible scaling assumptions on the NTK spectra and the spectral decomposition of the labels, learning is optimal in a scaling sense.
arXiv Detail & Related papers (2023-07-27T06:59:46Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs [9.807687918954763]
Convolutional Neural Networks (CNNs) have become the standard class of deep neural network for image processing, classification and segmentation tasks.
RedBit is an open-source framework that provides a transparent, easy-to-use interface to evaluate the effectiveness of different algorithms on network accuracy.
arXiv Detail & Related papers (2023-01-15T21:27:35Z)
- AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets [27.022212653067367]
This paper studies the Binary Neural Networks (BNNs) in which weights and activations are both binarized into 1-bit values.
We present a simple yet effective approach called AdaBin to adaptively obtain the optimal binary sets.
Experimental results on benchmark models and datasets demonstrate that the proposed AdaBin is able to achieve state-of-the-art performance.
arXiv Detail & Related papers (2022-08-17T05:43:33Z)
- FATNN: Fast and Accurate Ternary Neural Networks [89.07796377047619]
Ternary Neural Networks (TNNs) have received much attention due to being potentially orders of magnitude faster in inference, as well as more power efficient, than full-precision counterparts.
In this work, we show that, under some mild constraints, the computational complexity of the ternary inner product can be reduced by a factor of 2.
We elaborately design an implementation-dependent ternary quantization algorithm to mitigate the performance gap.
arXiv Detail & Related papers (2020-08-12T04:26:18Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU and find that the accuracy decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding FPN networks, but have only 1/4 the memory cost and run 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
- Learning Sparse & Ternary Neural Networks with Entropy-Constrained Trained Ternarization (EC2T) [17.13246260883765]
Deep neural networks (DNNs) have shown remarkable success in a variety of machine learning applications.
In recent years, there is an increasing interest in deploying DNNs to resource-constrained devices with limited energy, memory, and computational budget.
We propose Entropy-Constrained Trained Ternarization (EC2T), a general framework to create sparse and ternary neural networks.
arXiv Detail & Related papers (2020-04-02T15:38:00Z)
- Training Binary Neural Networks with Real-to-Binary Convolutions [52.91164959767517]
We show how to train binary networks to within a few percentage points of their full-precision counterparts.
We show how to build a strong baseline, which already achieves state-of-the-art accuracy.
We show that, when putting all of our improvements together, the proposed model beats the current state of the art by more than 5% top-1 accuracy on ImageNet.
arXiv Detail & Related papers (2020-03-25T17:54:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.