QuantNet: Learning to Quantize by Learning within Fully Differentiable
Framework
- URL: http://arxiv.org/abs/2009.04626v1
- Date: Thu, 10 Sep 2020 01:41:05 GMT
- Title: QuantNet: Learning to Quantize by Learning within Fully Differentiable
Framework
- Authors: Junjie Liu, Dongchao Wen, Deyu Wang, Wei Tao, Tse-Wei Chen, Kinya Osa,
and Masami Kato
- Abstract summary: This paper proposes a meta-based quantizer named QuantNet, which utilizes a differentiable sub-network to directly binarize the full-precision weights.
Our method not only solves the problem of gradient mismatching, but also reduces the impact of the discretization errors caused by the binarization operation at deployment.
- Score: 32.465949985191635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the achievements of recent binarization methods on reducing the
performance degradation of Binary Neural Networks (BNNs), gradient mismatching
caused by the Straight-Through-Estimator (STE) still dominates quantized
networks. This paper proposes a meta-based quantizer named QuantNet, which
utilizes a differentiable sub-network to directly binarize the full-precision
weights without resorting to STE or any learnable gradient estimators. Our
method not only solves the problem of gradient mismatching, but also reduces
the performance impact of the discretization errors caused by the binarization
operation at deployment. The proposed algorithm is implemented within a fully
differentiable framework, and is easily extended to general network
quantization with arbitrary bit-widths. Quantitative experiments on CIFAR-100
and ImageNet demonstrate that QuantNet achieves significant improvements over
previous binarization methods, and even narrows the accuracy gap between
binarized and full-precision models.
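To make the core idea concrete, here is a minimal PyTorch sketch of a meta-quantizer in the spirit described above: a small differentiable sub-network maps each full-precision weight to a soft binary value, so gradients reach the weights through ordinary backpropagation rather than an STE. The two-layer MLP with a tanh output and the mean-magnitude rescaling are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class MetaBinarizer(nn.Module):
    """Differentiable sub-network that maps full-precision weights to
    soft binary values in (-1, 1); no Straight-Through Estimator needed."""

    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Tanh(),  # output pushed toward +/-1 as training progresses
        )

    def forward(self, w):
        soft_bin = self.net(w.reshape(-1, 1)).reshape(w.shape)
        # rescale by the mean magnitude of the original weights (common in BNNs)
        return soft_bin * w.abs().mean()


# Hypothetical usage: quantize a conv layer's weight for the forward pass.
conv = nn.Conv2d(64, 64, 3, padding=1)
quantizer = MetaBinarizer()
w_q = quantizer(conv.weight)  # differentiable w.r.t. conv.weight and quantizer
# At deployment, torch.sign(w_q) would be used; the gap between w_q and its
# sign is the discretization error the paper aims to keep small.
```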
Related papers
- QGait: Toward Accurate Quantization for Gait Recognition with Binarized Input [17.017127559393398]
We propose a differentiable soft quantizer, which better simulates the gradient of the round function during backpropagation.
This enables the network to learn from subtle input perturbations.
We further refine the training strategy to ensure convergence while simulating quantization errors.
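As a rough illustration of a differentiable soft quantizer (not necessarily QGait's exact form), a tanh-shaped step can replace the flat, zero-gradient segments of the round function so that small input perturbations still produce gradients:

```python
import math
import torch

def soft_round(x: torch.Tensor, alpha: float = 5.0) -> torch.Tensor:
    """Differentiable surrogate for torch.round.

    Inside each unit interval a rescaled tanh bends x toward the nearest
    integer; larger alpha makes the surrogate closer to a hard round while
    keeping a non-zero gradient everywhere.
    """
    floor = torch.floor(x)   # zero gradient, used only as an integer offset
    frac = x - floor         # gradient 1 w.r.t. x
    step = 0.5 * (torch.tanh(alpha * (frac - 0.5)) / math.tanh(alpha * 0.5) + 1.0)
    return floor + step
```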
arXiv Detail & Related papers (2024-05-22T17:34:18Z) - Binarizing Sparse Convolutional Networks for Efficient Point Cloud
Analysis [93.55896765176414]
We propose binary sparse convolutional networks called BSC-Net for efficient point cloud analysis.
We employ differentiable search strategies to discover the optimal positions for active site matching in the shifted sparse convolution.
Our BSC-Net achieves significant improvement upon our strong baseline and outperforms the state-of-the-art network binarization methods.
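A toy sketch of what a differentiable search over discrete shift positions can look like (an assumption about the general recipe, not BSC-Net's formulation for sparse point-cloud convolutions): learnable logits score each candidate offset, and the softmax-weighted mixture of shifted feature maps is used during the search, so the choice of offset is trainable by gradient descent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferentiableShiftSearch(nn.Module):
    """Softmax-relaxed search over a discrete set of spatial shifts."""

    def __init__(self, candidates=((0, 0), (0, 1), (1, 0), (-1, 0), (0, -1))):
        super().__init__()
        self.candidates = candidates
        self.logits = nn.Parameter(torch.zeros(len(candidates)))

    def forward(self, x):  # x: (N, C, H, W)
        weights = F.softmax(self.logits, dim=0)
        shifted = [torch.roll(x, shifts=c, dims=(2, 3)) for c in self.candidates]
        # weighted mixture during search; the argmax offset is kept afterwards
        return sum(w * s for w, s in zip(weights, shifted))
```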
arXiv Detail & Related papers (2023-03-27T13:47:06Z) - BiTAT: Neural Network Binarization with Task-dependent Aggregated
Transformation [116.26521375592759]
Quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation.
Extreme quantization (1-bit weight/1-bit activations) of compactly-designed backbone architectures results in severe performance degeneration.
This paper proposes a novel Quantization-Aware Training (QAT) method that can effectively alleviate performance degeneration.
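For context, quantization-aware training typically inserts a "fake quantization" step into the forward pass and approximates its gradient in the backward pass. Below is a minimal PyTorch sketch of that generic QAT building block only; BiTAT's task-dependent aggregated transformation is not reproduced here.

```python
import torch

class FakeQuant(torch.autograd.Function):
    """Quantize in the forward pass, pass the gradient straight through
    in the backward pass (the usual QAT building block)."""

    @staticmethod
    def forward(ctx, x, bits=2):
        if bits == 1:
            return torch.sign(x) * x.abs().mean()
        qmax = 2 ** (bits - 1) - 1
        scale = x.abs().max() / qmax + 1e-8
        return torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None  # straight-through estimate for the quantizer

def fake_quant(x, bits=2):
    return FakeQuant.apply(x, bits)
```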
arXiv Detail & Related papers (2022-07-04T13:25:49Z) - Bimodal Distributed Binarized Neural Networks [3.0778860202909657]
Binarization techniques, however, suffer from non-negligible performance degradation compared to their full-precision counterparts.
We propose a Bi-Modal Distributed binarization method that imposes a bi-modal distribution on the network weights via kurtosis regularization.
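A minimal sketch of kurtosis regularization on a weight tensor (the target value and the squared-error form are illustrative assumptions; see the paper for the exact objective):

```python
import torch

def kurtosis_loss(w: torch.Tensor, target: float = 1.0) -> torch.Tensor:
    """Penalize deviation of the empirical kurtosis of w from a target.

    A symmetric two-spike (bi-modal) distribution has kurtosis 1, so pushing
    each weight tensor's kurtosis toward a low target encourages the weights
    to cluster around two modes, which is friendlier to binarization.
    """
    w = w.flatten()
    z = (w - w.mean()) / (w.std() + 1e-8)
    kurt = z.pow(4).mean()
    return (kurt - target).pow(2)
```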
arXiv Detail & Related papers (2022-04-05T06:07:05Z) - Post-training Quantization for Neural Networks with Provable Guarantees [9.58246628652846]
We modify a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism.
We prove that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights.
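The greedy path-following idea can be sketched as follows: weights are quantized one at a time, and each choice is made so that the running quantized output tracks the full-precision output on calibration data. This is a simplified reconstruction of the mechanism, not a faithful reimplementation of GPFQ.

```python
import numpy as np

def greedy_path_quantize(W, X, alphabet):
    """W: (m, n) weights (one row per neuron), X: (T, n) calibration inputs,
    alphabet: 1-D array of allowed quantization levels."""
    m, n = W.shape
    Q = np.zeros_like(W)
    for i in range(m):
        u = np.zeros(X.shape[0])              # running output residual
        for t in range(n):
            x_t = X[:, t]
            v = u + W[i, t] * x_t
            # level that best explains the residual along x_t
            c = v @ x_t / (x_t @ x_t + 1e-12)
            Q[i, t] = alphabet[np.argmin(np.abs(alphabet - c))]
            u = v - Q[i, t] * x_t
    return Q
```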
arXiv Detail & Related papers (2022-01-26T18:47:38Z) - Cluster-Promoting Quantization with Bit-Drop for Minimizing Network
Quantization Loss [61.26793005355441]
Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks.
DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons.
We experimentally validate our method on various benchmark datasets and network architectures.
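A loose analogue of the "drop bits instead of neurons" idea (an illustrative assumption, not the paper's exact DropBits formulation): during training, each forward pass quantizes at a randomly reduced bit-width, so the least-significant bits are stochastically discarded the way dropout discards activations.

```python
import torch

def dropbits_fake_quant(x, max_bits=8, min_bits=2, training=True):
    """Quantize x at a randomly reduced bit-width during training."""
    bits = max_bits
    if training:
        bits = int(torch.randint(min_bits, max_bits + 1, (1,)).item())
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().max() / qmax + 1e-8
    return torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
```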
arXiv Detail & Related papers (2021-09-05T15:15:07Z) - Direct Quantization for Training Highly Accurate Low Bit-width Deep
Neural Networks [73.29587731448345]
This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations.
First, to obtain low bit-width weights, most existing methods derive the quantized weights by quantizing the full-precision network weights.
Second, to obtain low bit-width activations, existing works consider all channels equally.
arXiv Detail & Related papers (2020-12-26T15:21:18Z) - Where Should We Begin? A Low-Level Exploration of Weight Initialization
Impact on Quantized Behaviour of Deep Neural Networks [93.4221402881609]
We present an in-depth, fine-grained ablation study of the effect of different weight initializations on the final distributions of weights and activations in different CNN architectures.
To the best of our knowledge, we are the first to perform such a low-level, in-depth quantitative analysis of weight initialization and its effect on quantized behaviour.
arXiv Detail & Related papers (2020-11-30T06:54:28Z) - Fully Quantized Image Super-Resolution Networks [81.75002888152159]
We propose a Fully Quantized image Super-Resolution framework (FQSR) to jointly optimize efficiency and accuracy.
We apply our quantization scheme on multiple mainstream super-resolution architectures, including SRResNet, SRGAN and EDSR.
Our FQSR with low-bit quantization can achieve performance on par with the full-precision counterparts on five benchmark datasets.
arXiv Detail & Related papers (2020-11-29T03:53:49Z) - AUSN: Approximately Uniform Quantization by Adaptively Superimposing
Non-uniform Distribution for Deep Neural Networks [0.7378164273177589]
Existing uniform and non-uniform quantization methods exhibit an inherent conflict between representation range and representation resolution.
We propose a novel quantization method to quantize the weight and activation.
The key idea is to Approximate the Uniform quantization by Adaptively Superposing multiple Non-uniform quantized values, namely AUSN.
arXiv Detail & Related papers (2020-07-08T05:10:53Z) - Meta-Learning with Network Pruning [40.07436648243748]
We propose a network-pruning-based meta-learning approach that reduces overfitting by explicitly controlling the capacity of the network.
We have implemented our approach on top of Reptile, combined with two network pruning routines: Dense-Sparse-Dense (DSD) and Iterative Hard Thresholding (IHT).
arXiv Detail & Related papers (2020-07-07T06:13:11Z)
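One Iterative Hard Thresholding (IHT) step can be sketched as magnitude pruning of each layer: keep the largest-magnitude weights and zero the rest (a generic IHT routine, not the paper's exact Reptile integration or schedule).

```python
import torch

def iht_step(model: torch.nn.Module, sparsity: float = 0.5) -> torch.nn.Module:
    """Zero out all but the top-(1 - sparsity) fraction of weights, by magnitude,
    in each weight matrix; biases and 1-D parameters are left untouched."""
    for param in model.parameters():
        if param.dim() < 2:
            continue
        k = max(1, int(param.numel() * (1.0 - sparsity)))   # weights to keep
        threshold = param.detach().abs().flatten().kthvalue(param.numel() - k + 1).values
        mask = (param.detach().abs() >= threshold).float()
        with torch.no_grad():
            param.mul_(mask)
    return model
```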
This list is automatically generated from the titles and abstracts of the papers on this site.