QEBVerif: Quantization Error Bound Verification of Neural Networks
- URL: http://arxiv.org/abs/2212.02781v2
- Date: Tue, 23 May 2023 14:06:13 GMT
- Title: QEBVerif: Quantization Error Bound Verification of Neural Networks
- Authors: Yedi Zhang and Fu Song and Jun Sun
- Abstract summary: quantization is widely regarded as one promising technique for deploying deep neural networks (DNNs) on edge devices.
Existing verification methods focus on either individual neural networks (DNNs or QNNs) or quantization error bound for partial quantization.
We propose a quantization error bound verification method, named QEBVerif, where both weights and activation tensors are quantized.
- Score: 6.327780998441913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To alleviate the practical constraints for deploying deep neural networks
(DNNs) on edge devices, quantization is widely regarded as one promising
technique. It reduces the resource requirements for computational power and
storage space by quantizing the weights and/or activation tensors of a DNN into
lower bit-width fixed-point numbers, resulting in quantized neural networks
(QNNs). While it has been empirically shown to introduce minor accuracy loss,
critical verified properties of a DNN might become invalid once quantized.
Existing verification methods focus on either individual neural networks (DNNs
or QNNs) or quantization error bound for partial quantization. In this work, we
propose a quantization error bound verification method, named QEBVerif, where
both weights and activation tensors are quantized. QEBVerif consists of two
parts, i.e., a differential reachability analysis (DRA) and a mixed-integer
linear programming (MILP) based verification method. DRA performs difference
analysis between the DNN and its quantized counterpart layer-by-layer to
compute a tight quantization error interval efficiently. If DRA fails to prove
the error bound, then we encode the verification problem into an equivalent
MILP problem which can be solved by off-the-shelf solvers. Thus, QEBVerif is
sound, complete, and reasonably efficient. We implement QEBVerif and conduct
extensive experiments, showing its effectiveness and efficiency.
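To make the two-stage pipeline concrete, below is a minimal sketch of the differential idea behind DRA for a single affine layer: instead of bounding each network separately, one bounds the output difference (Wq - W)x + (bq - b) directly over an input box, which stays tight when the quantized weights are close to the originals. The symmetric uniform quantizer, the function names, and the single-layer setting are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a DRA-style, layer-wise difference analysis between a DNN
# affine layer and its quantized counterpart. The quantization scheme, names,
# and single-layer setting are illustrative assumptions, not QEBVerif's code.
import numpy as np

def quantize(x, bits=8):
    """Symmetric uniform fixed-point quantization (an assumed scheme)."""
    scale = (2 ** (bits - 1) - 1) / max(np.max(np.abs(x)), 1e-12)
    q = np.clip(np.round(x * scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q / scale  # de-quantized values actually used at inference time

def affine_interval(W, b, lo, hi):
    """Propagate the input box [lo, hi] through y = W @ x + b with interval arithmetic."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def difference_interval(W, b, Wq, bq, lo, hi):
    """Bound (quantized output minus float output) over all x in [lo, hi].

    The difference is itself affine, (Wq - W) @ x + (bq - b), so bounding it
    directly gives a much tighter interval than subtracting two independently
    computed output ranges; that is the point of a differential analysis."""
    return affine_interval(Wq - W, bq - b, lo, hi)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W, b = rng.normal(size=(4, 8)), rng.normal(size=4)
    Wq, bq = quantize(W), quantize(b)
    lo, hi = -np.ones(8), np.ones(8)            # input region of interest
    err_lo, err_hi = difference_interval(W, b, Wq, bq, lo, hi)
    print("per-output quantization error interval:")
    for l, h in zip(err_lo, err_hi):
        print(f"  [{l:+.4f}, {h:+.4f}]")
    # If such intervals are too loose to prove a target error bound, QEBVerif
    # falls back to an exact MILP encoding solved by an off-the-shelf solver.
```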
Related papers
- SPFQ: A Stochastic Algorithm and Its Error Analysis for Neural Network Quantization [5.982922468400901]
We show that it is possible to achieve error bounds equivalent to those obtained in the order of the weights of a neural layer.
We prove that it is possible to achieve full-network bounds under an infinite alphabet and minimal assumptions on the input data.
arXiv Detail & Related papers (2023-09-20T00:35:16Z)
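As a small illustration of the stochastic ingredient in quantization schemes such as SPFQ above, the sketch below implements unbiased stochastic rounding to a uniform grid. The step size is an assumption made for illustration; the paper's sequential, data-dependent algorithm and its error analysis are not reproduced here.

```python
# Illustrative unbiased stochastic rounding to a uniform grid, the basic
# primitive behind stochastic quantization schemes such as SPFQ. The step size
# is an assumption; the paper's sequential algorithm is not reproduced here.
import numpy as np

def stochastic_round(w, step=2.0 ** -4, rng=None):
    """Round each entry of w to a multiple of `step`, rounding up with
    probability equal to the fractional part, so E[output] == w (unbiased)."""
    rng = np.random.default_rng() if rng is None else rng
    scaled = np.asarray(w, dtype=float) / step
    floor = np.floor(scaled)
    round_up = rng.random(size=scaled.shape) < (scaled - floor)
    return (floor + round_up) * step

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    w = rng.normal(size=4)
    print("weights:  ", np.round(w, 4))
    print("quantized:", stochastic_round(w, rng=rng))
```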
- QVIP: An ILP-based Formal Verification Approach for Quantized Neural Networks [14.766917269393865]
Quantization has emerged as a promising technique to reduce the size of neural networks with accuracy comparable to that of their floating-point counterparts.
We propose a novel and efficient formal verification approach for QNNs.
In particular, we are the first to propose an encoding that reduces the verification problem of QNNs into the solving of integer linear constraints.
arXiv Detail & Related papers (2022-12-10T03:00:29Z)
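To give a flavour of what reducing QNN reasoning to integer linear constraints can look like, here is a toy encoding of a single quantized neuron with the assumed semantics y = clamp(floor((w.x + b) / 4), 0, 7) over integer inputs in [0, 15], together with a reachability query that maximises the output. The neuron, the bounds, the big-M constant, and the use of PuLP/CBC are illustrative choices, not QVIP's actual encoding.

```python
# Toy integer-linear encoding of one quantized neuron with assumed semantics
# y = clamp(floor((w.x + b) / 4), 0, 7) over integer inputs in [0, 15], plus a
# reachability query (maximise y). Not QVIP's encoding, just the flavour of it.
import pulp  # pip install pulp (ships with the CBC MILP solver)

w, b = [3, -2], 1              # integer weights and bias of the toy neuron
qmax_out, M = 7, 64            # 3-bit output range and a big-M constant

prob = pulp.LpProblem("quantized_neuron_reachability", pulp.LpMaximize)
x = [pulp.LpVariable(f"x{i}", lowBound=0, upBound=15, cat="Integer") for i in range(2)]
pre = pulp.lpSum(w[i] * x[i] for i in range(2)) + b      # integer pre-activation

# z = floor(pre / 4) is exact for integers: 4z <= pre <= 4z + 3.
z = pulp.LpVariable("z", lowBound=-64, upBound=64, cat="Integer")
prob += 4 * z <= pre
prob += pre <= 4 * z + 3

# y = clamp(z, 0, qmax_out) via a standard big-M encoding with two binaries.
y = pulp.LpVariable("y", lowBound=0, upBound=qmax_out, cat="Integer")
clamp_low = pulp.LpVariable("clamp_low", cat="Binary")    # selects the lower clamp (z < 0)
clamp_high = pulp.LpVariable("clamp_high", cat="Binary")  # selects the upper clamp (z > qmax_out)
prob += y <= z + M * clamp_low
prob += y >= z - M * clamp_high
prob += y <= qmax_out * (1 - clamp_low)
prob += y >= qmax_out * clamp_high

prob += 1 * y                  # objective: the largest reachable output value
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("max reachable output:", int(pulp.value(y)),
      "at x =", [int(pulp.value(v)) for v in x])
```

A real encoding repeats this for every neuron of every layer and adds the property under verification as further constraints.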
- Quantization-aware Interval Bound Propagation for Training Certifiably Robust Quantized Neural Networks [58.195261590442406]
We study the problem of training and certifying adversarially robust quantized neural networks (QNNs).
Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization.
We present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs.
arXiv Detail & Related papers (2022-11-29T13:32:38Z)
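The sketch below shows the basic interval-bound-propagation step that such a method builds on: pushing an input box through an affine layer, then through ReLU and a fake-quantization step, both of which are monotone and can therefore be applied to the interval endpoints. The scales and bit-widths are assumptions, and QA-IBP's actual training procedure on top of these bounds is not shown.

```python
# Minimal interval bound propagation through one quantized layer: affine map,
# ReLU, then fake quantization. ReLU and fake quantization are monotone, so
# applying them to the interval endpoints is sound. Scales and bit-widths are
# assumptions; QA-IBP's training loop on top of such bounds is not shown.
import numpy as np

def ibp_affine(W, b, lo, hi):
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def fake_quant(x, scale=0.05, qmin=0, qmax=255):
    """Quantize-then-dequantize, as used to simulate fixed-point inference."""
    return np.clip(np.round(x / scale), qmin, qmax) * scale

def ibp_quantized_layer(W, b, lo, hi):
    lo_a, hi_a = ibp_affine(W, b, lo, hi)
    return fake_quant(np.maximum(lo_a, 0)), fake_quant(np.maximum(hi_a, 0))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W, b = rng.normal(size=(3, 4)), rng.normal(size=3)
    x, eps = rng.uniform(size=4), 0.05            # input point and perturbation radius
    lo_out, hi_out = ibp_quantized_layer(W, b, x - eps, x + eps)
    print("certified output bounds per neuron:", list(zip(lo_out, hi_out)))
```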
- Post-training Quantization for Neural Networks with Provable Guarantees [9.58246628652846]
We modify a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism.
We prove that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights.
arXiv Detail & Related papers (2022-01-26T18:47:38Z)
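As a rough illustration of a greedy, data-driven path-following quantizer, the sketch below quantizes one neuron's weights sequentially while a running residual tracks the error accumulated on sample activations. The update rule, alphabet, and step size are assumptions made for illustration rather than GPFQ's exact procedure or the setting of its error analysis.

```python
# Sketch of a greedy, data-driven sequential weight quantizer in the spirit of
# GPFQ: quantize weights one at a time, keeping a running residual of the error
# accumulated on sample activations. The update rule, alphabet, and step size
# are illustrative assumptions, not the paper's exact algorithm or analysis.
import numpy as np

def round_to_grid(v, step=2.0 ** -3, limit=2.0):
    return float(np.clip(np.round(v / step) * step, -limit, limit))

def greedy_quantize_neuron(w, X, step=2.0 ** -3):
    """w: (d,) float weights of one neuron; X: (m, d) sample inputs to the layer."""
    q = np.zeros_like(w)
    u = np.zeros(X.shape[0])              # residual: X[:, :t] @ (w[:t] - q[:t])
    for t in range(w.size):
        x_t = X[:, t]
        target = u + w[t] * x_t           # error the t-th quantized weight should absorb
        best = x_t @ target / max(x_t @ x_t, 1e-12)   # unconstrained minimiser
        q[t] = round_to_grid(best, step)
        u = target - q[t] * x_t           # update the residual with the choice made
    return q

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=16)
    X = rng.normal(size=(64, 16))
    q = greedy_quantize_neuron(w, X)
    rel_err = np.linalg.norm(X @ (w - q)) / np.linalg.norm(X @ w)
    print("relative output error on the samples:", round(rel_err, 4))
```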
- Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss [61.26793005355441]
Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks.
DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons.
We experimentally validate our method on various benchmark datasets and network architectures.
arXiv Detail & Related papers (2021-09-05T15:15:07Z)
- Filter Pre-Pruning for Improved Fine-tuning of Quantized Deep Neural Networks [0.0]
We propose a new pruning method called Pruning for Quantization (PfQ) which removes the filters that disturb the fine-tuning of the DNN.
Experiments using well-known models and datasets confirmed that the proposed method achieves higher performance with a similar model size.
arXiv Detail & Related papers (2020-11-13T04:12:54Z)
- Toward Trainability of Quantum Neural Networks [87.04438831673063]
Quantum Neural Networks (QNNs) have been proposed as generalizations of classical neural networks to achieve the quantum speed-up.
Serious bottlenecks exist for training QNNs due to gradients vanishing at a rate exponential in the number of input qubits.
We propose QNNs with tree-tensor and step-controlled structures for binary classification. Simulations show faster convergence rates and better accuracy compared to QNNs with random structures.
arXiv Detail & Related papers (2020-11-12T08:32:04Z)
- On the learnability of quantum neural networks [132.1981461292324]
We consider the learnability of the quantum neural network (QNN) built on the variational hybrid quantum-classical scheme.
We show that if a concept can be efficiently learned by a QNN, then it can also be effectively learned by a QNN even in the presence of gate noise.
arXiv Detail & Related papers (2020-07-24T06:34:34Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
- AUSN: Approximately Uniform Quantization by Adaptively Superimposing Non-uniform Distribution for Deep Neural Networks [0.7378164273177589]
Existing uniform and non-uniform quantization methods exhibit an inherent conflict between the representing range and representing resolution.
We propose a novel quantization method to quantize the weight and activation.
The key idea is to Approximate the Uniform quantization by Adaptively Superposing multiple Non-uniform quantized values, namely AUSN.
arXiv Detail & Related papers (2020-07-08T05:10:53Z)
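As a toy illustration of the superposition idea in the entry above, the snippet below approximates a weight by a signed sum of a few power-of-two terms, so that values between the coarse non-uniform levels become representable. The greedy decomposition and the exponent range are assumptions; AUSN's actual bit allocation, rounding, and error-compensation scheme are not reproduced.

```python
# Toy illustration of approximating a value by superposing a few non-uniform
# (power-of-two) terms, so the reachable grid becomes nearly uniform. The
# greedy decomposition and exponent range are assumptions, not AUSN's scheme.
import numpy as np

def superpose_pow2(w, terms=2, min_exp=-6, max_exp=0):
    """Greedily approximate w by a signed sum of `terms` powers of two."""
    levels = 2.0 ** np.arange(min_exp, max_exp + 1)
    approx, residual = 0.0, float(w)
    for _ in range(terms):
        if residual == 0.0:
            break
        k = int(np.argmin(np.abs(levels - abs(residual))))
        term = np.sign(residual) * levels[k]
        approx += term
        residual -= term
    return approx

if __name__ == "__main__":
    for w in (0.3, -0.55, 0.71):
        print(f"{w:+.2f} -> {superpose_pow2(w):+.4f}")
```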
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)