Symmetry Regularization and Saturating Nonlinearity for Robust
Quantization
- URL: http://arxiv.org/abs/2208.00338v1
- Date: Sun, 31 Jul 2022 02:12:28 GMT
- Title: Symmetry Regularization and Saturating Nonlinearity for Robust
Quantization
- Authors: Sein Park, Yeongsang Jang and Eunhyeok Park
- Abstract summary: We present three insights to robustify a network against quantization.
We propose two novel methods called symmetry regularization (SymReg) and saturating nonlinearity (SatNL)
Applying the proposed methods during training can enhance the robustness of arbitrary neural networks against quantization.
- Score: 5.1779694507922835
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust quantization improves the tolerance of networks for various
implementations, allowing reliable output in different bit-widths or fragmented
low-precision arithmetic. In this work, we perform extensive analyses to
identify the sources of quantization error and present three insights to
robustify a network against quantization: reduction of error propagation, range
clamping for error minimization, and inherited robustness against quantization.
Based on these insights, we propose two novel methods called symmetry
regularization (SymReg) and saturating nonlinearity (SatNL). Applying the
proposed methods during training can enhance the robustness of arbitrary neural
networks against quantization on existing post-training quantization (PTQ) and
quantization-aware training (QAT) algorithms and enables us to obtain a single
weight flexible enough to maintain the output quality under various conditions.
We conduct extensive studies on CIFAR and ImageNet datasets and validate the
effectiveness of the proposed methods.
Related papers
- Pushing the Limits of Large Language Model Quantization via the Linearity Theorem [71.3332971315821]
We present a "line theoremarity" establishing a direct relationship between the layer-wise $ell$ reconstruction error and the model perplexity increase due to quantization.
This insight enables two novel applications: (1) a simple data-free LLM quantization method using Hadamard rotations and MSE-optimal grids, dubbed HIGGS, and (2) an optimal solution to the problem of finding non-uniform per-layer quantization levels.
arXiv Detail & Related papers (2024-11-26T15:35:44Z) - Quantization-aware Interval Bound Propagation for Training Certifiably
Robust Quantized Neural Networks [58.195261590442406]
We study the problem of training and certifying adversarially robust quantized neural networks (QNNs)
Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization.
We present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs.
arXiv Detail & Related papers (2022-11-29T13:32:38Z) - NIPQ: Noise proxy-based Integrated Pseudo-Quantization [9.207644534257543]
Straight-through estimator (STE) incurs unstable convergence during quantization-aware training (QAT)
We propose a novel noise proxy-based integrated pseudoquantization (NIPQ) that enables unified support of pseudoquantization for both activation and weight.
NIPQ outperforms existing quantization algorithms in various vision and language applications by a large margin.
arXiv Detail & Related papers (2022-06-02T01:17:40Z) - Post-training Quantization for Neural Networks with Provable Guarantees [9.58246628652846]
We modify a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism.
We prove that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights.
arXiv Detail & Related papers (2022-01-26T18:47:38Z) - Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via
Generalized Straight-Through Estimation [48.838691414561694]
Nonuniform-to-Uniform Quantization (N2UQ) is a method that can maintain the strong representation ability of nonuniform methods while being hardware-friendly and efficient.
N2UQ outperforms state-of-the-art nonuniform quantization methods by 0.71.8% on ImageNet.
arXiv Detail & Related papers (2021-11-29T18:59:55Z) - Cluster-Promoting Quantization with Bit-Drop for Minimizing Network
Quantization Loss [61.26793005355441]
Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks.
DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons.
We experimentally validate our method on various benchmark datasets and network architectures.
arXiv Detail & Related papers (2021-09-05T15:15:07Z) - DAQ: Distribution-Aware Quantization for Deep Image Super-Resolution
Networks [49.191062785007006]
Quantizing deep convolutional neural networks for image super-resolution substantially reduces their computational costs.
Existing works either suffer from a severe performance drop in ultra-low precision of 4 or lower bit-widths, or require a heavy fine-tuning process to recover the performance.
We propose a novel distribution-aware quantization scheme (DAQ) which facilitates accurate training-free quantization in ultra-low precision.
arXiv Detail & Related papers (2020-12-21T10:19:42Z) - Recurrence of Optimum for Training Weight and Activation Quantized
Networks [4.103701929881022]
Training deep learning models with low-precision weights and activations involves a demanding optimization task.
We show how to overcome the nature of network quantization.
We also show numerical evidence of the recurrence phenomenon of weight evolution in training quantized deep networks.
arXiv Detail & Related papers (2020-12-10T09:14:43Z) - AUSN: Approximately Uniform Quantization by Adaptively Superimposing
Non-uniform Distribution for Deep Neural Networks [0.7378164273177589]
Existing uniform and non-uniform quantization methods exhibit an inherent conflict between the representing range and representing resolution.
We propose a novel quantization method to quantize the weight and activation.
The key idea is to Approximate the Uniform quantization by Adaptively Superposing multiple Non-uniform quantized values, namely AUSN.
arXiv Detail & Related papers (2020-07-08T05:10:53Z) - Gradient $\ell_1$ Regularization for Quantization Robustness [70.39776106458858]
We derive a simple regularization scheme that improves robustness against post-training quantization.
By training quantization-ready networks, our approach enables storing a single set of weights that can be quantized on-demand to different bit-widths.
arXiv Detail & Related papers (2020-02-18T12:31:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.