Bayesian Bits: Unifying Quantization and Pruning
- URL: http://arxiv.org/abs/2005.07093v3
- Date: Tue, 27 Oct 2020 11:27:24 GMT
- Title: Bayesian Bits: Unifying Quantization and Pruning
- Authors: Mart van Baalen and Christos Louizos and Markus Nagel and Rana Ali Amjad and Ying Wang and Tijmen Blankevoort and Max Welling
- Abstract summary: We introduce Bayesian Bits, a practical method for joint mixed precision quantization and pruning through gradient based optimization.
We experimentally validate our proposed method on several benchmark datasets and show that we can learn pruned, mixed precision networks.
- Score: 73.27732135853243
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Bayesian Bits, a practical method for joint mixed precision
quantization and pruning through gradient based optimization. Bayesian Bits
employs a novel decomposition of the quantization operation, which sequentially
considers doubling the bit width. At each new bit width, the residual error
between the full precision value and the previously rounded value is quantized.
We then decide whether or not to add this quantized residual error for a higher
effective bit width and lower quantization noise. By starting with a
power-of-two bit width, this decomposition will always produce
hardware-friendly configurations, and through an additional 0-bit option,
serves as a unified view of pruning and quantization. Bayesian Bits then
introduces learnable stochastic gates, which collectively control the bit width
of the given tensor. As a result, we can obtain low bit solutions by performing
approximate inference over the gates, with prior distributions that encourage
most of them to be switched off. We experimentally validate our proposed method
on several benchmark datasets and show that we can learn pruned, mixed
precision networks that provide a better trade-off between accuracy and
efficiency than their static bit width equivalents.
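The decomposition can be made concrete with a short sketch. The NumPy snippet below is a minimal, illustrative rendering of the gated residual decomposition: a base 2-bit quantizer plus gated residual terms for 4, 8, 16, and 32 bits, with a closed base gate acting as the 0-bit (pruning) option. The hard 0/1 gates, the omission of range clipping, and the function and argument names are simplifications of this sketch; the learnable stochastic gates and the sparsity-inducing prior of the actual method are not modeled here.
```python
import numpy as np

def quantize(x, step):
    """Round x to the nearest point on a grid with spacing `step`
    (range clipping omitted for brevity)."""
    return step * np.round(x / step)

def bayesian_bits_forward(x, step2, gates):
    """Gated residual decomposition sketched from the abstract.

    `gates` maps {2, 4, 8, 16, 32} -> 0/1. gates[2] == 0 is the 0-bit
    option, i.e. the tensor is pruned; every further active gate adds
    the quantized residual error for a doubled effective bit width.
    The step-size recursion s_{2b} = s_b / (2**b + 1) keeps the finer
    grid aligned with the coarser one, so summing residuals reproduces
    a higher-precision quantizer.
    """
    if not gates[2]:
        return np.zeros_like(x)              # pruned: 0 effective bits
    step = step2
    x_q = quantize(x, step)                  # base 2-bit quantization
    b = 2
    while 2 * b <= 32 and gates[2 * b]:
        step = step / (2 ** b + 1)           # finer grid for bit width 2*b
        x_q = x_q + quantize(x - x_q, step)  # add quantized residual error
        b *= 2
    return x_q

# Example: an effective 4-bit tensor (gates for 8/16/32 bits switched off).
w = np.array([0.07, -0.41, 0.93, -1.30])
print(bayesian_bits_forward(w, step2=0.5, gates={2: 1, 4: 1, 8: 0, 16: 0, 32: 0}))
```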
Related papers
- Verification of Geometric Robustness of Neural Networks via Piecewise Linear Approximation and Lipschitz Optimisation [57.10353686244835]
We address the problem of verifying neural networks against geometric transformations of the input image, including rotation, scaling, shearing, and translation.
The proposed method computes provably sound piecewise linear constraints for the pixel values by using sampling and linear approximations in combination with branch-and-bound Lipschitz optimisation.
We show that our proposed implementation resolves up to 32% more verification cases than present approaches.
arXiv Detail & Related papers (2024-08-23T15:02:09Z) - DB-LLM: Accurate Dual-Binarization for Efficient LLMs [83.70686728471547]
Large language models (LLMs) have significantly advanced the field of natural language processing.
Existing ultra-low-bit quantization methods, however, cause severe accuracy drops.
We propose a novel Dual-Binarization method for LLMs, namely DB-LLM.
arXiv Detail & Related papers (2024-02-19T09:04:30Z) - MixQuant: Mixed Precision Quantization with a Bit-width Optimization Search [7.564770908909927]
Quantization is a technique for creating efficient Deep Neural Networks (DNNs).
We propose MixQuant, a search algorithm that finds the optimal custom quantization bit-width for each layer weight based on roundoff error.
We show that combining MixQuant with BRECQ, a state-of-the-art quantization method, yields better quantized model accuracy than BRECQ alone.
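The summary above does not describe MixQuant's actual search procedure; the sketch below only illustrates the general idea of scoring candidate bit widths by per-layer roundoff error. The uniform quantizer, the error budget, and the greedy selection rule are assumptions for illustration, not the paper's algorithm.
```python
import numpy as np

def roundoff_error(w, bits):
    """Mean squared roundoff error of symmetric uniform quantization
    of `w` to the given bit width (illustrative quantizer)."""
    step = 2 * np.abs(w).max() / (2 ** bits - 1)
    w_q = step * np.round(w / step)
    return float(np.mean((w - w_q) ** 2))

def pick_bitwidths(layers, candidates=(2, 4, 8), budget=1e-4):
    """Assign each layer the smallest candidate bit width whose
    roundoff error stays under `budget` (hypothetical criterion)."""
    assignment = {}
    for name, w in layers.items():
        errors = {b: roundoff_error(w, b) for b in candidates}
        feasible = [b for b in candidates if errors[b] <= budget]
        assignment[name] = min(feasible) if feasible else max(candidates)
    return assignment

layers = {"conv1": np.random.randn(64, 3, 3, 3), "fc": np.random.randn(10, 512)}
print(pick_bitwidths(layers))
```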
arXiv Detail & Related papers (2023-09-29T15:49:54Z) - MBQuant: A Novel Multi-Branch Topology Method for Arbitrary Bit-width Network Quantization [51.85834744835766]
We propose MBQuant, a novel method for arbitrary bit-width quantization.
We show that MBQuant achieves significant performance gains compared to existing arbitrary bit-width quantization methods.
arXiv Detail & Related papers (2023-05-14T10:17:09Z) - Quantized Neural Networks for Low-Precision Accumulation with Guaranteed Overflow Avoidance [68.8204255655161]
We introduce a quantization-aware training algorithm that guarantees avoiding numerical overflow when reducing the precision of accumulators during inference.
We evaluate our algorithm across multiple quantized models that we train for different tasks, showing that our approach can reduce the precision of accumulators while maintaining model accuracy with respect to a floating-point baseline.
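The summary states the guarantee but not the condition behind it. The helper below computes a generic worst-case bound on the accumulator width needed for a signed integer dot product; it is offered only as an illustration of what "guaranteed overflow avoidance" means, not as the constraint used in the paper.
```python
def min_accumulator_bits(weight_bits, act_bits, dot_length):
    """Smallest signed accumulator width that provably cannot overflow
    when summing `dot_length` products of signed `weight_bits`-bit and
    `act_bits`-bit integers (generic worst-case bound, see lead-in)."""
    worst_case_sum = dot_length * (2 ** (weight_bits - 1)) * (2 ** (act_bits - 1))
    bits = 2
    while 2 ** (bits - 1) - 1 < worst_case_sum:
        bits += 1
    return bits

# Example: 4-bit weights x 8-bit activations, dot products of length 512.
print(min_accumulator_bits(4, 8, 512))   # -> 21 with this bound
```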
arXiv Detail & Related papers (2023-01-31T02:46:57Z) - Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss [61.26793005355441]
Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks.
DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons.
We experimentally validate our method on various benchmark datasets and network architectures.
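The summary names the bit-drop idea but not its formulation. The snippet below is one plausible reading, assumed rather than taken from the paper: each forward pass randomly truncates the least-significant bit of an integer code with some probability, the bit-level analogue of dropping a neuron.
```python
import numpy as np

def drop_bits(codes, p=0.5, rng=None):
    """Randomly zero the least-significant bit of integer codes with
    probability p per element (an illustrative reading of 'drop bits
    instead of neurons'; not necessarily the paper's DropBits)."""
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(codes.shape) < p
    truncated = (codes >> 1) << 1        # clear the LSB
    return np.where(mask, truncated, codes)

codes = np.arange(8, dtype=np.int32)     # 3-bit integer codes 0..7
print(drop_bits(codes, p=0.5, rng=np.random.default_rng(0)))
```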
arXiv Detail & Related papers (2021-09-05T15:15:07Z) - FracBits: Mixed Precision Quantization via Fractional Bit-Widths [29.72454879490227]
Mixed precision quantization is favorable with customized hardware supporting arithmetic operations at multiple bit-widths.
We propose a novel learning-based algorithm to derive mixed precision models end-to-end under target computation constraints.
arXiv Detail & Related papers (2020-07-04T06:09:09Z) - Post-Training Piecewise Linear Quantization for Deep Neural Networks [13.717228230596167]
Quantization plays an important role in the energy-efficient deployment of deep neural networks on resource-limited devices.
We propose a piecewise linear quantization scheme to enable accurate approximation for tensor values that have bell-shaped distributions with long tails.
Compared to state-of-the-art post-training quantization methods, our proposed method achieves superior performance on image classification, semantic segmentation, and object detection with minor overhead.
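The summary does not give the breakpoint selection or region layout; the sketch below only illustrates the general two-region idea for bell-shaped, long-tailed values: a dense uniform grid on a center interval and a coarser one on the tails, with the breakpoint and bit allocation fixed by hand rather than optimized.
```python
import numpy as np

def piecewise_linear_quantize(x, breakpoint, bits=4):
    """Quantize with two uniform regions: a dense grid on [-p, p] for the
    bell-shaped center and a coarser grid on the tails (illustrative
    split; the breakpoint is fixed here, not optimized)."""
    levels = 2 ** (bits - 1)                      # grid steps per region
    xmax = np.abs(x).max()
    center = np.clip(x, -breakpoint, breakpoint)
    step_c = breakpoint / levels
    q_center = step_c * np.round(center / step_c)
    tail = np.clip(np.abs(x), breakpoint, xmax) - breakpoint
    step_t = max(xmax - breakpoint, 1e-12) / levels
    q_tail = np.sign(x) * step_t * np.round(tail / step_t)
    return q_center + q_tail

x = np.random.standard_t(df=3, size=10_000)       # heavy-tailed values
x_q = piecewise_linear_quantize(x, breakpoint=2.0)
print(float(np.mean((x - x_q) ** 2)))
```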
arXiv Detail & Related papers (2020-01-31T23:47:00Z) - Least squares binary quantization of neural networks [19.818087225770967]
We focus on binary quantization, in which values are mapped to -1 and 1.
Inspired by the Pareto-optimality of 2-bit versus 1-bit quantization, we introduce a novel 2-bit quantization with provably least squares error.
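For the 1-bit case the least-squares solution has a standard closed form: minimizing ||w - a*b||^2 with b = sign(w) gives scale a = mean(|w|). The snippet below shows only that 1-bit case; the paper's 2-bit construction is not reproduced here.
```python
import numpy as np

def binary_quantize_least_squares(w):
    """1-bit least-squares quantization: w ~ a * b with b in {-1, +1}.
    For b = sign(w) the optimal scale is a = mean(|w|)."""
    b = np.where(w >= 0, 1.0, -1.0)
    a = np.abs(w).mean()
    return a * b

w = np.array([0.3, -1.2, 0.7, -0.1])
w_b = binary_quantize_least_squares(w)
print(w_b, float(np.sum((w - w_b) ** 2)))
```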
arXiv Detail & Related papers (2020-01-09T00:01:14Z)