Least squares binary quantization of neural networks
- URL: http://arxiv.org/abs/2001.02786v3
- Date: Sat, 13 Jun 2020 07:23:03 GMT
- Title: Least squares binary quantization of neural networks
- Authors: Hadi Pouransari, Zhucheng Tu, Oncel Tuzel
- Abstract summary: We focus on binary quantization, in which values are mapped to -1 and 1.
Inspired by the Pareto-optimality of 2-bit versus 1-bit quantization, we introduce a novel 2-bit quantization with provably least squares error.
- Score: 19.818087225770967
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract:   Quantizing weights and activations of deep neural networks results in
significant improvement in inference efficiency at the cost of lower accuracy.
A source of the accuracy gap between full precision and quantized models is the
quantization error. In this work, we focus on binary quantization, in which
values are mapped to -1 and 1. We provide a unified framework to analyze
different scaling strategies. Inspired by the Pareto-optimality of 2-bit
versus 1-bit quantization, we introduce a novel 2-bit quantization with
provably least squares error. Our quantization algorithms can be implemented
efficiently in hardware using bitwise operations. We present proofs to show
that our proposed methods are optimal, and also provide empirical error
analysis. We conduct experiments on the ImageNet dataset and show a reduced
accuracy gap when using the proposed least squares quantization algorithms.
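
As a rough illustration of the scaled binary quantization described in the abstract, the sketch below approximates a vector x by v * s with s in {-1, +1}^n. For that form the least squares choice is the standard one: s = sign(x) and v = mean(|x|). The function names are hypothetical, and the 2-bit part is only a greedy residual baseline assumed here for comparison; it is not the provably optimal 2-bit algorithm from the paper.

```python
import random


def sign(value):
    # Map a real number to {-1, +1}; zero is sent to +1 by convention here.
    return 1.0 if value >= 0 else -1.0


def binarize_least_squares(x):
    """1-bit quantization x ~ v * s, with s in {-1, +1}^n and scalar v >= 0.

    For s = sign(x), the scalar minimizing ||x - v * s||^2 is mean(|x|).
    """
    s = [sign(value) for value in x]
    v = sum(abs(value) for value in x) / len(x)
    return v, s


def squared_error(x, x_hat):
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def greedy_two_bit(x):
    """Greedy 2-bit baseline (an assumption, not the paper's optimal method):
    binarize x, then binarize the residual that is left over."""
    v1, s1 = binarize_least_squares(x)
    residual = [a - v1 * b for a, b in zip(x, s1)]
    v2, s2 = binarize_least_squares(residual)
    return [v1 * b1 + v2 * b2 for b1, b2 in zip(s1, s2)]


random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(1000)]

v, s = binarize_least_squares(x)
err_unscaled = squared_error(x, s)                  # plain sign(x), no scaling
err_1bit = squared_error(x, [v * b for b in s])     # least squares scaling
err_2bit = squared_error(x, greedy_two_bit(x))      # greedy residual, 2 bits

print("unscaled 1-bit error:", err_unscaled)
print("scaled 1-bit error:  ", err_1bit)
print("greedy 2-bit error:  ", err_2bit)
```

The scaled 1-bit error can never exceed the unscaled one, because v = 1 is one of the candidates the least squares scalar is chosen over, and the greedy second bit can only shrink the remaining residual.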
 
      
Related papers
- ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization [58.84018707089315]
 We present a unified framework for rigorous comparisons across 1-bit, 1.58-bit, 2-bit, 3-bit, and 4-bit quantization settings.
We show that ternary, 2-bit, and 3-bit quantization maintains comparable performance in the size-accuracy trade-off.
Considering hardware constraints, 2-bit quantization offers promising potential for memory reduction and speedup.
 arXiv (2025-02-04T18:59:26Z)
- MixQuant: Mixed Precision Quantization with a Bit-width Optimization Search [7.564770908909927]
 Quantization is a technique for creating efficient Deep Neural Networks (DNNs).
We propose MixQuant, a search algorithm that finds the optimal custom quantization bit-width for each layer weight based on roundoff error.
We show that combining MixQuant with BRECQ, a state-of-the-art quantization method, yields better quantized model accuracy than BRECQ alone.
 arXiv (2023-09-29T15:49:54Z)
- Quantum Sparse Coding [5.130440339897477]
 We develop a quantum-inspired algorithm for sparse coding.
The emergence of quantum computers and Ising machines can potentially lead to more accurate estimations.
We conduct numerical experiments with simulated data on LightSolver's quantum-inspired digital platform.
 arXiv (2022-09-08T13:00:30Z)
- Post-training Quantization for Neural Networks with Provable Guarantees [9.58246628652846]
 We modify a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism.
We prove that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights.
 arXiv (2022-01-26T18:47:38Z)
- OMPQ: Orthogonal Mixed Precision Quantization [64.59700856607017]
 Mixed precision quantization takes advantage of hardware's multiple bit-width arithmetic operations to unleash the full potential of network quantization.
We propose to optimize a proxy metric, the concept of network orthogonality, which is highly correlated with the loss of the integer programming.
This approach reduces the search time and required data amount by orders of magnitude, with little compromise on quantization accuracy.
 arXiv (2021-09-16T10:59:33Z)
- Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss [61.26793005355441]
 Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks.
DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons.
We experimentally validate our method on various benchmark datasets and network architectures.
 arXiv (2021-09-05T15:15:07Z)
- n-hot: Efficient bit-level sparsity for powers-of-two neural network quantization [0.0]
 Powers-of-two (PoT) quantization reduces the number of bit operations of deep neural networks on resource-constrained hardware.
PoT quantization triggers a severe accuracy drop because of its limited representation ability.
We propose an efficient PoT quantization scheme that balances accuracy and costs in a memory-efficient way.
 arXiv (2021-03-22T10:13:12Z)
- DAQ: Distribution-Aware Quantization for Deep Image Super-Resolution Networks [49.191062785007006]
 Quantizing deep convolutional neural networks for image super-resolution substantially reduces their computational costs.
Existing works either suffer from a severe performance drop in ultra-low precision of 4 or lower bit-widths, or require a heavy fine-tuning process to recover the performance.
We propose a novel distribution-aware quantization scheme (DAQ) which facilitates accurate training-free quantization in ultra-low precision.
 arXiv (2020-12-21T10:19:42Z)
- Searching for Low-Bit Weights in Quantized Neural Networks [129.8319019563356]
 Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators.
We propose regarding the discrete weights in an arbitrary quantized neural network as searchable variables, and use a differential method to search them accurately.
 arXiv (2020-09-18T09:13:26Z)
- Optimal Quantization for Batch Normalization in Neural Network Deployments and Beyond [18.14282813812512]
 Batch Normalization (BN) poses a challenge for Quantized Neural Networks (QNNs).
We propose a novel method to quantize BN by converting an affine transformation of two floating points to a fixed-point operation with shared quantized scale.
Our method is verified by experiments at layer level on CIFAR and ImageNet datasets.
 arXiv (2020-08-30T09:33:29Z)
- Bayesian Bits: Unifying Quantization and Pruning [73.27732135853243]
 We introduce Bayesian Bits, a practical method for joint mixed precision quantization and pruning through gradient based optimization.
We experimentally validate our proposed method on several benchmark datasets and show that we can learn pruned, mixed precision networks.
 arXiv (2020-05-14T16:00:34Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
 Quantization neural networks (QNNs) are very attractive to industry because of their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
 arXiv (2020-02-03T04:11:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.