Neural Network Activation Quantization with Bitwise Information Bottlenecks
- URL: http://arxiv.org/abs/2006.05210v1
- Date: Tue, 9 Jun 2020 12:10:04 GMT
- Title: Neural Network Activation Quantization with Bitwise Information Bottlenecks
- Authors: Xichuan Zhou, Kui Liu, Cong Shi, Haijun Liu, Ji Liu
- Abstract summary: This paper presents a Bitwise Information Bottleneck approach for quantizing and encoding neural network activations.
By minimizing the quantization rate-distortion of each layer, the neural network with information bottlenecks achieves state-of-the-art accuracy with low-precision activations.
- Score: 25.319181120172562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent research on the information bottleneck sheds new light on ongoing
attempts to open the black box of neural signal encoding. Inspired by the
problem of lossy signal compression for wireless communication, this paper
presents a Bitwise Information Bottleneck approach for quantizing and encoding
neural network activations. Based on the rate-distortion theory, the Bitwise
Information Bottleneck attempts to determine the most significant bits in
activation representation by assigning and approximating the sparse coefficient
associated with each bit. Given the constraint of a limited average code rate,
the information bottleneck minimizes the rate-distortion for optimal activation
quantization in a flexible layer-by-layer manner. Experiments over ImageNet and
other datasets show that, by minimizing the quantization rate-distortion of
each layer, the neural network with information bottlenecks achieves
state-of-the-art accuracy with low-precision activations. Meanwhile, by reducing
the code rate, the proposed method can improve the memory and computational
efficiency more than sixfold compared with a deep neural network using the
standard single-precision representation. Code will be available on GitHub
when the paper is accepted: \url{https://github.com/BitBottleneck/PublicCode}.
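The abstract sketches the core idea: each bit of a quantized activation is assigned its own coefficient, and under a limited average code rate only the most significant bits are kept so that the reconstruction distortion is minimized. The snippet below is a rough, self-contained NumPy illustration of that bitwise rate-distortion trade-off, not the authors' released code; the function names (bit_planes, fit_coefficients, select_bits), the uniform 8-bit quantizer, the least-squares fit of the per-bit coefficients, and the greedy bit selection are all assumptions made for the sake of the example.

```python
# Minimal NumPy sketch of bitwise rate-distortion activation quantization,
# loosely following the abstract's description. All names and the greedy
# selection strategy are illustrative assumptions, not the paper's method.
import numpy as np

def bit_planes(x, n_bits=8):
    """Uniformly quantize non-negative activations to n_bits and return
    the binary bit planes with shape (n_bits, N), MSB first."""
    x = np.asarray(x, dtype=np.float64).ravel()
    scale = (x.max() + 1e-12) / (2 ** n_bits - 1)
    q = np.round(x / scale).astype(np.int64)
    planes = np.stack([(q >> b) & 1 for b in range(n_bits)][::-1])
    return planes.astype(np.float64), scale

def fit_coefficients(x, planes):
    """Least-squares coefficients alpha such that alpha @ planes ~ x."""
    alpha, *_ = np.linalg.lstsq(planes.T, np.asarray(x, float).ravel(), rcond=None)
    return alpha

def select_bits(x, planes, rate_budget):
    """Greedy rate-constrained selection: keep `rate_budget` bit planes,
    at each step adding the plane that most reduces the MSE distortion."""
    x = np.asarray(x, float).ravel()
    kept = []
    for _ in range(rate_budget):
        best, best_err = None, np.inf
        for b in range(planes.shape[0]):
            if b in kept:
                continue
            idx = kept + [b]
            alpha = fit_coefficients(x, planes[idx])
            err = np.mean((alpha @ planes[idx] - x) ** 2)
            if err < best_err:
                best, best_err = b, err
        kept.append(best)
    alpha = fit_coefficients(x, planes[kept])
    distortion = np.mean((alpha @ planes[kept] - x) ** 2)
    return kept, alpha, distortion

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    act = np.maximum(rng.normal(size=4096), 0.0)  # ReLU-like activations
    planes, _ = bit_planes(act, n_bits=8)
    for budget in (2, 4, 6):  # average code rate in bits per activation
        kept, alpha, mse = select_bits(act, planes, budget)
        print(f"rate={budget} bits, kept planes={sorted(kept)}, distortion(MSE)={mse:.5f}")
```

A typical run prints the distortion reached at average rates of 2, 4, and 6 bits, showing how tightening the code-rate constraint pushes the selection toward the most significant bit planes.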
Related papers
- Improving Generalization of Deep Neural Networks by Optimum Shifting [33.092571599896814]
We propose a novel method called optimum shifting, which changes the parameters of a neural network from a sharp minimum to a flatter one.
Our method is based on the observation that when the input and output of a neural network are fixed, the matrix multiplications within the network can be treated as systems of under-determined linear equations.
arXiv Detail & Related papers (2024-05-23T02:31:55Z)
- Compression with Bayesian Implicit Neural Representations [16.593537431810237]
We propose overfitting variational neural networks to the data and compressing an approximate posterior weight sample using relative entropy coding instead of quantizing and entropy coding it.
Experiments show that our method achieves strong performance on image and audio compression while retaining simplicity.
arXiv Detail & Related papers (2023-05-30T16:29:52Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder based on hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
- ZippyPoint: Fast Interest Point Detection, Description, and Matching through Mixed Precision Discretization [71.91942002659795]
We investigate and adapt network quantization techniques to accelerate inference and enable its use on compute-limited platforms.
ZippyPoint, our efficient quantized network with binary descriptors, improves the network runtime speed, the descriptor matching speed, and the 3D model size.
These improvements come at a minor performance degradation as evaluated on the tasks of homography estimation, visual localization, and map-free visual relocalization.
arXiv Detail & Related papers (2022-03-07T18:59:03Z)
- SignalNet: A Low Resolution Sinusoid Decomposition and Estimation Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing the results of our network relative to the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
arXiv Detail & Related papers (2021-06-10T04:21:20Z)
- Data-free mixed-precision quantization using novel sensitivity metric [6.031526641614695]
We propose a novel sensitivity metric that considers the effect of quantization error on task loss and interaction with other layers.
Our experiments show that the proposed metric better represents quantization sensitivity, and generated data are more feasible to be applied to mixed-precision quantization.
arXiv Detail & Related papers (2021-03-18T07:23:21Z)
- Efficient bit encoding of neural networks for Fock states [77.34726150561087]
The complexity of the neural network scales only with the number of bit-encoded neurons rather than the maximum boson number.
In the high occupation regime its information compression efficiency is shown to surpass even maximally optimized density matrix implementations.
arXiv Detail & Related papers (2021-03-15T11:24:40Z)
- Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks [73.29587731448345]
This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations.
First, to obtain low bit-width weights, most existing methods obtain the quantized weights by performing quantization on the full-precision network weights.
Second, to obtain low bit-width activations, existing works consider all channels equally.
arXiv Detail & Related papers (2020-12-26T15:21:18Z)
- Mixed-Precision Quantized Neural Network with Progressively Decreasing Bitwidth For Image Classification and Object Detection [21.48875255723581]
A mixed-precision quantized neural network with progressively decreasing bitwidth is proposed to improve the trade-off between accuracy and compression.
Experiments on typical network architectures and benchmark datasets demonstrate that the proposed method could achieve better or comparable results.
arXiv Detail & Related papers (2019-12-29T14:11:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.