MWQ: Multiscale Wavelet Quantized Neural Networks
- URL: http://arxiv.org/abs/2103.05363v1
- Date: Tue, 9 Mar 2021 11:21:59 GMT
- Title: MWQ: Multiscale Wavelet Quantized Neural Networks
- Authors: Qigong Sun, Yan Ren, Licheng Jiao, Xiufang Li, Fanhua Shang, Fang Liu
- Abstract summary: We propose a novel multiscale wavelet quantization (MWQ) method inspired by the characteristics of images in the frequency domain.
It exploits the multiscale frequency and spatial information to alleviate the information loss caused by quantization in the spatial domain.
Because of the flexibility of MWQ, we demonstrate three applications on the ImageNet and COCO datasets.
- Score: 45.22093693422084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model quantization can reduce the model size and computational latency, and it
has therefore become an essential technique for deploying deep neural networks on
resource-constrained hardware (e.g., mobile phones and embedded devices). Existing
quantization methods mainly consider the numerical values of the weights and
activations, ignoring the relationships between elements. The resulting decline in
representation ability and loss of information usually lead to performance
degradation. Inspired by the characteristics of images in the
frequency domain, we propose a novel multiscale wavelet quantization (MWQ)
method. This method decomposes original data into multiscale frequency
components by wavelet transform, and then quantizes the components of different
scales, respectively. It exploits the multiscale frequency and spatial
information to alleviate the information loss caused by quantization in the
spatial domain. Because of the flexibility of MWQ, we demonstrate three
applications (i.e., model compression, quantized network optimization, and
information enhancement) on the ImageNet and COCO datasets. Experimental
results show that our method has stronger representation ability and can play
an effective role in quantized neural networks.
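As a rough illustration of the decompose-then-quantize idea described above (a minimal sketch, not the authors' implementation), the snippet below applies a single-level 2D Haar transform to a feature map, quantizes each subband with its own uniform quantizer, and reconstructs the tensor. The bit-widths and the per-subband max-abs scaling are arbitrary choices made for the example.
```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar transform of an (H, W) array with even H and W.
    Returns the low-low band and the three high-frequency bands."""
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # vertical averages
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # vertical differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Exact inverse of haar_dwt2."""
    a = np.zeros((ll.shape[0], ll.shape[1] * 2))
    d = np.zeros_like(a)
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    x = np.zeros((a.shape[0] * 2, a.shape[1]))
    x[0::2, :], x[1::2, :] = a + d, a - d
    return x

def quantize_uniform(x, bits):
    """Symmetric uniform quantization with a per-tensor max-abs scale."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax + 1e-12
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def wavelet_quantize(x, bits_low=8, bits_high=4):
    """Quantize each wavelet subband separately, then reconstruct."""
    ll, lh, hl, hh = haar_dwt2(x)
    ll = quantize_uniform(ll, bits_low)    # keep more bits for low frequencies
    lh, hl, hh = (quantize_uniform(b, bits_high) for b in (lh, hl, hh))
    return haar_idwt2(ll, lh, hl, hh)

feat = np.random.randn(32, 32).astype(np.float32)
print(np.abs(feat - wavelet_quantize(feat)).mean())   # mean reconstruction error
```
In the actual MWQ method, where the transform is applied inside the network and how bits are allocated across scales are design choices described in the paper; the snippet only shows the generic decompose-quantize-reconstruct pattern.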
Related papers
- Frequency Disentangled Features in Neural Image Compression [13.016298207860974]
A neural image compression network is governed by how well the entropy model matches the true distribution of the latent code.
In this paper, we propose a feature-level frequency disentanglement to help the relaxed scalar quantization achieve lower bit rates.
The proposed network not only outperforms hand-engineered codecs, but also neural network-based codecs built on heavy spatially autoregressive entropy models.
arXiv Detail & Related papers (2023-08-04T14:55:44Z) - Towards Neural Variational Monte Carlo That Scales Linearly with System
Size [67.09349921751341]
Quantum many-body problems are central to demystifying some exotic quantum phenomena, e.g., high-temperature superconductors.
The combination of neural networks (NN) for representing quantum states, and the Variational Monte Carlo (VMC) algorithm, has been shown to be a promising method for solving such problems.
We propose a NN architecture called Vector-Quantized Neural Quantum States (VQ-NQS) that utilizes vector-quantization techniques to leverage redundancies in the local-energy calculations of the VMC algorithm.
arXiv Detail & Related papers (2022-12-21T19:00:04Z) - On Quantizing Implicit Neural Representations [30.257625048084968]
We show that a non-uniform quantization of neural weights can lead to significant improvements.
We demonstrate that it is possible (while memory inefficient) to reconstruct signals using binary neural networks.
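Non-uniform quantization in general can be illustrated with a small codebook fitted to the weight distribution. The sketch below (a generic illustration using 1D k-means, not the scheme evaluated in this paper) replaces each weight with its nearest codebook entry.
```python
import numpy as np

def kmeans_1d(values, k, iters=20, seed=0):
    """Simple 1D k-means: returns a codebook of k centers and per-value indices."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(values, size=k, replace=False)
    for _ in range(iters):
        idx = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(idx == j):
                centers[j] = values[idx == j].mean()
    return centers, idx

def nonuniform_quantize(weights, bits=3):
    """Replace each weight with its nearest codebook entry (2**bits entries)."""
    flat = weights.ravel()
    centers, idx = kmeans_1d(flat, 2 ** bits)
    return centers[idx].reshape(weights.shape)

w = np.random.randn(64, 64)
wq = nonuniform_quantize(w, bits=3)
print(np.square(w - wq).mean())   # quantization MSE
```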
arXiv Detail & Related papers (2022-09-01T05:48:37Z) - BiTAT: Neural Network Binarization with Task-dependent Aggregated
Transformation [116.26521375592759]
Quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation.
Extreme quantization (1-bit weight/1-bit activations) of compactly-designed backbone architectures results in severe performance degeneration.
This paper proposes a novel Quantization-Aware Training (QAT) method that can effectively alleviate performance degeneration.
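As background, most QAT methods rely on the same basic building block: fake quantization in the forward pass with a straight-through estimator in the backward pass. The PyTorch sketch below shows only that generic mechanism, not BiTAT itself; the layer type and bit-width are arbitrary choices for illustration.
```python
import torch

class FakeQuant(torch.autograd.Function):
    """Uniform fake quantization with a straight-through gradient estimator."""
    @staticmethod
    def forward(ctx, x, bits):
        qmax = 2 ** (bits - 1) - 1
        scale = x.detach().abs().max() / qmax + 1e-12
        return torch.clamp(torch.round(x / scale), -qmax, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None   # pass gradients straight through; no grad for bits

class QuantLinear(torch.nn.Linear):
    """Linear layer that fake-quantizes its weights and inputs during training."""
    def __init__(self, in_f, out_f, bits=4):
        super().__init__(in_f, out_f)
        self.bits = bits

    def forward(self, x):
        w_q = FakeQuant.apply(self.weight, self.bits)
        x_q = FakeQuant.apply(x, self.bits)
        return torch.nn.functional.linear(x_q, w_q, self.bias)

layer = QuantLinear(16, 8, bits=4)
out = layer(torch.randn(2, 16))
out.sum().backward()   # gradients flow to the full-precision weights via the STE
```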
arXiv Detail & Related papers (2022-07-04T13:25:49Z) - Haar Wavelet Feature Compression for Quantized Graph Convolutional
Networks [7.734726150561088]
Graph Convolutional Networks (GCNs) are widely used in a variety of applications, and can be seen as an unstructured version of standard Convolutional Neural Networks (CNNs).
As in CNNs, the computational cost of GCNs for large input graphs can be high and inhibit the use of these networks, especially in environments with low computational resources.
We propose to utilize Haar wavelet compression and light quantization to reduce the computations and the bandwidth involved with the network.
arXiv Detail & Related papers (2021-10-10T15:25:37Z) - Variational learning of quantum ground states on spiking neuromorphic
hardware [0.0]
High-dimensional sampling spaces and transient autocorrelations confront neural networks with a challenging computational bottleneck.
Compared to conventional neural networks, physical-model devices offer a fast, efficient and inherently parallel substrate.
We demonstrate the ability of a neuromorphic chip to represent the ground states of quantum spin models by variational energy minimization.
arXiv Detail & Related papers (2021-09-30T14:39:45Z) - Post-Training Quantization for Vision Transformer [85.57953732941101]
We present an effective post-training quantization algorithm for reducing the memory storage and computational costs of vision transformers.
We can obtain an 81.29% top-1 accuracy using DeiT-B model on ImageNet dataset with about 8-bit quantization.
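In its simplest generic form (not the specific algorithm of this paper), post-training quantization calibrates a scale and zero point from a few batches of activations collected from the trained model and then rounds to 8-bit integers, as in the sketch below; the tensor shapes and random calibration data are placeholders.
```python
import numpy as np

def calibrate(batches, bits=8):
    """Asymmetric min/max calibration: returns (scale, zero_point) for activations."""
    lo = min(float(b.min()) for b in batches)
    hi = max(float(b.max()) for b in batches)
    scale = (hi - lo) / (2 ** bits - 1) + 1e-12
    zero_point = int(round(-lo / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, bits=8):
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 2 ** bits - 1).astype(np.uint8)

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

# Placeholder calibration set; in practice these come from forward passes of the model.
calib = [np.random.randn(128, 768).astype(np.float32) for _ in range(4)]
scale, zp = calibrate(calib)
a = calib[0]
err = np.abs(a - dequantize(quantize(a, scale, zp), scale, zp)).max()
print(err)   # bounded by roughly scale / 2
```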
arXiv Detail & Related papers (2021-06-27T06:27:22Z) - Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network for dividing the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency part is processed with expensive operations, while the lower-frequency part is assigned cheap operations to relieve the computational burden.
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
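A generic way to realize such a frequency split (not the paper's dynamic network itself) is to mask DCT coefficients, as sketched below; the patch size and the number of retained low-frequency coefficients are arbitrary, and SciPy's dctn/idctn are used for the transform.
```python
import numpy as np
from scipy.fft import dctn, idctn

def split_by_frequency(x, keep=8):
    """Split a patch into a low-frequency part (top-left keep x keep DCT
    coefficients) and a high-frequency remainder, both in the pixel domain."""
    coeffs = dctn(x, norm="ortho")
    low_mask = np.zeros_like(coeffs)
    low_mask[:keep, :keep] = 1.0
    low = idctn(coeffs * low_mask, norm="ortho")
    high = idctn(coeffs * (1.0 - low_mask), norm="ortho")
    return low, high

patch = np.random.rand(32, 32)
low, high = split_by_frequency(patch, keep=8)
# A cheap branch could process `low` while an expensive branch handles `high`.
print(np.abs(patch - (low + high)).max())   # the two parts sum back to the input
```
Because the DCT is linear, the two parts sum exactly back to the input, so each part can be routed to operations of a different cost.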
arXiv Detail & Related papers (2021-03-15T12:54:26Z) - Direct Quantization for Training Highly Accurate Low Bit-width Deep
Neural Networks [73.29587731448345]
This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations.
First, to obtain low bit-width weights, most existing methods obtain the quantized weights by quantizing the full-precision network weights.
Second, to obtain low bit-width activations, existing works consider all channels equally.
arXiv Detail & Related papers (2020-12-26T15:21:18Z) - A Greedy Algorithm for Quantizing Neural Networks [4.683806391173103]
We propose a new computationally efficient method for quantizing the weights of pre-trained neural networks.
Our method deterministically quantizes layers in an iterative fashion with no complicated re-training required.
arXiv Detail & Related papers (2020-10-29T22:53:10Z)