Zero-shot Adversarial Quantization
- URL: http://arxiv.org/abs/2103.15263v2
- Date: Tue, 30 Mar 2021 14:17:16 GMT
- Title: Zero-shot Adversarial Quantization
- Authors: Yuang Liu, Wei Zhang, Jun Wang
- Abstract summary: We propose a zero-shot adversarial quantization (ZAQ) framework, facilitating effective discrepancy estimation and knowledge transfer.
This is achieved by a novel two-level discrepancy modeling scheme that drives a generator to synthesize informative and diverse data examples.
We conduct extensive experiments on three fundamental vision tasks, demonstrating the superiority of ZAQ over the strong zero-shot baselines.
- Score: 11.722728148523366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model quantization is a promising approach to compress deep neural networks
and accelerate inference, making it possible to deploy them on mobile and edge
devices. To retain the high performance of full-precision models, most existing
quantization methods focus on fine-tuning the quantized model, assuming that
training datasets are accessible. However, this assumption is sometimes not
satisfied in real situations due to data privacy and security issues, which makes
these quantization methods inapplicable. To achieve zero-shot model quantization
without access to training data, a small number of quantization methods adopt
either post-training quantization or batch normalization statistics-guided data
generation for fine-tuning. However, both inevitably suffer from low
performance: the former is largely empirical and lacks training support for
ultra-low-precision quantization, while the latter cannot fully recover the
characteristics of the original data and is often inefficient at generating
diverse data. To address these issues, we propose a zero-shot
adversarial quantization (ZAQ) framework, facilitating effective discrepancy
estimation and knowledge transfer from a full-precision model to its quantized
model. This is achieved by a novel two-level discrepancy modeling scheme that
drives a generator to synthesize informative and diverse data examples, which
are used to optimize the quantized model in an adversarial learning fashion. We
conduct extensive
experiments on three fundamental vision tasks, demonstrating the superiority of
ZAQ over the strong zero-shot baselines and validating the effectiveness of its
main components. Code is available at <https://git.io/Jqc0y>.
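For intuition, the core adversarial loop described above can be sketched in a few lines of PyTorch. This is only an illustrative approximation under stated assumptions (a generator G, a frozen full-precision model P and a quantized model Q that both return logits and an intermediate feature), not the released ZAQ implementation; the actual two-level discrepancy terms differ.

```python
# Minimal PyTorch sketch of the adversarial zero-shot quantization idea.
# Illustrative approximation only: the generator G, the assumption that P and Q
# return (logits, features), and the exact discrepancy terms are placeholders.
import torch
import torch.nn.functional as F

def discrepancy(p_logits, q_logits, p_feat, q_feat):
    # "Two-level" discrepancy, approximated as output-level KL + feature-level L1.
    out_level = F.kl_div(F.log_softmax(q_logits, dim=1),
                         F.softmax(p_logits, dim=1), reduction="batchmean")
    feat_level = (p_feat - q_feat).abs().mean()
    return out_level + feat_level

def zaq_style_step(G, P, Q, opt_g, opt_q, z_dim=128, batch=64, device="cpu"):
    # P: frozen full-precision model; Q: quantized model being fine-tuned.
    # 1) Generator step: synthesize examples on which P and Q disagree the most.
    z = torch.randn(batch, z_dim, device=device)
    x = G(z)
    p_logits, p_feat = P(x)
    q_logits, q_feat = Q(x)
    g_loss = -discrepancy(p_logits, q_logits, p_feat, q_feat)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

    # 2) Knowledge-transfer step: train Q to close the gap on fresh samples.
    x = G(torch.randn(batch, z_dim, device=device)).detach()
    p_logits, p_feat = P(x)
    q_logits, q_feat = Q(x)
    q_loss = discrepancy(p_logits.detach(), q_logits, p_feat.detach(), q_feat)
    opt_q.zero_grad(); q_loss.backward(); opt_q.step()
```

In each iteration the generator hunts for inputs on which the quantized model diverges from its full-precision teacher, and the quantized model is then optimized on exactly those inputs, which is the adversarial dynamic the abstract describes.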
Related papers
- MetaAug: Meta-Data Augmentation for Post-Training Quantization [32.02377559968568]
Post-Training Quantization (PTQ) has received significant attention because it requires only a small set of calibration data to quantize a full-precision model.
We propose a novel meta-learning based approach to enhance the performance of post-training quantization.
arXiv Detail & Related papers (2024-07-20T02:18:51Z)
- PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models [52.09865918265002]
We propose a novel "quantize before fine-tuning" framework, PreQuant.
PreQuant is compatible with various quantization strategies, with outlier-aware fine-tuning incorporated to correct the induced quantization error.
We demonstrate the effectiveness of PreQuant on the GLUE benchmark using BERT, RoBERTa, and T5.
arXiv Detail & Related papers (2023-05-30T08:41:33Z)
- Post-training Model Quantization Using GANs for Synthetic Data Generation [57.40733249681334]
We investigate the use of synthetic data as a substitute for real data in the calibration step of the quantization method.
We compare the performance of models quantized using data generated by StyleGAN2-ADA and our pre-trained DiStyleGAN, with quantization using real data and an alternative data generation method based on fractal images.
arXiv Detail & Related papers (2023-05-10T11:10:09Z)
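As a rough illustration of this calibration-with-synthetic-data idea, the sketch below feeds GAN samples through PyTorch's eager-mode post-training quantization flow. The generator is a placeholder for a pre-trained StyleGAN2-ADA / DiStyleGAN, and the assumption that the float model already contains QuantStub/DeQuantStub pairs is not a detail from the paper.

```python
# Hedged sketch: post-training quantization calibrated on GAN samples instead of
# real images. `generator` stands in for a pre-trained image GAN; `float_model`
# is assumed to contain QuantStub/DeQuantStub as eager-mode quantization requires.
import torch

def synthetic_calibration_batches(generator, n_batches=32, batch_size=16, z_dim=512):
    generator.eval()
    with torch.no_grad():
        for _ in range(n_batches):
            z = torch.randn(batch_size, z_dim)
            yield generator(z)  # fake images stand in for real calibration data

def post_training_quantize(float_model, calibration_batches):
    float_model.eval()
    float_model.qconfig = torch.ao.quantization.get_default_qconfig("fbgemm")
    prepared = torch.ao.quantization.prepare(float_model)    # insert observers
    with torch.no_grad():
        for images in calibration_batches:                   # observers record ranges
            prepared(images)
    return torch.ao.quantization.convert(prepared)           # int8 model
```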
- Post-hoc Uncertainty Learning using a Dirichlet Meta-Model [28.522673618527417]
We propose a novel Bayesian meta-model to augment pre-trained models with better uncertainty quantification abilities.
Our proposed method requires no additional training data and is flexible enough to quantify different uncertainties.
We demonstrate our proposed meta-model approach's flexibility and superior empirical performance on these applications.
arXiv Detail & Related papers (2022-12-14T17:34:11Z)
- Vertical Layering of Quantized Neural Networks for Heterogeneous Inference [57.42762335081385]
We study a new vertical-layered representation of neural network weights for encapsulating all quantized models into a single one.
We can theoretically achieve any precision network for on-demand service while only needing to train and maintain one model.
arXiv Detail & Related papers (2022-12-10T15:57:38Z)
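One plausible reading of such a vertical-layered weight representation is a bit-plane decomposition, where lower-precision models reuse the most significant planes of a single stored model; the sketch below is illustrative only, and the paper's actual encoding and training procedure may differ.

```python
# Illustrative bit-plane view of a "vertical-layered" weight representation:
# an n-bit unsigned integer weight tensor is split into bit-planes, so a
# lower-precision variant can be read off from the most significant planes.
import torch

def to_bit_planes(w_int, n_bits=8):
    # w_int: integer tensor with values in [0, 2**n_bits); MSB plane first.
    return [(w_int >> b) & 1 for b in reversed(range(n_bits))]

def from_top_planes(planes, keep_bits):
    # Reconstruct an integer weight tensor using only the top `keep_bits` planes.
    total = len(planes)
    w = torch.zeros_like(planes[0])
    for i in range(keep_bits):
        w = w + (planes[i] << (total - 1 - i))
    return w
```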
- Genie: Show Me the Data for Quantization [2.7286395031146062]
We introduce a post-training quantization scheme for zero-shot quantization that produces high-quality quantized networks within a few hours.
We also propose a post-training quantization algorithm to enhance the performance of quantized models.
arXiv Detail & Related papers (2022-12-09T11:18:40Z)
- Mixed-Precision Inference Quantization: Radically Towards Faster inference speed, Lower Storage requirement, and Lower Loss [4.877532217193618]
Existing quantization techniques rely heavily on experience and "fine-tuning" skills.
This study provides a methodology for acquiring a mixed-precision quantization model with a lower loss than the full-precision model.
In particular, we will demonstrate that neural networks with massive identity mappings are resistant to the quantization method.
arXiv Detail & Related papers (2022-07-20T10:55:34Z)
- ClusterQ: Semantic Feature Distribution Alignment for Data-Free Quantization [111.12063632743013]
We propose a new and effective data-free quantization method termed ClusterQ.
To obtain high inter-class separability of semantic features, we cluster and align the feature distribution statistics.
We also incorporate the intra-class variance to solve class-wise mode collapse.
arXiv Detail & Related papers (2022-04-30T06:58:56Z)
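A heavily simplified sketch of class-wise feature-statistic alignment is given below; the per-class means, the squared-error alignment term, and the variance weight are assumptions made for illustration, not ClusterQ's actual objective.

```python
# Simplified sketch of class-wise feature-statistic alignment for data-free
# quantization: generated samples of class c are pulled toward stored per-class
# feature means, while an intra-class variance term discourages mode collapse.
import torch

def feature_alignment_loss(features, labels, class_means, var_weight=0.1):
    # features: (B, D) features of generated samples from the full-precision model
    # labels:   (B,)  target classes used to condition the generator
    # class_means: (C, D) clustered per-class feature statistics
    align = (features - class_means[labels]).pow(2).mean()
    # keep some intra-class spread to discourage class-wise mode collapse
    intra_var = features.var(dim=0).mean()
    return align - var_weight * intra_var
```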
- Generalization Metrics for Practical Quantum Advantage in Generative Models [68.8204255655161]
Generative modeling is a widely accepted natural use case for quantum computers.
We construct a simple and unambiguous approach to probe practical quantum advantage for generative modeling by measuring the algorithm's generalization performance.
Our simulation results show that our quantum-inspired models have up to a $68\times$ enhancement in generating unseen unique and valid samples.
arXiv Detail & Related papers (2022-01-21T16:35:35Z)
- Contextual Dropout: An Efficient Sample-Dependent Dropout Module [60.63525456640462]
Dropout has been demonstrated as a simple and effective module to regularize the training process of deep neural networks.
We propose contextual dropout with an efficient structural design as a simple and scalable sample-dependent dropout module.
Our experimental results show that the proposed method outperforms baseline methods in terms of both accuracy and quality of uncertainty estimation.
arXiv Detail & Related papers (2021-03-06T19:30:32Z)
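The general shape of a sample-dependent dropout module can be sketched as follows; the linear gate and hard Bernoulli sampling are simplifications (the paper uses an efficient structural design with a relaxed, learnable parameterization), so treat this as a conceptual illustration only.

```python
# Conceptual sketch of sample-dependent dropout: keep probabilities are
# predicted from the input itself rather than fixed globally.
import torch
import torch.nn as nn

class SampleDependentDropout(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(dim, dim)  # predicts per-unit keep logits from x

    def forward(self, x):
        if not self.training:
            return x  # no dropout at inference time
        keep_prob = torch.sigmoid(self.gate(x))        # depends on each sample
        mask = torch.bernoulli(keep_prob)              # stochastic binary mask
        return x * mask / keep_prob.clamp_min(1e-6)    # inverted-dropout scaling
```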
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.