Data-Free Network Quantization With Adversarial Knowledge Distillation
- URL: http://arxiv.org/abs/2005.04136v1
- Date: Fri, 8 May 2020 16:24:55 GMT
- Title: Data-Free Network Quantization With Adversarial Knowledge Distillation
- Authors: Yoojin Choi, Jihwan Choi, Mostafa El-Khamy, Jungwon Lee
- Abstract summary: In this paper, we consider data-free network quantization with synthetic data.
The synthetic data are produced by a generator, and no original data are used either to train the generator or to perform quantization.
We show the gain of producing diverse adversarial samples by using multiple generators and multiple students.
- Score: 39.92282726292386
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Network quantization is an essential procedure in deep learning for developing efficient fixed-point inference models on mobile or edge platforms. However, as datasets grow larger and privacy regulations become stricter, data sharing for model compression becomes more difficult and restricted. In this paper, we consider data-free network quantization with synthetic data. The synthetic data are produced by a generator, while no original data are used either to train the generator or to perform quantization. To this end, we propose data-free adversarial knowledge distillation, which minimizes the maximum distance between the outputs of the teacher and the (quantized) student over adversarial samples from a generator. To generate adversarial samples similar to the original data, we additionally propose matching the statistics of the batch normalization layers in the teacher between generated data and the original data. Furthermore, we show the gain of producing diverse adversarial samples by using multiple generators and multiple students. Our experiments show state-of-the-art data-free model compression and quantization results for (wide) residual networks and MobileNet on the SVHN, CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets. The accuracy losses compared to using the original datasets are minimal.
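To make the training scheme above concrete, the following is a minimal PyTorch-style sketch of one data-free adversarial distillation step: the generator is updated to maximize the teacher-student output distance while matching the teacher's batch-normalization statistics, and the (quantized) student is updated to minimize that distance on fresh generated samples. The network definitions, optimizers, hyper-parameters, and helper names (collect_bn_inputs, bn_stat_loss, kd_distance, adversarial_kd_step) are assumptions made for this summary, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def collect_bn_inputs(model):
    """Register forward hooks that record the input each BatchNorm2d layer of
    `model` sees on the current batch. Returns the shared dict and the hook
    handles (remove the handles when done)."""
    acts, handles = {}, []
    for m in model.modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            handles.append(m.register_forward_hook(
                lambda bn, inp, out: acts.__setitem__(bn, inp[0])))
    return acts, handles

def bn_stat_loss(acts):
    """Distance between the batch statistics of generated data and the running
    mean/variance stored in the teacher's BatchNorm layers."""
    loss = 0.0
    for bn, x in acts.items():
        mean = x.mean(dim=(0, 2, 3))
        var = x.var(dim=(0, 2, 3), unbiased=False)
        loss = loss + F.mse_loss(mean, bn.running_mean) + F.mse_loss(var, bn.running_var)
    return loss

def kd_distance(t_logits, s_logits):
    """KL divergence between teacher and student soft outputs."""
    return F.kl_div(F.log_softmax(s_logits, dim=1),
                    F.softmax(t_logits, dim=1), reduction="batchmean")

def adversarial_kd_step(teacher, student, generator, g_opt, s_opt, bn_acts,
                        batch_size=64, z_dim=128, bn_weight=1.0, device="cpu"):
    # Maximization: the generator seeks samples on which the (quantized)
    # student disagrees most with the teacher, while keeping generated-batch
    # statistics close to the teacher's stored BN statistics.
    z = torch.randn(batch_size, z_dim, device=device)
    fake = generator(z)
    g_loss = -kd_distance(teacher(fake), student(fake)) + bn_weight * bn_stat_loss(bn_acts)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

    # Minimization: the student mimics the teacher on fresh generated samples.
    z = torch.randn(batch_size, z_dim, device=device)
    fake = generator(z).detach()
    s_loss = kd_distance(teacher(fake).detach(), student(fake))
    s_opt.zero_grad(); s_loss.backward(); s_opt.step()
    return g_loss.item(), s_loss.item()
```

In practice one would register the hooks once on the pre-trained teacher, e.g. bn_acts, handles = collect_bn_inputs(teacher), keep the teacher in eval mode so its BatchNorm layers use the stored running statistics, and then call adversarial_kd_step repeatedly while applying the chosen quantization scheme to the student.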
Related papers
- Learning Privacy-Preserving Student Networks via Discriminative-Generative Distillation [24.868697898254368]
Deep models may pose a privacy leakage risk in practical deployment.
We propose a discriminative-generative distillation approach to learn privacy-preserving deep models.
Our approach can control query cost over private data and accuracy degradation in a unified manner.
arXiv Detail & Related papers (2024-09-04T03:06:13Z)
- Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
- Synthetic data, real errors: how (not) to publish and use synthetic data [86.65594304109567]
We show how the generative process affects the downstream ML task.
We introduce Deep Generative Ensemble (DGE) to approximate the posterior distribution over the generative process model parameters.
arXiv Detail & Related papers (2023-05-16T07:30:29Z)
- Post-training Model Quantization Using GANs for Synthetic Data Generation [57.40733249681334]
We investigate the use of synthetic data as a substitute for real calibration data in post-training quantization.
We compare the performance of models quantized using data generated by StyleGAN2-ADA and our pre-trained DiStyleGAN, with quantization using real data and an alternative data generation method based on fractal images.
arXiv Detail & Related papers (2023-05-10T11:10:09Z)
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- FairGen: Fair Synthetic Data Generation [0.3149883354098941]
We propose a pipeline to generate fairer synthetic data independent of the GAN architecture.
We claim that while generating synthetic data, most GANs amplify the bias present in the training data, but by removing these bias-inducing samples, GANs focus more on the real informative samples.
arXiv Detail & Related papers (2022-10-24T08:13:47Z)
- Preventing Catastrophic Forgetting and Distribution Mismatch in Knowledge Distillation via Synthetic Data [5.064036314529226]
We propose a data-free KD framework that maintains a dynamic collection of generated samples over time.
Our experiments demonstrate that we can improve the accuracy of the student models obtained via KD when compared with state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T08:11:08Z)
- Lessons Learned from the Training of GANs on Artificial Datasets [0.0]
Generative Adversarial Networks (GANs) have made great progress in synthesizing realistic images in recent years.
GANs are prone to underfitting or overfitting, which makes analyzing them difficult and constrained.
We train them on artificial datasets where there are infinitely many samples and the real data distributions are simple.
We find that training mixtures of GANs leads to more performance gain compared to increasing the network depth or width.
arXiv Detail & Related papers (2020-07-13T14:51:02Z)
- XOR Mixup: Privacy-Preserving Data Augmentation for One-Shot Federated Learning [49.130350799077114]
We develop a privacy-preserving XOR based mixup data augmentation technique, coined XorMixup.
The core idea is to collect encoded data samples from other devices that can be decoded only with each device's own data samples.
XorMixFL achieves up to 17.6% higher accuracy than Vanilla FL under a non-IID MNIST dataset.
arXiv Detail & Related papers (2020-06-09T09:43:41Z)
- Generative Low-bitwidth Data Free Quantization [44.613912463011545]
We propose Generative Low-bitwidth Data Free Quantization (GDFQ) to remove the data dependence burden.
With the help of generated data, we can quantize a model by learning knowledge from the pre-trained model.
Our method achieves much higher accuracy on 4-bit quantization than the existing data-free quantization method.
arXiv Detail & Related papers (2020-03-07T16:38:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.