Data Generation for Hardware-Friendly Post-Training Quantization
- URL: http://arxiv.org/abs/2410.22110v1
- Date: Tue, 29 Oct 2024 15:08:50 GMT
- Title: Data Generation for Hardware-Friendly Post-Training Quantization
- Authors: Lior Dikstein, Ariel Lapid, Arnon Netzer, Hai Victor Habi
- Abstract summary: Zero-shot quantization (ZSQ) using synthetic data is a key approach for post-training quantization (PTQ) under privacy and security constraints.
Existing data generation methods often struggle to effectively generate data suitable for hardware-friendly quantization.
We propose Data Generation for Hardware-friendly quantization (DGH), a novel method that addresses these gaps.
- Abstract: Zero-shot quantization (ZSQ) using synthetic data is a key approach for post-training quantization (PTQ) under privacy and security constraints. However, existing data generation methods often struggle to effectively generate data suitable for hardware-friendly quantization, where all model layers are quantized. We analyze existing data generation methods based on batch normalization (BN) matching and identify several gaps between synthetic and real data: 1) Current generation algorithms do not optimize the entire synthetic dataset simultaneously; 2) Data augmentations applied during training are often overlooked; and 3) A distribution shift occurs in the final model layers due to the absence of BN in those layers. These gaps negatively impact ZSQ performance, particularly in hardware-friendly quantization scenarios. In this work, we propose Data Generation for Hardware-friendly quantization (DGH), a novel method that addresses these gaps. DGH jointly optimizes all generated images, regardless of the image set size or GPU memory constraints. To address data augmentation mismatches, DGH includes a preprocessing stage that mimics the augmentation process and enhances image quality by incorporating natural image priors. Finally, we propose a new distribution-stretching loss that aligns the support of the feature map distribution between real and synthetic data. This loss is applied to the model's output and can be adapted to various tasks. DGH demonstrates significant improvements in quantization performance across multiple tasks, achieving up to a 30% increase in accuracy for hardware-friendly ZSQ in both classification and object detection, often performing on par with real data.
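As a rough illustration of the BN-matching objective that DGH and related methods build on, the following PyTorch sketch optimizes a batch of synthetic images so that the statistics entering each BatchNorm layer match that layer's stored running statistics. This is a minimal sketch of the generic technique only: DGH's joint optimization over the whole image set, augmentation-mimicking preprocessing, and distribution-stretching loss are omitted, and the model choice, batch size, learning rate, and step count are placeholder assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Pretrained float model whose BN statistics serve as the data-free target.
model = models.resnet18(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)

# Record each BN layer's stored statistics and hook its input.
bn_targets, captured = [], []

def make_hook(store):
    def hook(module, inputs, output):
        x = inputs[0]
        store.append((x.mean(dim=(0, 2, 3)), x.var(dim=(0, 2, 3))))
    return hook

for m in model.modules():
    if isinstance(m, nn.BatchNorm2d):
        bn_targets.append((m.running_mean, m.running_var))
        m.register_forward_hook(make_hook(captured))

# Optimize the synthetic batch so its BN-input statistics match the targets.
images = torch.randn(32, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([images], lr=0.1)

for step in range(500):
    captured.clear()
    opt.zero_grad()
    model(images)
    loss = sum(
        (mu - rm).pow(2).sum() + (var - rv).pow(2).sum()
        for (mu, var), (rm, rv) in zip(captured, bn_targets)
    )
    loss.backward()
    opt.step()
```

Per the abstract, DGH additionally accumulates such statistics so the entire image set is optimized jointly regardless of GPU memory constraints; that bookkeeping is not shown here.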
Related papers
- GenQ: Quantization in Low Data Regimes with Generative Synthetic Data [28.773641633757283]
We introduce GenQ, a novel approach employing an advanced Generative AI model to generate high-resolution synthetic data.
When limited real data is available, it is used to guide the synthetic data generation process.
Through rigorous experimentation, GenQ establishes new benchmarks in data-free and data-scarce quantization.
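GenQ's exact pipeline is not reproduced here; as a loose sketch, generating class-conditional calibration images with an off-the-shelf text-to-image model (via Hugging Face diffusers) might look as follows. The checkpoint name, prompts, and class labels are illustrative assumptions, and a CUDA device is assumed.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Generate a handful of calibration images per class label.
labels = ["goldfish", "tabby cat", "sports car"]  # placeholder class names
calibration_images = []
for label in labels:
    out = pipe(f"a photo of a {label}", num_images_per_prompt=4)
    calibration_images.extend(out.images)  # PIL images for PTQ calibration
```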
arXiv Detail & Related papers (2023-12-07T23:31:42Z)
- SqueezeLLM: Dense-and-Sparse Quantization [80.32162537942138]
The main bottleneck for single-batch generative inference with LLMs is memory bandwidth rather than compute.
We introduce SqueezeLLM, a post-training quantization framework that enables lossless compression at ultra-low precision, down to 3 bits.
Our framework incorporates two novel ideas: (i) sensitivity-based non-uniform quantization, which searches for the optimal bit precision assignment based on second-order information; and (ii) the Dense-and-Sparse decomposition that stores outliers and sensitive weight values in an efficient sparse format.
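A simplified sketch of the Dense-and-Sparse idea: weights above a magnitude threshold are kept exactly in a sparse matrix, and the dense remainder is quantized to low precision. Plain uniform quantization stands in for SqueezeLLM's sensitivity-based non-uniform codebooks here, and the outlier fraction and bit-width are illustrative.

```python
import torch

def dense_and_sparse(w: torch.Tensor, bits: int = 3, outlier_pct: float = 0.5):
    """Split w into full-precision sparse outliers plus a uniformly
    quantized dense remainder. Simplified sketch: SqueezeLLM actually
    uses sensitivity-based *non-uniform* codebooks."""
    thresh = torch.quantile(w.abs().flatten(), 1 - outlier_pct / 100)
    outlier_mask = w.abs() > thresh
    sparse = (w * outlier_mask).to_sparse()   # exact outliers, sparse format
    dense = w * ~outlier_mask                 # remainder to be quantized
    scale = dense.abs().max() / (2 ** (bits - 1) - 1)
    q = torch.clamp((dense / scale).round(),
                    -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q.to(torch.int8), scale, sparse

def reconstruct(q, scale, sparse):
    return q.float() * scale + sparse.to_dense()

w = torch.randn(256, 256)
q, scale, sparse = dense_and_sparse(w)
err = (reconstruct(w := w, q=q, scale=scale, sparse=sparse) - w).abs().mean() if False else (reconstruct(q, scale, sparse) - w).abs().mean()
```

Keeping the small outlier set exact lets the dense part use a much tighter quantization range, which is the point of the decomposition.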
arXiv Detail & Related papers (2023-06-13T08:57:54Z)
- Post-training Model Quantization Using GANs for Synthetic Data Generation [57.40733249681334]
We investigate the use of synthetic data as a substitute for real calibration data in post-training quantization.
We compare the performance of models quantized using data generated by StyleGAN2-ADA and our pre-trained DiStyleGAN, with quantization using real data and an alternative data generation method based on fractal images.
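Whatever produces the synthetic images, calibration with them follows a standard PTQ pattern: run the batch through the float model while recording per-layer activation ranges, then derive quantization parameters from those ranges. The hand-rolled observer below is a generic sketch, not the paper's pipeline; the model and batch are placeholders.

```python
import torch
import torch.nn as nn
import torchvision.models as models

model = models.mobilenet_v2(weights="IMAGENET1K_V1").eval()

# Record min/max of each activation output over the calibration batch.
ranges = {}

def observe(name):
    def hook(module, inputs, output):
        lo, hi = output.min().item(), output.max().item()
        old = ranges.get(name, (lo, hi))
        ranges[name] = (min(old[0], lo), max(old[1], hi))
    return hook

for name, m in model.named_modules():
    if isinstance(m, nn.ReLU6):
        m.register_forward_hook(observe(name))

synthetic = torch.randn(64, 3, 224, 224)  # stand-in for GAN samples
with torch.no_grad():
    model(synthetic)

# 8-bit affine quantization parameters (scale, offset) per observed layer.
qparams = {n: ((hi - lo) / 255.0, lo) for n, (lo, hi) in ranges.items()}
```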
arXiv Detail & Related papers (2023-05-10T11:10:09Z)
- LD-GAN: Low-Dimensional Generative Adversarial Network for Spectral Image Generation with Variance Regularization [72.4394510913927]
Deep learning methods are state-of-the-art for spectral image (SI) computational tasks.
GANs enable diverse augmentation by learning and sampling from the data distribution.
GAN-based SI generation is challenging because the high dimensionality of such data hinders GAN training convergence, yielding suboptimal generation.
We propose a statistical regularization that controls the variance of the low-dimensional representation during autoencoder training, so that the GAN trained in that space generates highly diverse samples.
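The variance regularization can be sketched as an extra penalty on the autoencoder's latent codes that keeps each latent dimension's batch variance near a target value; the target and weighting below are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def variance_regularizer(z: torch.Tensor, target_var: float = 1.0):
    """Penalize deviation of each latent dimension's batch variance
    from a target value (z: [batch, latent_dim])."""
    var = z.var(dim=0)
    return ((var - target_var) ** 2).mean()

# Usage inside an autoencoder training step (sketch):
# z = encoder(x)
# loss = recon_loss(decoder(z), x) + lam * variance_regularizer(z)
```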
arXiv Detail & Related papers (2023-04-29T00:25:02Z)
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- ClusterQ: Semantic Feature Distribution Alignment for Data-Free Quantization [111.12063632743013]
We propose a new and effective data-free quantization method termed ClusterQ.
To obtain high inter-class separability of semantic features, we cluster and align the feature distribution statistics.
We also incorporate intra-class variance to alleviate class-wise mode collapse.
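A loose sketch of the "cluster and align" idea: pull each synthetic sample's feature toward its assigned semantic centroid while rewarding within-batch spread so classes do not collapse onto single points. The loss form and weighting are assumptions, not ClusterQ's exact formulation.

```python
import torch

def cluster_alignment_loss(feats, assignments, centroids, spread_weight=0.1):
    """feats: [batch, dim] synthetic features; assignments: [batch] cluster
    ids; centroids: [num_clusters, dim]. Align to centroids, keep spread."""
    centers = centroids[assignments]                  # [batch, dim]
    align = (feats - centers).pow(2).sum(dim=1).mean()
    spread = feats.var(dim=0).mean()                  # anti mode collapse
    return align - spread_weight * spread
```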
arXiv Detail & Related papers (2022-04-30T06:58:56Z)
- Diverse Sample Generation: Pushing the Limit of Data-free Quantization [85.95032037447454]
This paper presents a generic Diverse Sample Generation (DSG) scheme for generative data-free post-training quantization and quantization-aware training.
For large-scale image classification tasks, our DSG can consistently outperform existing data-free quantization methods.
arXiv Detail & Related papers (2021-09-01T07:06:44Z)
- Zero-shot Adversarial Quantization [11.722728148523366]
We propose a zero-shot adversarial quantization (ZAQ) framework, facilitating effective discrepancy estimation and knowledge transfer.
This is achieved by novel two-level discrepancy modeling that drives a generator to synthesize informative and diverse data examples.
We conduct extensive experiments on three fundamental vision tasks, demonstrating the superiority of ZAQ over the strong zero-shot baselines.
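The adversarial scheme can be sketched as a min-max game: a generator maximizes the output discrepancy between the full-precision and quantized models, and the quantized model is then fine-tuned to minimize it. ZAQ's two-level (output- and feature-level) modeling is collapsed to the output level here, and all modules and hyperparameters are placeholders.

```python
import torch
import torch.nn.functional as F

def adversarial_zsq_step(generator, fp_model, q_model, g_opt, q_opt, z_dim=128):
    """One min-max round. fp_model is the frozen full-precision network
    (requires_grad disabled); q_model is a differentiable fake-quantized
    copy being fine-tuned."""
    z = torch.randn(64, z_dim)

    # Generator step: synthesize inputs on which the two models disagree most.
    x = generator(z)
    disc = F.l1_loss(q_model(x), fp_model(x))
    g_opt.zero_grad()
    (-disc).backward()  # gradient ascent on the discrepancy
    g_opt.step()

    # Quantized-model step: shrink the discrepancy on fresh samples.
    x = generator(z).detach()
    loss = F.l1_loss(q_model(x), fp_model(x).detach())
    q_opt.zero_grad()
    loss.backward()
    q_opt.step()
    return loss.item()
```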
arXiv Detail & Related papers (2021-03-29T01:33:34Z)
- Diversifying Sample Generation for Accurate Data-Free Quantization [35.38029335993735]
We propose the Diverse Sample Generation (DSG) scheme to mitigate the adverse effects of sample homogenization.
Our scheme is versatile and can even be applied to state-of-the-art post-training quantization methods such as AdaRound.
arXiv Detail & Related papers (2021-03-01T14:46:02Z)
- Generative Zero-shot Network Quantization [41.75769117366117]
Convolutional neural networks are able to learn realistic image priors from numerous training samples in low-level image generation and restoration.
We show that, for high-level image recognition tasks, we can further reconstruct "realistic" images of each category by leveraging intrinsic Batch Normalization (BN) statistics without any training data.
arXiv Detail & Related papers (2021-01-21T04:10:04Z)