Enhancing Generalization in Data-free Quantization via Mixup-class Prompting
- URL: http://arxiv.org/abs/2507.21947v1
- Date: Tue, 29 Jul 2025 16:00:20 GMT
- Title: Enhancing Generalization in Data-free Quantization via Mixup-class Prompting
- Authors: Jiwoong Park, Chaeun Lee, Yongseok Choi, Sein Park, Deokki Hong, Jungwook Choi
- Abstract summary: Post-training quantization (PTQ) improves efficiency but struggles with limited calibration data, especially under privacy constraints. Data-free quantization (DFQ) mitigates this by generating synthetic images using generative models such as generative adversarial networks (GANs) and text-conditioned latent diffusion models (LDMs). We propose \textbf{mixup-class prompt}, a mixup-based text prompting strategy that fuses multiple class labels at the text prompt level to generate diverse, robust synthetic data.
- Score: 8.107092196905157
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Post-training quantization (PTQ) improves efficiency but struggles with limited calibration data, especially under privacy constraints. Data-free quantization (DFQ) mitigates this by generating synthetic images using generative models such as generative adversarial networks (GANs) and text-conditioned latent diffusion models (LDMs), while applying existing PTQ algorithms. However, the relationship between generated synthetic images and the generalizability of the quantized model during PTQ remains underexplored. Without investigating this relationship, synthetic images generated by previous prompt engineering methods based on single-class prompts suffer from issues such as polysemy, leading to performance degradation. We propose \textbf{mixup-class prompt}, a mixup-based text prompting strategy that fuses multiple class labels at the text prompt level to generate diverse, robust synthetic data. This approach enhances generalization, and improves optimization stability in PTQ. We provide quantitative insights through gradient norm and generalization error analysis. Experiments on convolutional neural networks (CNNs) and vision transformers (ViTs) show that our method consistently outperforms state-of-the-art DFQ methods like GenQ. Furthermore, it pushes the performance boundary in extremely low-bit scenarios, achieving new state-of-the-art accuracy in challenging 2-bit weight, 4-bit activation (W2A4) quantization.
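To make the core idea concrete, here is a minimal sketch of mixup-class prompting for generating DFQ calibration images. The class list, prompt template, sample count, and the use of the diffusers StableDiffusionPipeline are illustrative assumptions, not the authors' released recipe.

```python
# Hedged sketch of mixup-class prompting for DFQ calibration data.
# The prompt template and pipeline choice are assumptions for illustration;
# the paper's exact generation setup may differ.
import random

import torch
from diffusers import StableDiffusionPipeline

# Small illustrative subset of class names (hypothetical choice).
CLASS_NAMES = ["goldfish", "tabby cat", "sports car", "acoustic guitar"]

def mixup_class_prompt(class_names, k=2):
    """Fuse k distinct class labels into one text prompt."""
    picked = random.sample(class_names, k)
    return "a photo of a " + " and a ".join(picked)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Generate a small synthetic calibration set for PTQ.
calibration_images = [
    pipe(mixup_class_prompt(CLASS_NAMES), num_inference_steps=30).images[0]
    for _ in range(8)
]
```

Fusing two labels in one prompt pushes the LDM away from the most prototypical (and potentially polysemous) rendering of any single class, which is the diversity effect the abstract attributes to mixup-class prompting.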
Related papers
- QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution [53.13952833016505]
We propose a low-bit quantization model for real-world video super-resolution (VSR). We use a calibration dataset to measure both spatial and temporal complexity for each layer. We refine the FP and low-bit branches to achieve simultaneous optimization.
arXiv Detail & Related papers (2025-08-06T14:35:59Z)
- DFQ-ViT: Data-Free Quantization for Vision Transformers without Fine-tuning [9.221916791064407]
Data-Free Quantization (DFQ) enables the quantization of Vision Transformers (ViTs) without requiring access to data, allowing for the deployment of ViTs on devices with limited resources. Existing methods fail to fully capture and balance the global and local features within the samples, resulting in limited synthetic data quality. We propose a pipeline for Data-Free Quantization for Vision Transformers (DFQ-ViT).
arXiv Detail & Related papers (2025-07-19T04:32:04Z)
- FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation [55.12070409045766]
Post-training quantization (PTQ) has stood out as a cost-effective and promising model compression paradigm in recent years. Current PTQ methods for Vision Transformers (ViTs) still suffer from significant accuracy degradation, especially under low-bit quantization.
arXiv Detail & Related papers (2025-06-13T07:57:38Z)
- Post-Training Quantization for Video Matting [20.558324038808664]
Video matting is crucial for applications such as film production and virtual reality. Post-Training Quantization (PTQ) is still in its nascent stages for video matting. This paper proposes a novel and general PTQ framework specifically designed for video matting models.
arXiv Detail & Related papers (2025-06-12T15:57:14Z)
- MathPhys-Guided Coarse-to-Fine Anomaly Synthesis with SQE-Driven Bi-Level Optimization for Anomaly Detection [30.77558600436759]
We introduce a novel and lightweight pipeline that generates synthetic anomalies through Math-Phys model guidance. Our method produces realistic defect masks, which are subsequently enhanced in two phases. To validate our method, we conduct experiments on three anomaly detection benchmarks: MVTec AD, VisA, and BTAD.
arXiv Detail & Related papers (2025-04-17T14:22:27Z)
- APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers [71.2294205496784]
We propose \textbf{APHQ-ViT}, a novel PTQ approach based on importance estimation with Average Perturbation Hessian (APH). We show that APHQ-ViT using linear quantizers outperforms existing PTQ methods by substantial margins in 3-bit and 4-bit quantization across different vision tasks.
arXiv Detail & Related papers (2025-04-03T11:48:56Z)
- Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers [58.14748181398049]
Data-free quantization (DFQ) enables model quantization without accessing real data, addressing concerns regarding data security and privacy. With the growing adoption of Vision Transformers (ViTs), DFQ for ViTs has garnered significant attention. We propose SARDFQ, a novel Semantics Alignment and Reinforcement Data-Free Quantization method for ViTs.
arXiv Detail & Related papers (2024-12-21T09:30:45Z)
- Pushing the Limits of Large Language Model Quantization via the Linearity Theorem [71.3332971315821]
We present a "line theoremarity" establishing a direct relationship between the layer-wise $ell$ reconstruction error and the model perplexity increase due to quantization.
This insight enables two novel applications: (1) a simple data-free LLM quantization method using Hadamard rotations and MSE-optimal grids, dubbed HIGGS, and (2) an optimal solution to the problem of finding non-uniform per-layer quantization levels.
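As a rough illustration of the claimed relationship (schematic only; the layer coefficients $c_l$ and the exact error metric are assumptions here, not the paper's precise statement), the linearity relation takes a form like

$$\Delta\mathrm{PPL} \;\approx\; \sum_{l} c_{l}\,\bigl\lVert \widehat{W}_{l} - W_{l} \bigr\rVert_{2}^{2},$$

where $W_l$ and $\widehat{W}_l$ are the original and quantized weights of layer $l$.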
arXiv Detail & Related papers (2024-11-26T15:35:44Z)
- PTQ4ADM: Post-Training Quantization for Efficient Text Conditional Audio Diffusion Models [8.99127212785609]
This work introduces PTQ4ADM, a novel framework for quantizing audio diffusion models (ADMs).
Our key contributions include (1) a coverage-driven prompt augmentation method and (2) an activation-aware calibration set generation algorithm for text-conditional ADMs.
Extensive experiments demonstrate PTQ4ADM's capability to reduce the model size by up to 70% while achieving synthesis quality metrics comparable to full-precision models.
arXiv Detail & Related papers (2024-09-20T20:52:56Z)
- EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models [8.742501879586309]
Quantization can effectively reduce model complexity, and post-training quantization (PTQ) is highly promising for compressing and accelerating diffusion models. Existing PTQ methods suffer from distribution mismatch issues at both calibration sample level and reconstruction output level. We propose EDA-DM, a standardized PTQ method that efficiently addresses the above issues.
arXiv Detail & Related papers (2024-01-09T14:42:49Z)
- ClusterQ: Semantic Feature Distribution Alignment for Data-Free Quantization [111.12063632743013]
We propose a new and effective data-free quantization method termed ClusterQ.
To obtain high inter-class separability of semantic features, we cluster and align the feature distribution statistics.
We also incorporate the intra-class variance to solve class-wise mode collapse.
arXiv Detail & Related papers (2022-04-30T06:58:56Z)
- Sequence-Level Mixed Sample Data Augmentation [119.94667752029143]
This work proposes a simple data augmentation approach to encourage compositional behavior in neural models for sequence-to-sequence problems.
Our approach, SeqMix, creates new synthetic examples by softly combining input/output sequences from the training set; a minimal sketch of this soft-combination idea appears below.
arXiv Detail & Related papers (2020-11-18T02:18:04Z)
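For context on the "soft combination" idea, here is a hedged sketch in the spirit of SeqMix; the Beta-sampled mixing coefficient and embedding-level mixing are assumptions for illustration, not SeqMix's exact formulation.

```python
# Hedged sketch of sequence-level soft mixup in the spirit of SeqMix.
# Mixing at the embedding level with a Beta-sampled coefficient is an
# illustrative assumption, not necessarily the paper's exact scheme.
import torch

def soft_seqmix(emb_a: torch.Tensor, emb_b: torch.Tensor, alpha: float = 0.2):
    """Softly combine two length-aligned (seq_len, dim) embedding sequences.

    Returns the mixed embeddings and the coefficient lambda, which would
    also weight the corresponding output sequences or losses.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    return lam * emb_a + (1.0 - lam) * emb_b, lam

# Toy usage: two padded "sequences" of 5 tokens with 8-dim embeddings.
a, b = torch.randn(5, 8), torch.randn(5, 8)
mixed, lam = soft_seqmix(a, b)
```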