Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement
Learning
- URL: http://arxiv.org/abs/2302.04453v1
- Date: Thu, 9 Feb 2023 06:14:00 GMT
- Title: Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement
Learning
- Authors: Yingchun Wang and Jingcai Guo and Song Guo and Weizhan Zhang
- Abstract summary: Mixed-precision quantization mostly predetermines the model bit-width settings before actual training.
We propose a novel Data Quality-aware Mixed-precision Quantization framework, dubbed DQMQ, to dynamically adapt quantization bit-widths to different data qualities.
- Score: 22.31766292657812
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mixed-precision quantization mostly predetermines the model bit-width
settings before actual training due to the non-differentiable bit-width sampling
process, which yields sub-optimal performance. Worse still, the conventional
static quality-consistent training setting, i.e., all data is assumed to be of
the same quality across training and inference, overlooks data quality changes
in real-world applications which may lead to poor robustness of the quantized
models. In this paper, we propose a novel Data Quality-aware Mixed-precision
Quantization framework, dubbed DQMQ, to dynamically adapt quantization
bit-widths to different data qualities. The adaptation is based on a bit-width
decision policy that can be learned jointly with the quantization training.
Concretely, DQMQ is modeled as a hybrid reinforcement learning (RL) task that
combines model-based policy optimization with supervised quantization training.
By relaxing the discrete bit-width sampling to a continuous probability
distribution encoded with a few learnable parameters, DQMQ is
differentiable and can be directly optimized end-to-end with a hybrid
optimization target considering both task performance and quantization
benefits. Trained on mixed-quality image datasets, DQMQ can implicitly select
the most appropriate bit-width for each layer when facing uneven input qualities.
Extensive experiments on various benchmark datasets and networks demonstrate
the superiority of DQMQ against existing fixed/mixed-precision quantization
methods.
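To make the relaxation concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' implementation) of one way to turn a discrete per-layer bit-width choice into a differentiable soft selection and to combine a task loss with a quantization-cost term. The candidate bit-widths, the Gumbel-softmax relaxation, and the uniform fake quantizer are illustrative assumptions rather than details taken from the paper.

```python
# Minimal, hypothetical sketch (not the authors' code): relaxing a discrete
# per-layer bit-width choice into a differentiable soft selection, loosely in
# the spirit of the continuous relaxation described in the abstract. The
# candidate bit-widths, the Gumbel-softmax relaxation, and the uniform fake
# quantizer are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Uniform symmetric fake quantization with a straight-through estimator."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    w_q = torch.round(w / scale).clamp(-qmax, qmax) * scale
    return w + (w_q - w).detach()  # forward uses w_q, gradient flows through w


class SoftBitwidthConv(nn.Module):
    """Conv layer whose effective precision is a learnable soft mixture of bit-widths."""

    def __init__(self, in_ch, out_ch, k, candidate_bits=(2, 4, 8)):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.candidate_bits = candidate_bits
        # A few learnable logits encode the bit-width probability distribution.
        self.bit_logits = nn.Parameter(torch.zeros(len(candidate_bits)))

    def forward(self, x, tau: float = 1.0):
        # Differentiable sample from the relaxed categorical distribution.
        probs = F.gumbel_softmax(self.bit_logits, tau=tau, hard=False)
        # Blend fake-quantized weights instead of hard-picking one bit-width.
        w = sum(p * fake_quantize(self.conv.weight, b)
                for p, b in zip(probs, self.candidate_bits))
        return F.conv2d(x, w, self.conv.bias, padding=self.conv.padding)

    def expected_bits(self) -> torch.Tensor:
        # Expected bit-width, usable as a differentiable quantization-cost term.
        probs = F.softmax(self.bit_logits, dim=0)
        bits = torch.tensor(self.candidate_bits, dtype=probs.dtype)
        return (probs * bits).sum()


if __name__ == "__main__":
    layer = SoftBitwidthConv(3, 16, 3)
    x = torch.randn(2, 3, 32, 32)
    task_loss = layer(x).pow(2).mean()        # stand-in for the real task loss
    cost_loss = 0.01 * layer.expected_bits()  # penalize expensive bit-widths
    (task_loss + cost_loss).backward()        # end-to-end gradients to bit_logits
```

A full DQMQ-style system would additionally condition the bit-width policy on input data quality and train it with the hybrid RL objective described above; this sketch omits those components for brevity.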
Related papers
- MetaAug: Meta-Data Augmentation for Post-Training Quantization [32.02377559968568]
Post-Training Quantization (PTQ) has received significant attention because it requires only a small set of calibration data to quantize a full-precision model.
We propose a novel meta-learning based approach to enhance the performance of post-training quantization.
arXiv Detail & Related papers (2024-07-20T02:18:51Z)
- Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment [49.36799270585947]
No-reference point cloud quality assessment (NR-PCQA) aims to automatically evaluate the perceptual quality of distorted point clouds without available reference.
We propose a novel contrastive pre-training framework tailored for PCQA (CoPA).
Our method outperforms the state-of-the-art PCQA methods on popular benchmarks.
arXiv Detail & Related papers (2024-03-15T07:16:07Z)
- Mixed-Precision Quantization for Federated Learning on Resource-Constrained Heterogeneous Devices [17.56259695496955]
We present a novel FL algorithm, FedMPQ, which introduces mixed-precision quantization to resource-heterogeneous FL systems.
Specifically, local models, quantized so as to satisfy the bit-width constraints, are trained by optimizing an objective function.
To initialize the next round of local training, the server relies on the information learned in the previous training round to customize bit-width assignments of the models delivered to different clients.
arXiv Detail & Related papers (2023-11-29T22:43:40Z)
- CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification [51.81850995661478]
Mixed-precision quantization has been widely applied to deep neural networks (DNNs).
Previous attempts on bit-level regularization and pruning-based dynamic precision adjustment during training suffer from noisy gradients and unstable convergence.
We propose Continuous Sparsification Quantization (CSQ), a bit-level training method to search for mixed-precision quantization schemes with improved stability.
arXiv Detail & Related papers (2022-12-06T05:44:21Z)
- SDQ: Stochastic Differentiable Quantization with Mixed Precision [46.232003346732064]
We present a novel Stochastic Differentiable Quantization (SDQ) method that can automatically learn the mixed-precision quantization (MPQ) strategy.
After the optimal MPQ strategy is acquired, we train our network with entropy-aware bin regularization and knowledge distillation.
SDQ outperforms all state-of-the-art mixed- or single-precision quantization methods with a lower bit-width.
arXiv Detail & Related papers (2022-06-09T12:38:18Z)
- ClusterQ: Semantic Feature Distribution Alignment for Data-Free Quantization [111.12063632743013]
We propose a new and effective data-free quantization method termed ClusterQ.
To obtain high inter-class separability of semantic features, we cluster and align the feature distribution statistics.
We also incorporate the intra-class variance to solve class-wise mode collapse.
arXiv Detail & Related papers (2022-04-30T06:58:56Z)
- Generalizable Mixed-Precision Quantization via Attribution Rank Preservation [90.26603048354575]
We propose a generalizable mixed-precision quantization (GMPQ) method for efficient inference.
Our method obtains competitive accuracy-complexity trade-off compared with the state-of-the-art mixed-precision networks.
arXiv Detail & Related papers (2021-08-05T16:41:57Z)
- Task-Specific Normalization for Continual Learning of Blind Image Quality Models [105.03239956378465]
We present a simple yet effective continual learning method for blind image quality assessment (BIQA).
The key step in our approach is to freeze all convolution filters of a pre-trained deep neural network (DNN) for an explicit promise of stability.
We assign each new IQA dataset (i.e., task) a prediction head, and load the corresponding normalization parameters to produce a quality score.
The final quality estimate is computed by a weighted summation of predictions from all heads with a lightweight $K$-means gating mechanism.
arXiv Detail & Related papers (2021-07-28T15:21:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.