Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement
Learning
- URL: http://arxiv.org/abs/2302.04453v1
- Date: Thu, 9 Feb 2023 06:14:00 GMT
- Title: Data Quality-aware Mixed-precision Quantization via Hybrid Reinforcement
Learning
- Authors: Yingchun Wang and Jingcai Guo and Song Guo and Weizhan Zhang
- Abstract summary: Mixed-precision quantization mostly predetermines the model bit-width settings before actual training.
We propose a novel Data Quality-aware Mixed-precision Quantization framework, dubbed DQMQ, to dynamically adapt quantization bit-widths to different data qualities.
- Score: 22.31766292657812
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mixed-precision quantization mostly predetermines the model bit-width
settings before actual training due to the non-differentiable bit-width sampling
process, which yields sub-optimal performance. Worse still, the conventional
static quality-consistent training setting, i.e., all data is assumed to be of
the same quality across training and inference, overlooks data quality changes
in real-world applications which may lead to poor robustness of the quantized
models. In this paper, we propose a novel Data Quality-aware Mixed-precision
Quantization framework, dubbed DQMQ, to dynamically adapt quantization
bit-widths to different data qualities. The adaptation is based on a bit-width
decision policy that can be learned jointly with the quantization training.
Concretely, DQMQ is modeled as a hybrid reinforcement learning (RL) task that
combines model-based policy optimization with supervised quantization training.
By relaxing the discrete bit-width sampling to a continuous probability
distribution encoded with a few learnable parameters, DQMQ is
differentiable and can be directly optimized end-to-end with a hybrid
optimization target considering both task performance and quantization
benefits. Trained on mixed-quality image datasets, DQMQ can implicitly select
the most appropriate bit-width for each layer when facing uneven input qualities.
Extensive experiments on various benchmark datasets and networks demonstrate
the superiority of DQMQ against existing fixed/mixed-precision quantization
methods.
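To make the relaxation concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' implementation) of one way to turn a discrete per-layer bit-width choice into a differentiable soft selection and to combine a task loss with a quantization-cost term. The candidate bit-widths, the Gumbel-softmax relaxation, and the uniform fake quantizer are illustrative assumptions rather than details taken from the paper.

```python
# Minimal, hypothetical sketch (not the authors' code): relaxing a discrete
# per-layer bit-width choice into a differentiable soft selection, loosely in
# the spirit of the continuous relaxation described in the abstract. The
# candidate bit-widths, the Gumbel-softmax relaxation, and the uniform fake
# quantizer are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Uniform symmetric fake quantization with a straight-through estimator."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    w_q = torch.round(w / scale).clamp(-qmax, qmax) * scale
    return w + (w_q - w).detach()  # forward uses w_q, gradient flows through w


class SoftBitwidthConv(nn.Module):
    """Conv layer whose effective precision is a learnable soft mixture of bit-widths."""

    def __init__(self, in_ch, out_ch, k, candidate_bits=(2, 4, 8)):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.candidate_bits = candidate_bits
        # A few learnable logits encode the bit-width probability distribution.
        self.bit_logits = nn.Parameter(torch.zeros(len(candidate_bits)))

    def forward(self, x, tau: float = 1.0):
        # Differentiable sample from the relaxed categorical distribution.
        probs = F.gumbel_softmax(self.bit_logits, tau=tau, hard=False)
        # Blend fake-quantized weights instead of hard-picking one bit-width.
        w = sum(p * fake_quantize(self.conv.weight, b)
                for p, b in zip(probs, self.candidate_bits))
        return F.conv2d(x, w, self.conv.bias, padding=self.conv.padding)

    def expected_bits(self) -> torch.Tensor:
        # Expected bit-width, usable as a differentiable quantization-cost term.
        probs = F.softmax(self.bit_logits, dim=0)
        bits = torch.tensor(self.candidate_bits, dtype=probs.dtype)
        return (probs * bits).sum()


if __name__ == "__main__":
    layer = SoftBitwidthConv(3, 16, 3)
    x = torch.randn(2, 3, 32, 32)
    task_loss = layer(x).pow(2).mean()        # stand-in for the real task loss
    cost_loss = 0.01 * layer.expected_bits()  # penalize expensive bit-widths
    (task_loss + cost_loss).backward()        # end-to-end gradients to bit_logits
```

A full DQMQ-style system would additionally condition the bit-width policy on input data quality and train it with the hybrid RL objective described above; this sketch omits those components for brevity.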
Related papers
- MetaAug: Meta-Data Augmentation for Post-Training Quantization [32.02377559968568]
Post-Training Quantization (PTQ) has received significant attention because it requires only a small set of calibration data to quantize a full-precision model.
We propose a novel meta-learning based approach to enhance the performance of post-training quantization.
arXiv Detail & Related papers (2024-07-20T02:18:51Z)
- Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment [49.36799270585947]
No-reference point cloud quality assessment (NR-PCQA) aims to automatically evaluate the perceptual quality of distorted point clouds without available reference.
We propose a novel contrastive pre-training framework tailored for PCQA (CoPA).
Our method outperforms the state-of-the-art PCQA methods on popular benchmarks.
arXiv Detail & Related papers (2024-03-15T07:16:07Z)
- Mixed-Precision Quantization for Federated Learning on Resource-Constrained Heterogeneous Devices [17.56259695496955]
We present a novel FL algorithm, FedMPQ, which introduces mixed-precision quantization to resource-heterogeneous FL systems.
Specifically, local models, quantized so as to satisfy the bit-width constraints, are trained by optimizing an objective function.
To initialize the next round of local training, the server relies on the information learned in the previous training round to customize bit-width assignments of the models delivered to different clients.
arXiv Detail & Related papers (2023-11-29T22:43:40Z)
- CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification [51.81850995661478]
Mixed-precision quantization has been widely applied to deep neural networks (DNNs).
Previous attempts on bit-level regularization and pruning-based dynamic precision adjustment during training suffer from noisy gradients and unstable convergence.
We propose Continuous Sparsification Quantization (CSQ), a bit-level training method to search for mixed-precision quantization schemes with improved stability.
arXiv Detail & Related papers (2022-12-06T05:44:21Z)
- SDQ: Stochastic Differentiable Quantization with Mixed Precision [46.232003346732064]
We present a novel Stochastic Differentiable Quantization (SDQ) method that can automatically learn the mixed-precision quantization (MPQ) strategy.
After the optimal MPQ strategy is acquired, we train our network with entropy-aware bin regularization and knowledge distillation.
SDQ outperforms all state-of-the-art mixed- or single-precision quantization methods with a lower bit-width.
arXiv Detail & Related papers (2022-06-09T12:38:18Z)
- ClusterQ: Semantic Feature Distribution Alignment for Data-Free Quantization [111.12063632743013]
We propose a new and effective data-free quantization method termed ClusterQ.
To obtain high inter-class separability of semantic features, we cluster and align the feature distribution statistics.
We also incorporate the intra-class variance to solve class-wise mode collapse.
arXiv Detail & Related papers (2022-04-30T06:58:56Z)
- Generalizable Mixed-Precision Quantization via Attribution Rank Preservation [90.26603048354575]
We propose a generalizable mixed-precision quantization (GMPQ) method for efficient inference.
Our method obtains competitive accuracy-complexity trade-off compared with the state-of-the-art mixed-precision networks.
arXiv Detail & Related papers (2021-08-05T16:41:57Z)
- Task-Specific Normalization for Continual Learning of Blind Image Quality Models [105.03239956378465]
We present a simple yet effective continual learning method for blind image quality assessment (BIQA).
The key step in our approach is to freeze all convolution filters of a pre-trained deep neural network (DNN) for an explicit promise of stability.
We assign each new IQA dataset (i.e., task) a prediction head, and load the corresponding normalization parameters to produce a quality score.
The final quality estimate is computed by a weighted summation of predictions from all heads with a lightweight $K$-means gating mechanism.
arXiv Detail & Related papers (2021-07-28T15:21:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.