Push Quantization-Aware Training Toward Full Precision Performances via
Consistency Regularization
- URL: http://arxiv.org/abs/2402.13497v1
- Date: Wed, 21 Feb 2024 03:19:48 GMT
- Title: Push Quantization-Aware Training Toward Full Precision Performances via
Consistency Regularization
- Authors: Junbiao Pang, Tianyang Cai, Baochang Zhang, Jiaqi Wu and Ye Tao
- Abstract summary: Quantization-Aware Training (QAT) methods rely heavily on a fully labeled dataset or on knowledge distillation to approach Full Precision (FP) accuracy.
We present a simple, novel, yet powerful method that introduces Consistency Regularization (CR) into QAT.
Our method generalizes well to different network architectures and various QAT methods.
- Score: 23.085230108628707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing Quantization-Aware Training (QAT) methods rely heavily on a
fully labeled dataset or on knowledge distillation to approach Full Precision
(FP) accuracy. However, empirical results show that QAT still lags behind its
FP counterpart. One open question is how to push QAT toward, or even beyond,
FP performance. In this paper, we address this issue from a new perspective:
we inject vicinal data distribution information to improve the generalization
of QAT. We present a simple, novel, yet powerful method that introduces
Consistency Regularization (CR) into QAT. Concretely, CR assumes that
augmented samples should be consistent in the latent feature space. Our method
generalizes well to different network architectures and various QAT methods.
Extensive experiments demonstrate that our approach significantly outperforms
current state-of-the-art QAT methods and even the FP counterparts.
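A minimal, hypothetical PyTorch-style sketch of the idea described in the abstract is given below: a fake-quantized network is trained with the usual task loss plus a consistency term that pulls together the latent features of two augmented views of the same input. The toy backbone, the uniform quantizer with a straight-through estimator, the augment callable and the lambda_cr weight are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch: QAT with a consistency-regularization (CR) term.
# The quantizer, backbone, augmentation and lambda_cr are illustrative
# assumptions, not the paper's released code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class FakeQuantize(torch.autograd.Function):
    """Uniform fake quantization with a straight-through estimator (STE)."""

    @staticmethod
    def forward(ctx, w, num_bits=4):
        qmax = 2 ** (num_bits - 1) - 1
        scale = w.abs().max() / qmax + 1e-8
        return torch.clamp((w / scale).round(), -qmax - 1, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        # STE: pass gradients through the rounding operation unchanged.
        return grad_output, None


class QuantLinear(nn.Linear):
    """Linear layer whose weights are fake-quantized in the forward pass."""

    def forward(self, x):
        return F.linear(x, FakeQuantize.apply(self.weight), self.bias)


class QuantNet(nn.Module):
    """Toy quantized backbone that also exposes its latent features."""

    def __init__(self, in_dim=784, hidden=256, num_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(QuantLinear(in_dim, hidden), nn.ReLU())
        self.head = QuantLinear(hidden, num_classes)

    def forward(self, x):
        z = self.encoder(x)          # latent features
        return self.head(z), z


def cr_qat_step(model, optimizer, x, y, augment, lambda_cr=1.0):
    """One training step: task loss + consistency between two augmented views."""
    logits1, z1 = model(augment(x))
    logits2, z2 = model(augment(x))
    task_loss = F.cross_entropy(logits1, y)
    # CR term: augmented samples should agree in the latent feature space.
    cr_loss = F.mse_loss(z1, z2)
    loss = task_loss + lambda_cr * cr_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice, augment can be any label-preserving transform (for example, augment = lambda x: x + 0.05 * torch.randn_like(x) for flattened inputs), and lambda_cr would need tuning per architecture and bit-width.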
Related papers
- From Reward Shaping to Q-Shaping: Achieving Unbiased Learning with LLM-Guided Knowledge [0.0]
Q-shaping is an alternative to reward shaping for incorporating domain knowledge to accelerate agent training.
We evaluated Q-shaping across 20 different environments using a large language model (LLM) as the provider.
arXiv Detail & Related papers (2024-10-02T12:10:07Z) - Smart Sampling: Self-Attention and Bootstrapping for Improved Ensembled Q-Learning [0.6963971634605796]
We present a novel method aimed at enhancing the sample efficiency of ensemble Q-learning.
Our proposed approach integrates multi-head self-attention into the ensembled Q networks while bootstrapping the state-action pairs ingested by the ensemble.
arXiv Detail & Related papers (2024-05-14T00:57:02Z) - Self-Supervised Quantization-Aware Knowledge Distillation [5.4714555711042]
This paper proposes a novel Self-Supervised Quantization-Aware Knowledge Distillation (SQAKD) framework.
SQAKD unifies the forward and backward dynamics of various quantization functions, making it flexible for incorporating various QAT works.
A comprehensive evaluation shows that SQAKD substantially outperforms the state-of-the-art QAT and KD works for a variety of model architectures.
arXiv Detail & Related papers (2024-03-17T06:20:28Z) - Poster: Self-Supervised Quantization-Aware Knowledge Distillation [6.463799944811755]
Quantization-aware training (QAT) starts with a pre-trained full-precision model and performs quantization during retraining.
Existing QAT works require supervision from the labels and they suffer from accuracy loss due to reduced precision.
This paper proposes a novel Self-Supervised Quantization-Aware Knowledge Distillation framework (SQAKD)
arXiv Detail & Related papers (2023-09-22T23:52:58Z) - Benchmarking the Reliability of Post-training Quantization: a Particular
Focus on Worst-case Performance [53.45700148820669]
Post-training quantization (PTQ) is a popular method for compressing deep neural networks (DNNs) without modifying their original architecture or training procedures.
Despite its effectiveness and convenience, the reliability of PTQ methods in the presence of extreme cases such as distribution shift and data noise remains largely unexplored.
This paper first investigates this problem on various commonly-used PTQ methods.
arXiv Detail & Related papers (2023-03-23T02:55:50Z) - Optimizing Two-way Partial AUC with an End-to-end Framework [154.47590401735323]
Area Under the ROC Curve (AUC) is a crucial metric for machine learning.
Recent work shows that the Two-way Partial AUC (TPAUC) is essentially inconsistent with existing Partial AUC metrics.
We present the first trial in this paper to optimize this new metric.
arXiv Detail & Related papers (2022-06-23T12:21:30Z) - Quantum circuit architecture search on a superconducting processor [56.04169357427682]
Variational quantum algorithms (VQAs) have shown strong evidence of provable computational advantages in diverse fields such as finance, machine learning, and chemistry.
However, the ansatz exploited in modern VQAs is incapable of balancing the tradeoff between expressivity and trainability.
We demonstrate the first proof-of-principle experiment of applying an efficient automatic ansatz design technique to enhance VQAs on an 8-qubit superconducting quantum processor.
arXiv Detail & Related papers (2022-01-04T01:53:42Z) - QAFactEval: Improved QA-Based Factual Consistency Evaluation for
Summarization [116.56171113972944]
We show that carefully choosing the components of a QA-based metric is critical to performance.
Our solution improves upon the best-performing entailment-based metric and achieves state-of-the-art performance.
arXiv Detail & Related papers (2021-12-16T00:38:35Z) - Aggressive Q-Learning with Ensembles: Achieving Both High Sample
Efficiency and High Asymptotic Performance [12.871109549160389]
We propose a novel model-free algorithm, Aggressive Q-Learning with Ensembles (AQE), which improves the sample-efficiency performance of REDQ and the performance of TQC.
AQE is very simple, requiring neither distributional representation of critics nor target randomization.
arXiv Detail & Related papers (2021-11-17T14:48:52Z) - Learning to Perturb Word Embeddings for Out-of-distribution QA [55.103586220757464]
We propose a simple yet effective DA method based on a noise generator, which learns to perturb the word embedding of the input questions and context without changing their semantics.
We validate the performance of QA models trained with our word embedding perturbations on a single source dataset across five different target domains.
Notably, the model trained with ours outperforms the model trained with more than 240K artificially generated QA pairs.
arXiv Detail & Related papers (2021-05-06T14:12:26Z) - Cross Learning in Deep Q-Networks [82.20059754270302]
We propose a novel cross Q-learning algorithm aimed at alleviating the well-known overestimation problem in value-based reinforcement learning methods.
Our algorithm builds on double Q-learning by maintaining a set of parallel models and estimating the Q-value from a randomly selected network (a generic sketch of this idea appears right after this list).
arXiv Detail & Related papers (2020-09-29T04:58:17Z)
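As referenced in the last entry above, the following is a generic, hypothetical sketch of that ensemble idea: several parallel Q-networks are maintained and the bootstrap target is built from one of them chosen at random. It illustrates the general technique rather than that paper's exact algorithm; all names and hyperparameters are assumptions.

```python
# Generic sketch: ensemble of Q-networks with a randomly selected target net.
# Not the cited paper's exact algorithm; batch tensors are assumed preprocessed.
import random
import torch
import torch.nn.functional as F


def ensemble_q_update(q_nets, target_nets, optimizers, batch, gamma=0.99):
    """One update step for an ensemble of Q-networks with a random target net."""
    s, a, r, s_next, done = batch  # done is a float tensor of 0.0 / 1.0
    with torch.no_grad():
        # Pick one target network at random to reduce overestimation bias.
        target_net = random.choice(target_nets)
        next_q = target_net(s_next).max(dim=1).values
        target = r + gamma * (1.0 - done) * next_q
    for q_net, opt in zip(q_nets, optimizers):
        q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
        loss = F.mse_loss(q, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
```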
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.