Poster: Self-Supervised Quantization-Aware Knowledge Distillation
- URL: http://arxiv.org/abs/2309.13220v1
- Date: Fri, 22 Sep 2023 23:52:58 GMT
- Title: Poster: Self-Supervised Quantization-Aware Knowledge Distillation
- Authors: Kaiqi Zhao, Ming Zhao
- Abstract summary: Quantization-aware training (QAT) starts with a pre-trained full-precision model and performs quantization during retraining.
Existing QAT works require supervision from labels and suffer from accuracy loss due to reduced precision.
This paper proposes a novel Self-Supervised Quantization-Aware Knowledge Distillation framework (SQAKD)
- Score: 6.463799944811755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quantization-aware training (QAT) starts with a pre-trained full-precision
model and performs quantization during retraining. However, existing QAT works
require supervision from labels and suffer from accuracy loss due to
reduced precision. To address these limitations, this paper proposes a novel
Self-Supervised Quantization-Aware Knowledge Distillation framework (SQAKD).
SQAKD first unifies the forward and backward dynamics of various quantization
functions and then reframes QAT as a co-optimization problem that
simultaneously minimizes the KL-Loss and the discretization error, in a
self-supervised manner. The evaluation shows that SQAKD significantly improves
the performance of various state-of-the-art QAT works. SQAKD establishes
stronger baselines and does not require extensive labeled training data,
potentially making state-of-the-art QAT research more accessible.
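For intuition, here is a minimal, hypothetical PyTorch-style sketch of the idea described in the abstract: a full-precision teacher supervises a low-precision student through a KL divergence on their predictions, while a discretization term pulls the student's weights toward quantization grid points, with no ground-truth labels involved. The quantizer design, bit-width, loss weighting, and names (STEQuantizer, sqakd_step) are illustrative assumptions, not the authors' implementation.
```python
import torch
import torch.nn.functional as F


class STEQuantizer(torch.autograd.Function):
    """Uniform symmetric quantizer with a straight-through estimator (STE)
    backward pass, standing in for the unified forward/backward dynamics."""

    @staticmethod
    def forward(ctx, x, num_bits):
        qmax = 2 ** (num_bits - 1) - 1               # symmetric signed range
        scale = x.abs().max().clamp(min=1e-8) / qmax
        return torch.round(x / scale).clamp(-qmax, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        # STE: pass the gradient through the rounding step unchanged.
        return grad_output, None


def sqakd_step(teacher, student, images, optimizer, alpha=1.0, temperature=4.0):
    """One label-free training step: the full-precision teacher supervises the
    low-precision student via KL divergence, so no ground-truth labels are used."""
    with torch.no_grad():
        teacher_logits = teacher(images)
    student_logits = student(images)  # assume the student quantizes weights/activations internally

    # KL-loss between softened teacher and student predictions (the self-supervised signal).
    kl_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    # Discretization error: pull full-precision weights toward their quantized values.
    # The quantized tensor is detached and treated as a fixed target (an illustrative choice).
    disc_loss = sum(
        (p - STEQuantizer.apply(p, 4).detach()).pow(2).mean()
        for p in student.parameters()
    )

    loss = kl_loss + alpha * disc_loss  # co-optimize both terms
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```
In this sketch the teacher would be the original full-precision network and the student its quantized copy; calling sqakd_step inside a standard data loop would then perform QAT without ever reading the dataset labels.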
Related papers
- Dual Ensembled Multiagent Q-Learning with Hypernet Regularizer [62.01554688056335]
Overestimation in the multiagent setting has received comparatively little attention.
We propose a novel hypernet regularizer on hypernetwork weights and biases to constrain the optimization of online global Q-network to prevent overestimation accumulation.
arXiv Detail & Related papers (2025-02-04T05:14:58Z) - High-Fidelity Coherent-One-Way QKD Simulation Framework for 6G Networks: Bridging Theory and Reality [105.73011353120471]
Quantum key distribution (QKD) has emerged as a promising solution for guaranteeing information-theoretic security.
Due to the considerably high cost of QKD equipment, the lack of QKD communication system design tools remains a challenge.
This paper introduces a QKD communication system design tool.
arXiv Detail & Related papers (2025-01-21T11:03:59Z) - Boosting CLIP Adaptation for Image Quality Assessment via Meta-Prompt Learning and Gradient Regularization [55.09893295671917]
This paper introduces a novel Gradient-Regulated Meta-Prompt IQA Framework (GRMP-IQA)
The GRMP-IQA comprises two key modules: Meta-Prompt Pre-training Module and Quality-Aware Gradient Regularization.
Experiments on five standard BIQA datasets demonstrate superior performance over state-of-the-art BIQA methods under a limited-data setting.
arXiv Detail & Related papers (2024-09-09T07:26:21Z) - Self-Supervised Quantization-Aware Knowledge Distillation [5.4714555711042]
This paper proposes a novel Self-Supervised Quantization-Aware Knowledge Distillation (SQAKD) framework.
SQAKD unifies the forward and backward dynamics of various quantization functions, making it flexible for incorporating various QAT works.
A comprehensive evaluation shows that SQAKD substantially outperforms the state-of-the-art QAT and KD works for a variety of model architectures.
arXiv Detail & Related papers (2024-03-17T06:20:28Z) - In-Distribution Consistency Regularization Improves the Generalization of Quantization-Aware Training [16.475151881506914]
We propose Consistency Regularization (CR) to improve the generalization ability of Quantization-Aware Training (QAT).
Our approach significantly outperforms current state-of-the-art QAT methods and even the FP counterparts.
arXiv Detail & Related papers (2024-02-21T03:19:48Z) - Challenges for Reinforcement Learning in Quantum Circuit Design [8.894627352356302]
Hybrid quantum machine learning (QML) comprises both the application of quantum computing (QC) to improve machine learning (ML) and the application of ML to improve QC architectures.
We propose qcd-gym, a concrete framework formalized as a Markov decision process, to enable learning policies capable of controlling a universal set of continuously parameterized quantum gates.
arXiv Detail & Related papers (2023-12-18T16:41:30Z) - Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL [86.0987896274354]
We first identify a fundamental pattern, self-excitation, as the primary cause of Q-value estimation divergence in offline RL.
We then propose a novel Self-Excite Eigenvalue Measure (SEEM) metric to measure the evolving properties of the Q-network during training.
For the first time, our theory can reliably decide whether the training will diverge at an early stage.
arXiv Detail & Related papers (2023-10-06T17:57:44Z) - RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models [14.07649230604283]
We propose low-complexity changes to the quantization-aware training (QAT) process to improve model accuracy.
The improved accuracy opens up the possibility of exploiting other benefits of noise-based QAT.
arXiv Detail & Related papers (2023-05-24T19:45:56Z) - SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization [13.075574481614478]
One noted issue of vector-quantized variational autoencoder (VQ-VAE) is that the learned discrete representation uses only a fraction of the full capacity of the codebook.
We propose a new training scheme that extends the standard VAE via novel dequantization and quantization.
Our experiments show that SQ-VAE improves codebook utilization without using common heuristics.
arXiv Detail & Related papers (2022-05-16T09:49:37Z) - ProQA: Structural Prompt-based Pre-training for Unified Question Answering [84.59636806421204]
ProQA is a unified QA paradigm that solves various tasks through a single model.
It concurrently models the knowledge generalization for all QA tasks while keeping the knowledge customization for every specific QA task.
ProQA consistently boosts performance across full-data fine-tuning, few-shot learning, and zero-shot testing scenarios.
arXiv Detail & Related papers (2022-05-09T04:59:26Z) - Quantum circuit architecture search on a superconducting processor [56.04169357427682]
Variational quantum algorithms (VQAs) have shown strong evidence of gaining provable computational advantages in diverse fields such as finance, machine learning, and chemistry.
However, the ansatz exploited in modern VQAs is incapable of balancing the tradeoff between expressivity and trainability.
We demonstrate the first proof-of-principle experiment of applying an efficient automatic ansatz design technique to enhance VQAs on an 8-qubit superconducting quantum processor.
arXiv Detail & Related papers (2022-01-04T01:53:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.