Benchmarking the Reliability of Post-training Quantization: a Particular
Focus on Worst-case Performance
- URL: http://arxiv.org/abs/2303.13003v1
- Date: Thu, 23 Mar 2023 02:55:50 GMT
- Title: Benchmarking the Reliability of Post-training Quantization: a Particular
Focus on Worst-case Performance
- Authors: Zhihang Yuan, Jiawei Liu, Jiaxiang Wu, Dawei Yang, Qiang Wu, Guangyu
Sun, Wenyu Liu, Xinggang Wang, Bingzhe Wu
- Abstract summary: Post-training quantization (PTQ) is a popular method for compressing deep neural networks (DNNs) without modifying their original architecture or training procedures.
Despite its effectiveness and convenience, the reliability of PTQ methods in the presence of extreme cases such as distribution shift and data noise remains largely unexplored.
This paper first investigates this problem on various commonly-used PTQ methods.
- Score: 53.45700148820669
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Post-training quantization (PTQ) is a popular method for compressing deep
neural networks (DNNs) without modifying their original architecture or
training procedures. Despite its effectiveness and convenience, the reliability
of PTQ methods in the presence of extreme cases such as distribution shift
and data noise remains largely unexplored. This paper first investigates this
problem on various commonly-used PTQ methods. We aim to answer several research
questions related to the influence of calibration set distribution variations,
calibration paradigm selection, and data augmentation or sampling strategies on
PTQ reliability. A systematic evaluation process is conducted across a wide
range of tasks and commonly-used PTQ paradigms. The results show that most
existing PTQ methods are not reliable enough in terms of worst-case group
performance, highlighting the need for more robust methods. Our findings
provide insights for developing PTQ methods that can effectively handle
distribution shift scenarios and enable the deployment of quantized DNNs in
real-world applications.
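To make the setup concrete, here is a minimal, self-contained sketch (not the paper's code; the names FakeQuantLinear and worst_group_accuracy, the bit-width, and the toy data are all assumptions) of the two ingredients the abstract refers to: calibrating a simple uniform post-training quantizer on a small calibration set, and reporting worst-case group accuracy rather than only the average.

```python
# Minimal PTQ-reliability sketch: calibrate a fake-quantized layer on a
# calibration set, then evaluate the worst-performing group at test time.
import torch
import torch.nn as nn

class FakeQuantLinear(nn.Module):
    """Linear layer with simulated uniform int8 quantization of weights and inputs."""
    def __init__(self, linear: nn.Linear, n_bits: int = 8):
        super().__init__()
        self.linear = linear
        self.n_bits = n_bits
        self.act_scale = None  # filled in during calibration

    @staticmethod
    def _quantize(x, scale, n_bits):
        qmax = 2 ** (n_bits - 1) - 1
        return torch.clamp(torch.round(x / scale), -qmax, qmax) * scale

    def calibrate(self, calib_inputs):
        # Pick the activation scale from the calibration set's dynamic range;
        # this is exactly where the calibration distribution matters.
        qmax = 2 ** (self.n_bits - 1) - 1
        self.act_scale = calib_inputs.abs().max() / qmax

    def forward(self, x):
        qmax = 2 ** (self.n_bits - 1) - 1
        w = self.linear.weight
        w_q = self._quantize(w, w.abs().max() / qmax, self.n_bits)
        x_q = self._quantize(x, self.act_scale, self.n_bits) if self.act_scale is not None else x
        return nn.functional.linear(x_q, w_q, self.linear.bias)

def worst_group_accuracy(model, x, y, groups):
    """Accuracy of the worst-performing group, the metric the benchmark focuses on."""
    preds = model(x).argmax(dim=-1)
    accs = []
    for g in groups.unique():
        mask = groups == g
        accs.append((preds[mask] == y[mask]).float().mean().item())
    return min(accs)

# Toy usage: calibrate on one (possibly shifted) distribution, evaluate per group.
torch.manual_seed(0)
model = FakeQuantLinear(nn.Linear(16, 4))
calib = torch.randn(256, 16)          # calibration set; its distribution matters
model.calibrate(calib)
x_test = torch.randn(512, 16)
y_test = torch.randint(0, 4, (512,))
groups = torch.randint(0, 3, (512,))  # e.g., subpopulations or corruption types
print("worst-case group accuracy:", worst_group_accuracy(model, x_test, y_test, groups))
```

Varying how the calibration set is drawn relative to the test groups (shifted, noisy, augmented, or re-sampled) is the kind of perturbation whose effect on the worst-case group the benchmark measures.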
Related papers
- Process Reward Model with Q-Value Rankings [18.907163177605607]
Process Reward Modeling (PRM) is critical for complex reasoning and decision-making tasks.
We introduce the Process Q-value Model (PQM), a novel framework that redefines PRM in the context of a Markov Decision Process.
PQM optimizes Q-value rankings based on a novel comparative loss function, enhancing the model's ability to capture the intricate dynamics among sequential decisions.
arXiv Detail & Related papers (2024-10-15T05:10:34Z)
- Attention-aware Post-training Quantization without Backpropagation [11.096116957844014]
Quantization is a promising solution for deploying large-scale language models on resource-constrained devices.
Existing quantization approaches rely on gradient-based optimization.
We propose a novel PTQ algorithm that considers inter-layer dependencies without relying on backpropagation.
arXiv Detail & Related papers (2024-06-19T11:53:21Z)
- Contrastive Pre-Training with Multi-View Fusion for No-Reference Point Cloud Quality Assessment [49.36799270585947]
No-reference point cloud quality assessment (NR-PCQA) aims to automatically evaluate the perceptual quality of distorted point clouds without available reference.
We propose CoPA, a novel contrastive pre-training framework tailored for PCQA.
Our method outperforms the state-of-the-art PCQA methods on popular benchmarks.
arXiv Detail & Related papers (2024-03-15T07:16:07Z)
- On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts.
We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z)
- PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models [52.09865918265002]
We propose a novel "quantize before fine-tuning" framework, PreQuant.
PreQuant is compatible with various quantization strategies, with outlier-aware fine-tuning incorporated to correct the induced quantization error.
We demonstrate the effectiveness of PreQuant on the GLUE benchmark using BERT, RoBERTa, and T5.
arXiv Detail & Related papers (2023-05-30T08:41:33Z)
- Solving Oscillation Problem in Post-Training Quantization Through a Theoretical Perspective [74.48124653728422]
Post-training quantization (PTQ) is widely regarded as one of the most efficient compression methods practically.
We argue that oscillation is an overlooked problem in PTQ methods.
arXiv Detail & Related papers (2023-03-21T14:52:52Z)
- Failure-informed adaptive sampling for PINNs [5.723850818203907]
Physics-informed neural networks (PINNs) have emerged as an effective technique for solving PDEs in a wide range of domains.
Recent research has demonstrated, however, that the performance of PINNs can vary dramatically with different sampling procedures.
We present an adaptive approach termed failure-informed PINNs, which is inspired by the viewpoint of reliability analysis.
arXiv Detail & Related papers (2022-10-01T13:34:41Z)
- Parameter-Parallel Distributed Variational Quantum Algorithm [7.255056332088222]
Variational quantum algorithms (VQAs) have emerged as a promising near-term technique to explore practical quantum advantage on noisy devices.
Here, we propose a parameter-parallel distributed variational quantum algorithm (PPD-VQA) to accelerate the training process by parameter-parallel training with multiple quantum processors.
The achieved results suggest that PPD-VQA could provide a practical solution for coordinating multiple quantum processors to handle large-scale real-world applications.
arXiv Detail & Related papers (2022-07-31T15:09:12Z)
- Cross Learning in Deep Q-Networks [82.20059754270302]
We propose a novel cross Q-learning algorithm aimed at alleviating the well-known overestimation problem in value-based reinforcement learning methods.
Our algorithm builds on double Q-learning, maintaining a set of parallel models and estimating the Q-value with a randomly selected network from that set (a brief sketch follows this list).
arXiv Detail & Related papers (2020-09-29T04:58:17Z)
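As a hedged illustration of the cross Q-learning summary above (my reading of the abstract, not the authors' implementation; every name and hyperparameter here is assumed), the sketch below keeps K parallel Q-networks and computes the bootstrapped target with two randomly selected members, decoupling action selection from value evaluation in the spirit of double Q-learning.

```python
# Sketch of a cross-Q target: one randomly chosen network selects the action,
# another randomly chosen network evaluates it, damping overestimation bias.
import random
import torch
import torch.nn as nn

def make_q_net(obs_dim=4, n_actions=2):
    return nn.Sequential(nn.Linear(obs_dim, 32), nn.ReLU(), nn.Linear(32, n_actions))

K, gamma = 4, 0.99
q_nets = [make_q_net() for _ in range(K)]  # parallel set of Q-networks

def cross_q_target(reward, next_obs, done):
    # Draw two distinct networks: one picks the greedy action, the other scores it.
    selector, evaluator = random.sample(q_nets, 2)
    with torch.no_grad():
        a_star = selector(next_obs).argmax(dim=-1, keepdim=True)
        q_next = evaluator(next_obs).gather(-1, a_star).squeeze(-1)
    return reward + gamma * (1.0 - done) * q_next

# Toy usage with a fake transition batch.
r = torch.zeros(8); s2 = torch.randn(8, 4); d = torch.zeros(8)
print(cross_q_target(r, s2, d).shape)  # torch.Size([8])
```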