Related papers: Assessing the Potential for Catastrophic Failure in Dynamic Post-Training Quantization

URL: http://arxiv.org/abs/2510.02457v1
Date: Thu, 02 Oct 2025 18:13:06 GMT
Title: Assessing the Potential for Catastrophic Failure in Dynamic Post-Training Quantization
Authors: Logan Frank, Paul Ardis
Abstract summary: Post-training quantization (PTQ) has emerged as an effective tool for reducing the computational complexity and memory usage of a neural network. There is the potential for drastic performance reduction depending upon the distribution of inputs experienced in inference.
Score: 3.437656066916039
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Post-training quantization (PTQ) has recently emerged as an effective tool for reducing the computational complexity and memory usage of a neural network by representing its weights and activations with lower precision. While this paradigm has shown great success in lowering compute and storage costs, there is the potential for drastic performance reduction depending upon the distribution of inputs experienced in inference. When considering possible deployment in safety-critical environments, it is important to investigate the extent of potential performance reduction, and what characteristics of input distributions may give rise to this reduction. In this work, we explore the idea of extreme failure stemming from dynamic PTQ and formulate a knowledge distillation and reinforcement learning task to learn a network and bit-width policy pair such that catastrophic failure under quantization is analyzed in terms of worst case potential. Our results confirm the existence of this "detrimental" network-policy pair, with several instances demonstrating performance reductions in the range of 10-65% in accuracy, compared to their "robust" counterparts encountering a <2% decrease. From systematic experimentation and analyses, we also provide an initial exploration into points at highest vulnerability. While our results represent an initial step toward understanding failure cases introduced by PTQ, our findings ultimately emphasize the need for caution in real-world deployment scenarios. We hope this work encourages more rigorous examinations of robustness and a greater emphasis on safety considerations for future works within the broader field of deep learning.

Related papers

Explaining How Quantization Disparately Skews a Model [8.210473195536077]
Post Training Quantization (PTQ) is widely adopted due to its high compression capacity and speed with minimal impact on accuracy. We observed that disparate impacts are exacerbated by quantization, especially for minority groups. We explore how the changes in weights and activations induced by quantization cause cascaded impacts in the network, resulting in logits with lower variance, increased loss, and compromised group accuracies.
arXiv Detail & Related papers (2025-09-08T21:04:16Z)
Adversarial Robustness Overestimation and Instability in TRADES [4.063518154926961]
TRADES sometimes yields disproportionately high PGD validation accuracy compared to the AutoAttack testing accuracy in the multiclass classification task. This discrepancy highlights a significant overestimation of robustness for these instances, potentially linked to gradient masking.
arXiv Detail & Related papers (2024-10-10T07:32:40Z)
Investigating the Impact of Quantization on Adversarial Robustness [22.637585106574722]
Quantization is a technique for reducing the bit-width of deep models to improve their runtime performance and storage efficiency. In real-world scenarios, quantized models are often faced with adversarial attacks which cause the model to make incorrect inferences. We conduct a first-time analysis of the impact of the quantization pipeline components that can incorporate robust optimization.
arXiv Detail & Related papers (2024-04-08T16:20:15Z)
On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics. The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
Understanding and Preventing Capacity Loss in Reinforcement Learning [28.52122927103544]
We identify a mechanism by which non-stationary prediction targets can prevent learning progress in deep RL agents. Capacity loss occurs in a range of RL agents and environments, and is particularly damaging to performance in sparse-reward tasks.
arXiv Detail & Related papers (2022-04-20T15:55:15Z)
Towards Balanced Learning for Instance Recognition [149.76724446376977]
We propose Libra R-CNN, a framework towards balanced learning for instance recognition. It integrates IoU-balanced sampling, balanced feature pyramid, and objective re-weighting, respectively for reducing the imbalance at sample, feature, and objective level.
arXiv Detail & Related papers (2021-08-23T13:40:45Z)
Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge that limits the widespread adoption of deep learning has been its fragility to adversarial attacks. This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network. Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z)
Federated Learning with Unreliable Clients: Performance Analysis and Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients. However, low quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training. We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z)
Performance Analysis of Out-of-Distribution Detection on Various Trained Neural Networks [12.22753756637137]
A common challenge for Deep Neural Networks (DNN) occurs when they are exposed to out-of-distribution samples that are previously unseen. In this paper we analyse two supervisors on two well-known DNNs with varied setups of training. We find that the outlier detection performance improves with the quality of the training procedure.
arXiv Detail & Related papers (2021-03-29T12:52:02Z)
Cross Learning in Deep Q-Networks [82.20059754270302]
We propose a novel cross Q-learning algorithm, aimed at alleviating the well-known overestimation problem in value-based reinforcement learning methods. Our algorithm builds on double Q-learning by maintaining a set of parallel models and estimating the Q-value based on a randomly selected network.
arXiv Detail & Related papers (2020-09-29T04:58:17Z)
Untangling tradeoffs between recurrence and self-attention in neural networks [81.30894993852813]
We present a formal analysis of how self-attention affects gradient propagation in recurrent networks. We prove that it mitigates the problem of vanishing gradients when trying to capture long-term dependencies. We propose a relevancy screening mechanism that allows for a scalable use of sparse self-attention with recurrence.
arXiv Detail & Related papers (2020-06-16T19:24:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.