Can Large Language Models Still Explain Themselves? Investigating the Impact of Quantization on Self-Explanations
- URL: http://arxiv.org/abs/2601.00282v1
- Date: Thu, 01 Jan 2026 09:50:01 GMT
- Title: Can Large Language Models Still Explain Themselves? Investigating the Impact of Quantization on Self-Explanations
- Authors: Qianli Wang, Nils Feldhus, Pepa Atanasova, Fedor Splitt, Simon Ostermann, Sebastian Möller, Vera Schmitt
- Abstract summary: Quantization typically leads to moderate declines in both self-explanation (SE) quality and faithfulness. No quantization technique consistently excels across task accuracy, SE quality, and faithfulness.
- Score: 18.22236071202241
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Quantization is widely used to accelerate inference and streamline the deployment of large language models (LLMs), yet its effects on self-explanations (SEs) remain unexplored. SEs, generated by LLMs to justify their own outputs, require reasoning about the model's own decision-making process, a capability that may be particularly sensitive to quantization. As SEs are increasingly relied upon for transparency in high-stakes applications, understanding whether and to what extent quantization degrades SE quality and faithfulness is critical. To address this gap, we examine two types of SEs, natural language explanations (NLEs) and counterfactual examples, generated by LLMs quantized with three common techniques at distinct bit widths. Our findings indicate that quantization typically leads to moderate declines in both SE quality (up to 4.4%) and faithfulness (up to 2.38%). A user study further demonstrates that quantization diminishes both the coherence and trustworthiness of SEs (up to 8.5%). Compared to smaller models, larger models show limited resilience to quantization in terms of SE quality but better maintain faithfulness. Moreover, no quantization technique consistently excels across task accuracy, SE quality, and faithfulness. Given that quantization's impact varies by context, we recommend validating SE quality for specific use cases, especially for NLEs, which show greater sensitivity. Nonetheless, the relatively minor deterioration in SE quality and faithfulness does not undermine quantization's effectiveness as a model compression technique.
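To make the setup concrete, here is a minimal sketch of loading a 4-bit-quantized LLM and prompting it for a natural language explanation. It assumes the Hugging Face transformers + bitsandbytes stack and a GPU; the checkpoint name, NF4 settings, and prompt are illustrative assumptions, not the paper's exact configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load a causal LM with 4-bit weight quantization (bitsandbytes NF4).
# The checkpoint name and prompt are illustrative, not taken from the paper.
model_name = "meta-llama/Llama-3.1-8B-Instruct"
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=quant_config, device_map="auto"
)

# Ask for a prediction plus a natural language explanation (NLE) of it.
prompt = (
    "Review: 'The plot drags, but the acting is superb.'\n"
    "Classify the sentiment as positive or negative, then explain "
    "in one sentence why you chose that label."
)
inputs = tok(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=80, do_sample=False)
print(tok.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```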
Related papers
- Evaluating the Impact of Post-Training Quantization on Reliable VQA with Multimodal LLMs [12.376901102913417]
We study how Post-Training Quantization (PTQ) compression affects both accuracy and reliability in Visual Question Answering (VQA). We adapt the Selector confidence estimator for quantized multimodal settings and test its robustness across various quantization levels and out-of-distribution (OOD) scenarios.
arXiv Detail & Related papers (2026-02-08T20:06:24Z)
- Through a Compressed Lens: Investigating the Impact of Quantization on LLM Explainability and Interpretability [48.10089747299802]
Quantization methods are widely used to accelerate inference and streamline the deployment of large language models (LLMs). We conduct experiments using three common quantization techniques at distinct bit widths, in conjunction with two explainability methods, counterfactual examples and natural language explanations, as well as two interpretability approaches, knowledge analysis and latent multi-hop reasoning analysis. Our findings reveal that, depending on the configuration, quantization can significantly impact model explainability and interpretability.
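For the counterfactual-example method mentioned above, a common prompting pattern asks the model to minimally edit an input so that its label flips. The template below is a hedged illustration; its wording is an assumption, not the paper's prompt.

```python
# Illustrative counterfactual-explanation prompt (wording is an assumption,
# not the paper's exact template).
def counterfactual_prompt(text: str, label: str, target: str) -> str:
    return (
        f"Text: '{text}'\n"
        f"The text is labeled '{label}'. Rewrite it with as few edits as "
        f"possible so that its label becomes '{target}'."
    )

print(counterfactual_prompt(
    "The plot drags, but the acting is superb.", "positive", "negative"
))
```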
arXiv Detail & Related papers (2025-05-20T06:01:09Z)
- Low-bit Model Quantization for Deep Neural Networks: A Survey [123.89598730307208]
This article surveys the last five years of progress in low-bit quantization of deep neural networks (DNNs). We discuss and compare state-of-the-art quantization methods, classifying them into 8 main categories and 24 sub-categories according to their core techniques. We also shed light on potential research opportunities in the field of model quantization.
arXiv Detail & Related papers (2025-05-08T13:26:19Z)
- Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models [48.98109982725689]
We conduct the first systematic study on quantized reasoning models. Our investigation covers weight, KV cache, and activation quantization using state-of-the-art algorithms at varying bit-widths. We identify model size, model origin, and task difficulty as critical determinants of performance.
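As intuition for what weight quantization at varying bit-widths does, the following sketch applies per-tensor symmetric uniform quantization and reports reconstruction error per bit-width. It is a simplified stand-in, not the state-of-the-art algorithms the paper benchmarks.

```python
import torch

def quantize_weights(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    # Per-tensor symmetric uniform quantization (simplified stand-in).
    qmax = 2 ** (bits - 1) - 1                    # e.g. 7 for 4-bit signed
    scale = w.abs().max().clamp(min=1e-8) / qmax  # per-tensor scale factor
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q * scale                              # dequantize back to float

w = torch.randn(256, 256)
for bits in (8, 4, 2):
    err = (w - quantize_weights(w, bits)).abs().mean().item()
    print(f"{bits}-bit mean absolute error: {err:.4f}")
```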
arXiv Detail & Related papers (2025-04-07T08:22:45Z)
- Quantize What Counts: More for Keys, Less for Values [63.51476878610841]
Large Language Models (LLMs) suffer inference-time memory bottlenecks dominated by the attention Key-Value (KV) cache. This paper proposes two theorems that anchor mixed-precision KV quantization in the intrinsic geometry of Transformer models.
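The title's idea, spending more bits on keys than on values, can be illustrated with per-token fake quantization of KV tensors. The bit allocations and tensor shapes below are illustrative assumptions, not the settings derived from the paper's theorems.

```python
import torch

def fake_quant(x: torch.Tensor, bits: int) -> torch.Tensor:
    # Per-token symmetric fake quantization (illustrative only).
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / qmax
    return torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale

# "More for keys, less for values": spend more bits on K than on V.
k = torch.randn(1, 8, 128, 64)   # (batch, heads, seq_len, head_dim)
v = torch.randn(1, 8, 128, 64)
k_q, v_q = fake_quant(k, bits=4), fake_quant(v, bits=2)
print("key error:", (k - k_q).abs().mean().item())
print("value error:", (v - v_q).abs().mean().item())
```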
arXiv Detail & Related papers (2025-02-20T22:24:27Z)
- Compensate Quantization Errors+: Quantized Models Are Inquisitive Learners [51.32182730502002]
We introduce Singular-value Diagonal Expansion to refine weight distributions and achieve better quantization alignment. Our plug-and-play weight-quantization methods demonstrate substantial performance improvements over state-of-the-art approaches.
arXiv Detail & Related papers (2024-07-22T09:45:16Z)
- When Quantization Affects Confidence of Large Language Models? [4.338589334157708]
We show that GPTQ quantization to 4-bit results in a decrease in confidence regarding true labels, with varying impacts observed across different language models.
We propose an explanation for quantization loss based on confidence levels, indicating that quantization disproportionately affects samples where the full model exhibited low confidence levels in the first place.
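One way to probe this effect is to compare the probability a model assigns to the gold label before and after quantization. The helper below is a minimal sketch under that framing; using the first token of the label as the confidence proxy is a simplifying assumption. `model` and `tok` are assumed to be a Hugging Face causal LM and its tokenizer.

```python
import torch

def true_label_confidence(model, tok, prompt: str, label: str) -> float:
    # Probability the model assigns to the first token of the gold label.
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]   # next-token logits
    probs = torch.softmax(logits, dim=-1)
    label_id = tok(label, add_special_tokens=False).input_ids[0]
    return probs[label_id].item()

# Compare a full-precision checkpoint against its 4-bit counterpart on the
# same prompts and inspect where confidence drops (loading code omitted;
# see the quantized-loading sketch earlier on this page).
```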
arXiv Detail & Related papers (2024-05-01T16:58:28Z)
- RobustMQ: Benchmarking Robustness of Quantized Models [54.15661421492865]
Quantization is an essential technique for deploying deep neural networks (DNNs) on devices with limited resources.
We thoroughly evaluated the robustness of quantized models against various noises (adversarial attacks, natural corruptions, and systematic noises) on ImageNet.
Our research contributes to advancing the robust quantization of models and their deployment in real-world scenarios.
arXiv Detail & Related papers (2023-08-04T14:37:12Z)
- Do Emergent Abilities Exist in Quantized Large Language Models: An Empirical Study [90.34226812493083]
This work aims to investigate the impact of quantization on emergent abilities, which are important characteristics that distinguish LLMs from small language models.
Our empirical experiments show that these emergent abilities still exist in 4-bit quantization models, while 2-bit models encounter severe performance degradation.
To improve the performance of low-bit models, we conduct two special experiments: (1) fine-grained impact analysis that studies which components (or substructures) are more sensitive to quantization, and (2) performance compensation through model fine-tuning.
arXiv Detail & Related papers (2023-07-16T15:11:01Z)
- Zero-shot Adversarial Quantization [11.722728148523366]
We propose a zero-shot adversarial quantization (ZAQ) framework, facilitating effective discrepancy estimation and knowledge transfer.
This is achieved by a novel two-level discrepancy modeling to drive a generator to synthesize informative and diverse data examples.
We conduct extensive experiments on three fundamental vision tasks, demonstrating the superiority of ZAQ over the strong zero-shot baselines.
arXiv Detail & Related papers (2021-03-29T01:33:34Z)