TTAQ: Towards Stable Post-training Quantization in Continuous Domain Adaptation
- URL: http://arxiv.org/abs/2412.09899v1
- Date: Fri, 13 Dec 2024 06:34:59 GMT
- Title: TTAQ: Towards Stable Post-training Quantization in Continuous Domain Adaptation
- Authors: Junrui Xiao, Zhikai Li, Lianwei Yang, Yiduo Mei, Qingyi Gu
- Abstract summary: Post-training quantization (PTQ) reduces excessive hardware cost by quantizing full-precision models into lower bit representations on a tiny calibration set.
Traditional PTQ methods typically fail in dynamic, ever-changing real-world scenarios.
We propose a novel and stable quantization process for test-time adaptation (TTA), dubbed TTAQ, to address the performance degradation of traditional PTQ.
- Score: 3.7024647541541014
- License:
- Abstract: Post-training quantization (PTQ) reduces excessive hardware cost by quantizing full-precision models into lower-bit representations on a tiny calibration set, without retraining. Despite the remarkable progress made through recent efforts, traditional PTQ methods typically fail in dynamic, ever-changing real-world scenarios involving unpredictable data streams and continual domain shifts, which pose greater challenges. In this paper, we propose a novel and stable quantization process for test-time adaptation (TTA), dubbed TTAQ, to address the performance degradation of traditional PTQ in dynamically evolving test domains. To tackle domain shifts in the quantizer, TTAQ proposes Perturbation Error Mitigation (PEM) and Perturbation Consistency Reconstruction (PCR). Specifically, PEM analyzes error propagation and devises a weight regularization scheme to mitigate the impact of input perturbations, while PCR introduces consistency learning to ensure that quantized models provide stable predictions for the same sample. Furthermore, we introduce an Adaptive Balanced Loss (ABL) that adjusts the logits according to class frequency and complexity, which effectively addresses the class imbalance caused by unpredictable data streams during optimization. Extensive experiments on multiple datasets with generic TTA methods show that TTAQ outperforms existing baselines and encouragingly improves the accuracy of low-bit PTQ models in continually changing test domains. For instance, TTAQ decreases the mean error of 2-bit models on the ImageNet-C dataset by an impressive 10.1%.
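To make the abstract's moving parts concrete, here is a minimal PyTorch sketch, not the paper's implementation: a perturbation-consistency term in the spirit of PCR and a class-frequency logit adjustment in the spirit of ABL. The function names, the Gaussian input noise, and the log-frequency form are illustrative assumptions; PEM's weight regularization and the "complexity" component of ABL are omitted.

```python
import torch
import torch.nn.functional as F

def pcr_consistency_loss(model, x, noise_std=0.01):
    """PCR-style term (sketch): the quantized model should give stable
    predictions for the same sample under a small input perturbation."""
    log_p = F.log_softmax(model(x), dim=1)
    log_q = F.log_softmax(model(x + noise_std * torch.randn_like(x)), dim=1)
    # Symmetric KL between the clean and perturbed predictive distributions.
    return 0.5 * (F.kl_div(log_q, log_p.exp(), reduction="batchmean")
                  + F.kl_div(log_p, log_q.exp(), reduction="batchmean"))

def abl_adjusted_logits(logits, class_counts, tau=1.0):
    """ABL-style adjustment (sketch): shift logits by the log of the observed
    class frequency so rare classes in the stream are not drowned out."""
    freq = class_counts.float() / class_counts.sum().clamp(min=1)
    return logits - tau * freq.clamp(min=1e-8).log()
```

In an adaptation loop the two terms would be combined with the usual self-training objective, with `class_counts` maintained as running counts of the (pseudo-)labels observed in the stream.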
Related papers
- RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models [95.32315448601241]
We propose an algorithm named Rotated Straight-Through-Estimator (RoSTE).
RoSTE combines quantization-aware supervised fine-tuning (QA-SFT) with an adaptive rotation strategy to reduce activation outliers.
Our findings reveal that the prediction error is directly proportional to the quantization error of the converged weights, which can be effectively managed through an optimized rotation configuration (the straight-through estimator this builds on is sketched after this entry).
arXiv Detail & Related papers (2025-02-13T06:44:33Z)
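RoSTE's rotation strategy is not reproduced here, but the straight-through estimator named in its title is a standard building block. Below is a minimal PyTorch sketch of STE-based uniform weight quantization; the bit-width, scale rule, and class name are illustrative assumptions.

```python
import torch

class STEQuantize(torch.autograd.Function):
    """Uniform weight quantizer with a straight-through estimator:
    round in the forward pass, pass the gradient through unchanged."""
    @staticmethod
    def forward(ctx, w, n_bits):
        qmax = 2 ** (n_bits - 1) - 1
        scale = w.abs().max().clamp(min=1e-8) / qmax
        return torch.round(w / scale).clamp(-qmax - 1, qmax) * scale

    @staticmethod
    def backward(ctx, grad_output):
        # STE: treat round() as the identity for the backward pass.
        return grad_output, None

w = torch.randn(8, 8, requires_grad=True)
w_q = STEQuantize.apply(w, 4)   # 4-bit fake-quantized weights
w_q.sum().backward()            # gradients still reach w despite the rounding
```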
- Test-Time Model Adaptation with Only Forward Passes [68.11784295706995]
Test-time adaptation has proven effective in adapting a given trained model to unseen test samples with potential distribution shifts.
We propose a test-time Forward-Optimization Adaptation (FOA) method.
FOA runs on a quantized 8-bit ViT, outperforms gradient-based TENT on a full-precision 32-bit ViT, and achieves up to a 24-fold memory reduction on ImageNet-C (a toy forward-only loop is sketched after this entry).
arXiv Detail & Related papers (2024-04-02T05:34:33Z)
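The phrase "only forward passes" is the key idea: the model is adapted without backpropagation. The sketch below substitutes plain random search for whatever derivative-free optimizer the paper actually uses, and assumes a hypothetical model(x, prompt) interface that injects a small learnable prompt; both are illustrative assumptions, not FOA's algorithm.

```python
import torch

@torch.no_grad()  # no gradients anywhere: adaptation uses forward passes only
def forward_only_adapt(model, prompt, x, n_candidates=16, sigma=0.01):
    """Score randomly perturbed prompt candidates by prediction entropy
    and keep the best one. Assumes a hypothetical model(x, prompt) call."""
    def mean_entropy(logits):
        p = logits.softmax(dim=1)
        return -(p * p.clamp(min=1e-8).log()).sum(dim=1).mean()

    best, best_score = prompt, mean_entropy(model(x, prompt))
    for _ in range(n_candidates):
        cand = prompt + sigma * torch.randn_like(prompt)
        score = mean_entropy(model(x, cand))
        if score < best_score:  # lower entropy = more confident predictions
            best, best_score = cand, score
    return best
```

Because nothing is backpropagated, no activations or optimizer state need to be stored, which is where the reported memory savings come from.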
- Uncertainty-Calibrated Test-Time Model Adaptation without Forgetting [55.17761802332469]
Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and test data by adapting a given model w.r.t. any test sample.
Prior methods perform backpropagation for each test sample, resulting in unbearable optimization costs for many applications.
We propose an Efficient Anti-Forgetting Test-Time Adaptation (EATA) method which develops an active sample selection criterion to identify reliable and non-redundant samples.
arXiv Detail & Related papers (2024-03-18T05:49:45Z)
- Resilient Practical Test-Time Adaptation: Soft Batch Normalization Alignment and Entropy-driven Memory Bank [24.096250529224914]
We propose a practical test-time adaptation (ResiTTA) method focused on parameter resilience and data quality.
We use an entropy-driven memory bank that accounts for timeliness, the persistence of over-confident samples, and sample uncertainty, in order to supply high-quality data for adaptation.
We empirically validate ResiTTA across various benchmark datasets, demonstrating state-of-the-art performance.
arXiv Detail & Related papers (2024-01-26T03:24:55Z)
- Persistent Test-time Adaptation in Recurring Testing Scenarios [12.024233973321756]
Current test-time adaptation (TTA) approaches aim to adapt a machine learning model to environments that change continuously.
Yet, it is unclear whether TTA methods can maintain their adaptability over prolonged periods.
We propose persistent TTA (PeTTA) which senses when the model is diverging towards collapse and adjusts the adaptation strategy.
arXiv Detail & Related papers (2023-11-30T02:24:44Z)
- Generalized Robust Test-Time Adaptation in Continuous Dynamic Scenarios [18.527640606971563]
Test-time adaptation (TTA) adapts pre-trained models to test distributions during the inference phase, using only unlabeled test data streams.
We propose a Generalized Robust Test-Time Adaptation (GRoTTA) method to address this difficult setting effectively.
arXiv Detail & Related papers (2023-10-07T07:13:49Z)
- REALM: Robust Entropy Adaptive Loss Minimization for Improved Single-Sample Test-Time Adaptation [5.749155230209001]
Fully-test-time adaptation (F-TTA) can mitigate performance loss due to distribution shifts between train and test data.
We present a general framework for improving robustness of F-TTA to noisy samples, inspired by self-paced learning and robust loss functions.
arXiv Detail & Related papers (2023-09-07T18:44:58Z)
- Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance [53.45700148820669]
Post-training quantization (PTQ) is a popular method for compressing deep neural networks (DNNs) without modifying their original architecture or training procedures.
Despite its effectiveness and convenience, the reliability of PTQ methods in extreme cases such as distribution shift and data noise remains largely unexplored.
This paper first investigates this problem on various commonly-used PTQ methods.
arXiv Detail & Related papers (2023-03-23T02:55:50Z)
- DELTA: degradation-free fully test-time adaptation [59.74287982885375]
We find that two defects are concealed in prevalent adaptation methods such as test-time batch normalization (BN) and self-learning.
First, we reveal that the normalization statistics in test-time BN are determined entirely by the currently received test samples, resulting in inaccurate estimates.
Second, we show that during test-time adaptation, the parameter update is biased towards some dominant classes (the BN issue and a common remedy are sketched after this entry).
arXiv Detail & Related papers (2023-01-30T15:54:00Z)
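The first defect is easy to see in code: test-time BN recomputes normalization statistics from the current batch alone. A common remedy, sketched below in PyTorch, blends the source running statistics with the test-batch statistics; this is an illustrative stand-in, not DELTA's exact formulation.

```python
import torch

def blended_bn(x, bn, alpha=0.2):
    """Normalize x (N, C, H, W) with a mix of the source running statistics
    stored in `bn` (an affine nn.BatchNorm2d) and the current batch statistics.
    alpha=1.0 reproduces pure test-time BN, whose estimates depend entirely
    on the current, possibly small or class-skewed, batch."""
    mu_b = x.mean(dim=(0, 2, 3))
    var_b = x.var(dim=(0, 2, 3), unbiased=False)
    mu = (1 - alpha) * bn.running_mean + alpha * mu_b
    var = (1 - alpha) * bn.running_var + alpha * var_b
    x_hat = (x - mu[None, :, None, None]) / (var[None, :, None, None] + bn.eps).sqrt()
    return bn.weight[None, :, None, None] * x_hat + bn.bias[None, :, None, None]
```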
- Efficient Test-Time Model Adaptation without Forgetting [60.36499845014649]
Test-time adaptation seeks to tackle potential distribution shifts between training and testing data.
We propose an active sample selection criterion to identify reliable and non-redundant samples.
We also introduce a Fisher regularizer to keep important model parameters from changing drastically (both components are sketched after this entry's citation).
arXiv Detail & Related papers (2022-04-06T06:39:40Z)
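This entry and the uncertainty-calibrated one above describe the same recipe: adapt only on reliable, non-redundant samples and anchor important parameters. A hedged PyTorch sketch of the entropy-threshold selection plus a Fisher-style anti-forgetting penalty follows; the threshold rule and weight are illustrative, and the non-redundancy filter is omitted, so this is not the paper's exact procedure.

```python
import math
import torch

def eata_style_step(model, x, optimizer, fisher, anchor,
                    ent_margin=0.4, fisher_weight=2000.0):
    """One adaptation step (sketch): keep only low-entropy (reliable) samples,
    minimize their entropy, and penalize drift of important parameters.
    `fisher` and `anchor` map parameter names to importance and source values."""
    logits = model(x)
    probs = logits.softmax(dim=1)
    entropy = -(probs * probs.clamp(min=1e-8).log()).sum(dim=1)
    mask = entropy < ent_margin * math.log(logits.size(1))
    if not mask.any():
        return  # no reliable samples in this batch; skip the update
    loss = entropy[mask].mean()
    # Fisher-style anti-forgetting penalty toward the source weights.
    for name, p in model.named_parameters():
        if name in fisher:
            loss = loss + fisher_weight * (fisher[name] * (p - anchor[name]) ** 2).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```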
This list is automatically generated from the titles and abstracts of the papers on this site.