GDNSQ: Gradual Differentiable Noise Scale Quantization for Low-bit Neural Networks
- URL: http://arxiv.org/abs/2508.14004v1
- Date: Tue, 19 Aug 2025 17:05:26 GMT
- Title: GDNSQ: Gradual Differentiable Noise Scale Quantization for Low-bit Neural Networks
- Authors: Sergey Salishev, Ian Akhremchik
- Abstract summary: Quantized neural networks can be viewed as a chain of noisy channels, where rounding in each layer reduces capacity as bit-width shrinks. We track capacity dynamics as the average bit-width decreases and identify the resulting quantization bottlenecks by casting fine-tuning as a smooth, constrained optimization problem. Our approach employs a fully differentiable Straight-Through Estimator (STE) with learnable bit-width, noise scale and clamp bounds, and enforces a target bit-width via an exterior-point penalty.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Quantized neural networks can be viewed as a chain of noisy channels, where rounding in each layer reduces capacity as bit-width shrinks; the floating-point (FP) checkpoint sets the maximum input rate. We track capacity dynamics as the average bit-width decreases and identify resulting quantization bottlenecks by casting fine-tuning as a smooth, constrained optimization problem. Our approach employs a fully differentiable Straight-Through Estimator (STE) with learnable bit-width, noise scale and clamp bounds, and enforces a target bit-width via an exterior-point penalty; mild metric smoothing (via distillation) stabilizes training. Despite its simplicity, the method attains competitive accuracy down to the extreme W1A1 setting while retaining the efficiency of STE.
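The abstract describes a fully differentiable STE with learnable bit-width, noise scale and clamp bounds, plus an exterior-point penalty that enforces a target average bit-width. The sketch below shows one plausible way such a quantizer could be wired up in PyTorch; it is not the authors' implementation, and the names and design choices (`NoiseScaleQuantizer`, `bit_width_penalty`, interpreting the noise scale as a blend between the floating-point and quantized tensors) are illustrative assumptions.

```python
# A minimal sketch (not the authors' code) of an STE-style fake quantizer with
# learnable bit-width, clamp bounds and noise scale, plus an exterior-point
# penalty on the average bit-width.  All names here are illustrative.
import torch
import torch.nn as nn


class NoiseScaleQuantizer(nn.Module):
    """Fake-quantizes a tensor; the noise scale gradually interpolates
    between the floating-point input and its quantized version."""

    def __init__(self, init_bits: float = 8.0):
        super().__init__()
        self.bit_width = nn.Parameter(torch.tensor(init_bits))  # continuous, learnable
        self.clamp_lo = nn.Parameter(torch.tensor(-1.0))        # learnable clamp bounds
        self.clamp_hi = nn.Parameter(torch.tensor(1.0))
        self.noise_scale = nn.Parameter(torch.tensor(0.0))      # 0 = FP pass-through, 1 = fully quantized

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        lo, hi = self.clamp_lo, self.clamp_hi
        step = (hi - lo) / (2.0 ** self.bit_width - 1.0)         # quantization step size
        x_c = torch.minimum(torch.maximum(x, lo), hi)            # clamp to [lo, hi]
        x_n = (x_c - lo) / step                                  # map onto the integer grid
        # Straight-through estimator: round in the forward pass, identity gradient
        # in the backward pass; lo/hi/bit_width still receive gradients via `step`.
        x_q = lo + step * (x_n + (torch.round(x_n) - x_n).detach())
        s = self.noise_scale.clamp(0.0, 1.0)
        return (1.0 - s) * x_c + s * x_q                         # gradual FP -> quantized blend


def bit_width_penalty(quantizers, target_bits: float, weight: float) -> torch.Tensor:
    """Exterior-point style penalty: zero while the average bit-width already
    meets the target, quadratic in the amount of violation otherwise."""
    avg_bits = torch.stack([q.bit_width for q in quantizers]).mean()
    violation = torch.relu(avg_bits - target_bits)
    return weight * violation ** 2
```

In training, the penalty would be added to the task (or distillation) loss, e.g. `loss = distill_loss + bit_width_penalty(quantizers, target_bits=1.0, weight=1e-2)`, so that the learnable bit-widths are driven gradually toward the target rather than clipped abruptly.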
Related papers
- Efficient Multi-bit Quantization Network Training via Weight Bias Correction and Bit-wise Coreset Sampling [19.052294458935595]
Multi-bit quantization networks enable flexible deployment of deep neural networks by supporting multiple precision levels within a single model. Existing approaches suffer from significant training overhead as full-dataset updates are repeated for each supported bit-width. We propose two techniques that greatly reduce the training overhead without compromising model utility.
arXiv Detail & Related papers (2025-10-23T15:49:02Z) - SeWA: Selective Weight Average via Probabilistic Masking [51.015724517293236]
We show that only a few points are needed to achieve better and faster convergence. We transform the discrete selection problem into a continuous subset optimization framework. We derive SeWA's stability bounds, which are sharper than existing results under both convex and non-convex settings.
arXiv Detail & Related papers (2025-02-14T12:35:21Z) - Gradient Normalization Provably Benefits Nonconvex SGD under Heavy-Tailed Noise [60.92029979853314]
We investigate the roles of gradient normalization and clipping in ensuring the convergence of Stochastic Gradient Descent (SGD) under heavy-tailed noise.
Our work provides the first theoretical evidence demonstrating the benefits of gradient normalization in SGD under heavy-tailed noise.
We introduce an accelerated SGD variant incorporating gradient normalization and clipping, further enhancing convergence rates under heavy-tailed noise.
arXiv Detail & Related papers (2024-10-21T22:40:42Z) - Robust ultra-shallow shadows [0.251657752676152]
We present a robust shadow estimation protocol for wide classes of low-depth measurement circuits. For weakly-correlated local noise, the measurement channel has an efficient matrix-product representation. We show how to estimate this directly from experimental data using tensor-network tools.
arXiv Detail & Related papers (2024-05-09T18:00:09Z) - Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT).
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
arXiv Detail & Related papers (2023-12-19T06:06:30Z) - Green, Quantized Federated Learning over Wireless Networks: An Energy-Efficient Design [68.86220939532373]
The finite precision level is captured through the use of quantized neural networks (QNNs) that quantize weights and activations in fixed-precision format.
The proposed FL framework can reduce energy consumption until convergence by up to 70% compared to a baseline FL algorithm.
arXiv Detail & Related papers (2022-07-19T16:37:24Z) - LG-LSQ: Learned Gradient Linear Symmetric Quantization [3.6816597150770387]
Deep neural networks with lower-precision weights have advantages in memory cost and accelerator power consumption.
The main challenge associated with the quantization algorithm is maintaining accuracy at low bit-widths.
We propose learned gradient linear symmetric quantization (LG-LSQ) as a method for quantizing weights and activation functions to low bit-widths.
arXiv Detail & Related papers (2022-02-18T03:38:12Z) - Sharpness-aware Quantization for Deep Neural Networks [45.150346855368]
Sharpness-Aware Quantization (SAQ) is a novel method to explore the effect of Sharpness-Aware Minimization (SAM) on model compression.
We show that SAQ improves the generalization performance of the quantized models, yielding state-of-the-art results in uniform quantization.
arXiv Detail & Related papers (2021-11-24T05:16:41Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Network Quantization with Element-wise Gradient Scaling [30.06895253269116]
Network quantization aims at reducing bit-widths of weights and/or activations.
Most methods use the straight-through estimator (STE) to train quantized networks.
We propose element-wise gradient scaling (EWGS) to train quantized networks more effectively than the plain STE.
arXiv Detail & Related papers (2021-04-02T06:34:53Z) - BN-invariant sharpness regularizes the training model to better generalization [72.97766238317081]
We propose a measure of sharpness, BN-Sharpness, which gives consistent value for equivalent networks under BN.
We use the BN-sharpness to regularize the training and design an algorithm to minimize the new regularized objective.
arXiv Detail & Related papers (2021-01-08T10:23:24Z) - Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks [73.29587731448345]
This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations.
First, to obtain low bit-width weights, most existing methods derive the quantized weights by quantizing the full-precision network weights.
Second, to obtain low bit-width activations, existing works consider all channels equally.
arXiv Detail & Related papers (2020-12-26T15:21:18Z) - Bayesian Bits: Unifying Quantization and Pruning [73.27732135853243]
We introduce Bayesian Bits, a practical method for joint mixed precision quantization and pruning through gradient based optimization.
We experimentally validate our proposed method on several benchmark datasets and show that we can learn pruned, mixed precision networks.
arXiv Detail & Related papers (2020-05-14T16:00:34Z) - WaveQ: Gradient-Based Deep Quantization of Neural Networks through Sinusoidal Adaptive Regularization [8.153944203144988]
We propose a novel sinusoidal regularization, called SINAREQ, for deep quantized training.
We show how SINAREQ balances compute efficiency and accuracy, and provides heterogeneous bit-width assignments for quantizing a large variety of deep networks.
arXiv Detail & Related papers (2020-02-29T01:19:55Z)