Related papers: Unified Stochastic Framework for Neural Network Quantization and Pruning

Related papers

Provable Post-Training Quantization: Theoretical Analysis of OPTQ and Qronos [11.469337174377046]
Post-training quantization (PTQ) has become a crucial tool for reducing the memory and compute costs of modern deep neural networks. The OPTQ framework, also known as GPTQ, has emerged as a leading method due to its computational efficiency and strong empirical performance. Despite its widespread adoption, OPTQ lacks rigorous quantitative theoretical guarantees.
arXiv Detail & Related papers (2025-08-06T20:00:40Z)
RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models [53.571195477043496]
We propose an algorithm named Rotated Straight-Through-Estimator (RoSTE). RoSTE combines quantization-aware supervised fine-tuning (QA-SFT) with an adaptive rotation strategy to reduce activation outliers. Our findings reveal that the prediction error is directly proportional to the quantization error of the converged weights, which can be effectively managed through an optimized rotation configuration.
arXiv Detail & Related papers (2025-02-13T06:44:33Z)
Pushing the Limits of Large Language Model Quantization via the Linearity Theorem [71.3332971315821]
We present a "linearity theorem" establishing a direct relationship between the layer-wise $\ell_2$ reconstruction error and the model perplexity increase due to quantization. This insight enables two novel applications: (1) a simple data-free LLM quantization method using Hadamard rotations and MSE-optimal grids, dubbed HIGGS, and (2) an optimal solution to the problem of finding non-uniform per-layer quantization levels.
arXiv Detail & Related papers (2024-11-26T15:35:44Z)
Robust Training of Neural Networks at Arbitrary Precision and Sparsity [15.121043556313689]
We introduce a denoising dequantization transform derived from a principled ridge regression objective. We extend this principle to sparsification by viewing it as a special form of quantization that maps insignificant values to zero. This approach yields state-of-the-art results and provides a theoretically-grounded path to hyper-efficient neural networks.
arXiv Detail & Related papers (2024-09-14T00:57:32Z)
Robust Stochastically-Descending Unrolled Networks [85.6993263983062]
Deep unrolling is an emerging learning-to-optimize method that unrolls a truncated iterative algorithm in the layers of a trainable neural network. We show that convergence guarantees and generalizability of the unrolled networks are still open theoretical problems. We numerically assess unrolled architectures trained under the proposed constraints in two different applications.
arXiv Detail & Related papers (2023-12-25T18:51:23Z)
CBQ: Cross-Block Quantization for Large Language Models [66.82132832702895]
Post-training quantization (PTQ) has played a key role in compressing large language models (LLMs) with ultra-low costs. We propose CBQ, a cross-block reconstruction-based PTQ method for LLMs. CBQ employs a cross-block dependency using a reconstruction scheme, establishing long-range dependencies across multiple blocks to minimize error accumulation.
arXiv Detail & Related papers (2023-12-13T07:56:27Z)
Quantization Aware Factorization for Deep Neural Network Compression [20.04951101799232]
Decomposition of convolutional and fully-connected layers is an effective way to reduce parameters and FLOPs in neural networks. A conventional post-training quantization approach applied to networks with decomposed weights yields a drop in accuracy. This motivated us to develop an algorithm that finds a decomposed approximation directly with quantized factors.
arXiv Detail & Related papers (2023-08-08T21:38:02Z)
Regularized Vector Quantization for Tokenized Image Synthesis [126.96880843754066]
Quantizing images into discrete representations has been a fundamental problem in unified generative modeling. Deterministic quantization suffers from severe codebook collapse and misalignment with the inference stage, while stochastic quantization suffers from low codebook utilization and a perturbed reconstruction objective. This paper presents a regularized vector quantization framework that effectively mitigates the above issues by applying regularization from two perspectives.
arXiv Detail & Related papers (2023-03-11T15:20:54Z)
A simple approach for quantizing neural networks [7.056222499095849]
We propose a new method for quantizing the weights of a fully trained neural network. A simple deterministic pre-processing step allows us to quantize network layers via memoryless scalar quantization. The developed method also readily allows the quantization of deep networks by consecutive application to single layers.
arXiv Detail & Related papers (2022-09-07T22:36:56Z)
Symmetry Regularization and Saturating Nonlinearity for Robust Quantization [5.1779694507922835]
We present three insights to robustify a network against quantization. We propose two novel methods called symmetry regularization (SymReg) and saturating nonlinearity (SatNL). Applying the proposed methods during training can enhance the robustness of arbitrary neural networks against quantization.
arXiv Detail & Related papers (2022-07-31T02:12:28Z)
Post-training Quantization for Neural Networks with Provable Guarantees [9.58246628652846]
We modify a post-training neural-network quantization method, GPFQ, that is based on a greedy path-following mechanism. We prove that for quantizing a single-layer network, the relative square error essentially decays linearly in the number of weights.
arXiv Detail & Related papers (2022-01-26T18:47:38Z)
Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss [61.26793005355441]
Cluster-Promoting Quantization (CPQ) finds the optimal quantization grids for neural networks. DropBits is a new bit-drop technique that revises the standard dropout regularization to randomly drop bits instead of neurons. We experimentally validate our method on various benchmark datasets and network architectures.
arXiv Detail & Related papers (2021-09-05T15:15:07Z)
Sampling asymmetric open quantum systems for artificial neural networks [77.34726150561087]
We present a hybrid sampling strategy which takes asymmetric properties explicitly into account, achieving fast convergence times and high scalability for asymmetric open systems. We highlight the universal applicability of artificial neural networks.
arXiv Detail & Related papers (2020-12-20T18:25:29Z)
QuantNet: Learning to Quantize by Learning within Fully Differentiable Framework [32.465949985191635]
This paper proposes a meta-based quantizer named QuantNet, which utilizes a differentiable sub-network to directly binarize the full-precision weights. Our method not only solves the problem of gradient mismatching, but also reduces the impact of discretization errors, caused by the binarizing operation in the deployment.
arXiv Detail & Related papers (2020-09-10T01:41:05Z)
Gradient $\ell_1$ Regularization for Quantization Robustness [70.39776106458858]
We derive a simple regularization scheme that improves robustness against post-training quantization. By training quantization-ready networks, our approach enables storing a single set of weights that can be quantized on-demand to different bit-widths.
arXiv Detail & Related papers (2020-02-18T12:31:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.