The Lattice Geometry of Neural Network Quantization -- A Short Equivalence Proof of GPTQ and Babai's algorithm
- URL: http://arxiv.org/abs/2508.01077v1
- Date: Fri, 01 Aug 2025 21:20:58 GMT
- Title: The Lattice Geometry of Neural Network Quantization -- A Short Equivalence Proof of GPTQ and Babai's algorithm
- Authors: Johann Birnick
- Abstract summary: We show how data-driven quantization of a linear unit in a neural network corresponds to solving the closest vector problem for a certain lattice generated by input data. We prove that the GPTQ algorithm is equivalent to Babai's well-known nearest-plane algorithm.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We explain how data-driven quantization of a linear unit in a neural network corresponds to solving the closest vector problem for a certain lattice generated by input data. We prove that the GPTQ algorithm is equivalent to Babai's well-known nearest-plane algorithm. We furthermore provide geometric intuition for both algorithms. Lastly, we note the consequences of these results, in particular hinting at the possibility for using lattice basis reduction for better quantization.
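For readers who want the algorithmic content in concrete form, below is a minimal sketch of Babai's nearest-plane algorithm in plain NumPy. The function names and the rows-as-basis convention are our own; in the paper's setting the lattice basis is derived from the input data and the target vector encodes the unquantized weights.

```python
import numpy as np

def gram_schmidt(B):
    """Gram-Schmidt orthogonalization of the rows of B (no normalization)."""
    Bstar = B.astype(float).copy()
    for i in range(B.shape[0]):
        for j in range(i):
            mu = (Bstar[j] @ B[i]) / (Bstar[j] @ Bstar[j])
            Bstar[i] -= mu * Bstar[j]
    return Bstar

def babai_nearest_plane(B, t):
    """Babai's nearest-plane algorithm: integer coefficients c such that
    c @ B is a lattice vector close to the target t."""
    Bstar = gram_schmidt(B)
    c = np.zeros(B.shape[0], dtype=int)
    r = t.astype(float)
    for i in range(B.shape[0] - 1, -1, -1):   # walk the basis back to front
        c[i] = int(np.rint((r @ Bstar[i]) / (Bstar[i] @ Bstar[i])))
        r -= c[i] * B[i]                      # move into the chosen plane
    return c
```

The approximate closest lattice vector is then `babai_nearest_plane(B, t) @ B`.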
Related papers
- The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
GPTQ emerged as one of the standard methods for one-shot post-training quantization at LLM scale. We show that GPTQ is mathematically identical to Babai's nearest plane algorithm for the classical closest vector problem.
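To make the other side of the equivalence concrete, here is a stripped-down sketch of the GPTQ inner loop for a single weight row, following the update rule of the original GPTQ paper (uniform grid, no blocking or Cholesky reformulation; the helper name and the `grid` argument are our own):

```python
import numpy as np

def gptq_round_row(w, Hinv, grid):
    """Quantize one weight row w greedily, compensating the not-yet-quantized
    weights with the inverse Hessian Hinv = (X X^T)^{-1} of the layer inputs."""
    w = w.astype(float)
    q = np.zeros_like(w)
    for i in range(len(w)):
        q[i] = grid[np.abs(grid - w[i]).argmin()]  # round to nearest level
        err = (w[i] - q[i]) / Hinv[i, i]
        w[i + 1:] -= err * Hinv[i, i + 1:]         # push the error forward
    return q
```

The equivalence identifies this error-compensation loop with the nearest-plane recursion sketched under the main abstract.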
arXiv Detail & Related papers (2025-07-24T16:22:18Z)
- Quantum Algorithm for Shortest Vector Problems with Folded Spectrum Method
We propose an alternative encoding and alternative quantum algorithm to solve the shortest vector problem.
Our study shows wide potential applicability of the SVP in quantum computing frameworks.
arXiv Detail & Related papers (2024-08-28T18:01:22Z)
- Neural Lattice Reduction: A Self-Supervised Geometric Deep Learning Approach
We show that it is possible to parametrize the algorithm space for the lattice reduction problem with neural networks and to find an algorithm without supervised data. We design a deep neural model outputting factorized unimodular matrices and train it in a self-supervised manner by penalizing non-orthogonal lattice bases. We show that this approach yields an algorithm with complexity and performance comparable to the Lenstra-Lenstra-Lovász algorithm on a set of benchmarks.
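A penalty of that kind can be phrased through the orthogonality defect of the basis; the following is a minimal sketch of such a loss (our own formulation for illustration, not necessarily the authors' exact objective):

```python
import numpy as np

def log_orthogonality_defect(B):
    """Log orthogonality defect of a full-row-rank basis B: zero iff the rows
    are mutually orthogonal, positive otherwise -- a natural self-supervised
    penalty for a lattice-reduction network."""
    row_norms = np.linalg.norm(B, axis=1)
    _, logdet = np.linalg.slogdet(B @ B.T)
    return np.sum(np.log(row_norms)) - 0.5 * logdet
```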
arXiv Detail & Related papers (2023-11-14T13:54:35Z)
- Quantum Algorithm for Path-Edge Sampling
We present a quantum algorithm for sampling an edge on a path between two nodes s and t in an undirected graph given as an adjacency matrix.
We use this path sampling algorithm as a subroutine for st-path finding and st-cut-set finding algorithms in some specific cases.
arXiv Detail & Related papers (2023-03-06T17:45:12Z)
- Robust affine point matching via quadratic assignment on Grassmannians
Robust Affine Matching with Grassmannians (RoAM) is a new algorithm to perform affine registration of point clouds.
The algorithm is based on minimizing the Frobenius distance between two elements of the Grassmannian.
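Concretely, a point on the Grassmannian can be represented by the orthogonal projector onto the subspace, which makes the Frobenius distance easy to write down (the representation choice and names below are ours):

```python
import numpy as np

def grassmann_frobenius_distance(A, B):
    """Frobenius distance between the subspaces spanned by the columns of A
    and B, each represented by its orthogonal projection matrix."""
    Qa, _ = np.linalg.qr(A)                # orthonormal bases for the
    Qb, _ = np.linalg.qr(B)                # two subspaces
    Pa, Pb = Qa @ Qa.T, Qb @ Qb.T          # points on the Grassmannian
    return np.linalg.norm(Pa - Pb, "fro")
```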
arXiv Detail & Related papers (2023-03-05T15:27:24Z)
- Generalization Bounds for Data-Driven Numerical Linear Algebra
Data-driven algorithms can adapt their internal structure or parameters to inputs from unknown application-specific distributions, by learning from a training sample of inputs.
Several recent works have applied this approach to problems in numerical linear algebra, obtaining significant empirical gains in performance.
In this work we prove generalization bounds for those algorithms, within the PAC-learning framework for data-driven algorithm selection proposed by Gupta and Roughgarden.
arXiv Detail & Related papers (2022-06-16T02:23:45Z)
- Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for the semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers and achieved comparable performance to pure data-driven networks while using far fewer parameters.
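The unfolding idea itself fits in a few lines; the toy sketch below unrolls plain gradient descent for least squares rather than the paper's projection-free SDR solver, with per-layer step sizes standing in for the learnable parameters (all names are ours):

```python
import numpy as np

def unrolled_gradient_layers(A, y, step_sizes):
    """Algorithm unfolding in miniature: each iteration of gradient descent
    for min_x ||A x - y||^2 becomes one 'layer' of the network."""
    x = np.zeros(A.shape[1])
    for eta in step_sizes:                 # one layer per unrolled iteration
        x = x - eta * (A.T @ (A @ x - y))  # a layer = one algorithm step
    return x
```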
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
- A Parallelizable Lattice Rescoring Strategy with Neural Language Models
A posterior-based lattice expansion algorithm is proposed for efficient lattice rescoring with neural language models (LMs) for automatic speech recognition.
Experiments on the Switchboard dataset show that the proposed rescoring strategy obtains comparable recognition performance.
The parallel rescoring method offers more flexibility by simplifying the integration of PyTorch-trained neural LMs for lattice rescoring with Kaldi.
arXiv Detail & Related papers (2021-03-08T21:23:12Z)
- Nearly Linear Row Sampling Algorithm for Quantile Regression
We give a row sampling algorithm for the quantile loss function with sample complexity nearly linear in the dimensionality of the data.
Based upon our row sampling algorithm, we give the fastest known algorithm for quantile regression and a graph sparsification algorithm for balanced directed graphs.
arXiv Detail & Related papers (2020-06-15T13:40:07Z)
- Convex Geometry and Duality of Over-parameterized Neural Networks
We develop a convex analytic approach to analyze finite width two-layer ReLU networks.
We show that an optimal solution to the regularized training problem can be characterized as extreme points of a convex set.
In higher dimensions, we show that the training problem can be cast as a finite dimensional convex problem with infinitely many constraints.
arXiv Detail & Related papers (2020-02-25T23:05:33Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
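For reference, the simplest use of a Gaussian embedding is plain sketch-and-solve, shown below; the paper's algorithm is an iterative first-order method built on such embeddings, not this one-shot variant (names ours):

```python
import numpy as np

def sketched_least_squares(A, b, sketch_rows, seed=0):
    """Sketch-and-solve: compress the tall system (A, b) with a Gaussian
    projection S, then solve the small least-squares problem exactly."""
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((sketch_rows, A.shape[0])) / np.sqrt(sketch_rows)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x
```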
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.