Related papers: COALA: Numerically Stable and Efficient Framework for Context-Aware Low-Rank Approximation

URL: http://arxiv.org/abs/2507.07580v1
Date: Thu, 10 Jul 2025 09:35:22 GMT
Title: COALA: Numerically Stable and Efficient Framework for Context-Aware Low-Rank Approximation
Authors: Uliana Parkina, Maxim Rakhuba
Abstract summary: Context-aware low-rank approximation is a useful tool for compression and fine-tuning of modern large-scale neural networks. Existing methods for neural networks suffer from numerical instabilities due to their reliance on classical formulas involving explicit Gram matrix computation and their subsequent inversion. We propose a novel inversion-free regularized framework that is based entirely on stable decompositions and overcomes the numerical pitfalls of prior art.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent studies suggest that context-aware low-rank approximation is a useful tool for compression and fine-tuning of modern large-scale neural networks. In this type of approximation, a norm is weighted by a matrix of input activations, significantly improving metrics over the unweighted case. Nevertheless, existing methods for neural networks suffer from numerical instabilities due to their reliance on classical formulas involving explicit Gram matrix computation and their subsequent inversion. We demonstrate that this can degrade the approximation quality or cause numerically singular matrices. To address these limitations, we propose a novel inversion-free regularized framework that is based entirely on stable decompositions and overcomes the numerical pitfalls of prior art. Our method can handle possible challenging scenarios: (1) when calibration matrices exceed GPU memory capacity, (2) when input activation matrices are nearly singular, and even (3) when insufficient data prevents unique approximation. For the latter, we prove that our solution converges to a desired approximation and derive explicit error bounds.
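
The contrast the abstract draws between Gram-matrix formulas and stable decompositions can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm and omits the regularization they describe; it only assumes the standard weighted problem of minimizing ||(W - W_r) X||_F over rank-r matrices W_r, where W is a weight matrix and X holds input activations column-wise. The function name `context_aware_low_rank` and the test sizes are hypothetical.

```python
import numpy as np

def context_aware_low_rank(W, X, rank):
    """Rank-`rank` factors (A, B) with A @ B minimizing ||(W - A @ B) X||_F.

    Illustrative sketch: uses a QR factorization and an SVD only, so the
    Gram matrix X X^T is never formed and nothing is inverted.
    """
    # X^T = Q R implies X X^T = R^T R, hence ||M X||_F = ||M R^T||_F for any M.
    R = np.linalg.qr(X.T, mode="r")
    # Leading left singular vectors of W R^T span the optimal column space.
    U, _, _ = np.linalg.svd(W @ R.T, full_matrices=False)
    U_r = U[:, :rank]
    # Projecting W onto that subspace gives the minimizer in factored form.
    return U_r, U_r.T @ W

# Hypothetical sizes: a 20x30 weight matrix and 200 activation samples.
rng = np.random.default_rng(0)
W = rng.standard_normal((20, 30))
X = rng.standard_normal((30, 200))

A, B = context_aware_low_rank(W, X, rank=5)
weighted_err = np.linalg.norm((W - A @ B) @ X)

# Baseline: unweighted truncated SVD of W, evaluated in the same weighted norm.
Uw, sw, Vtw = np.linalg.svd(W, full_matrices=False)
plain_err = np.linalg.norm((W - (Uw[:, :5] * sw[:5]) @ Vtw[:5]) @ X)
```

Because the projection `U_r @ U_r.T @ W` reproduces the best rank-r approximation of `W @ R.T` without solving any linear system, the sketch stays well defined even when X is nearly singular or has fewer columns than rows, although the minimizer is then no longer unique.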

Related papers

Computational Efficient and Minimax Optimal Nonignorable Matrix Completion [2.2306682526405868]
We propose a nuclear norm regularized row- and column-wise matrix U-statistic loss function for the generalized nonignorable missing mechanism. The proposed method achieves computational efficiency comparable to the existing missing-at-random approaches.
arXiv Detail & Related papers (2025-04-05T01:41:53Z)
Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) have proven to have some very appealing theoretical properties. We show that TR and ARC methods can simultaneously allow for inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z)
The Decimation Scheme for Symmetric Matrix Factorization [0.0]
Matrix factorization is an inference problem that has acquired importance due to its vast range of applications. We study this extensive rank problem, extending the alternative 'decimation' procedure that we recently introduced. We introduce a simple algorithm based on a ground state search that implements decimation and performs matrix factorization.
arXiv Detail & Related papers (2023-07-31T10:53:45Z)
Expressing linear equality constraints in feedforward neural networks [9.918927210224165]
We introduce a new saddle-point Lagrangian with predictor auxiliary variables on which constraints are imposed. Elimination of the auxiliary variables leads to a dual minimization problem on the Lagrange multipliers introduced to satisfy the linear constraints. We obtain the surprising interpretation of Lagrange parameters as additional, penultimate layer hidden units with fixed weights stemming from the constraints.
arXiv Detail & Related papers (2022-11-08T17:39:05Z)
Matrix Completion via Non-Convex Relaxation and Adaptive Correlation Learning [90.8576971748142]
We develop a novel surrogate that can be optimized by closed-form solutions. We exploit upperwise correlation for completion, and thus an adaptive correlation learning model.
arXiv Detail & Related papers (2022-03-04T08:50:50Z)
Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms. The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
arXiv Detail & Related papers (2022-02-26T19:10:48Z)
Solving weakly supervised regression problem using low-rank manifold regularization [77.34726150561087]
We solve a weakly supervised regression problem. Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources. In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z)
Low-Rank Matrix Recovery with Scaled Subgradient Methods: Fast and Robust Convergence Without the Condition Number [34.0533596121548]
Many problems in data science can be treated as estimating a low-rank matrix from highly incomplete, sometimes even corrupted, observations. One popular approach is to resort to matrix factorization, where the low-rank matrix factors are optimized via first-order methods over a smooth loss.
arXiv Detail & Related papers (2020-10-26T06:21:14Z)
A Scalable, Adaptive and Sound Nonconvex Regularizer for Low-rank Matrix Completion [60.52730146391456]
We propose a new nonconvex low-rank regularizer called the "nuclear Frobenius norm" regularizer, which is scalable, adaptive and sound. It bypasses the computation of singular values and allows fast optimization. It obtains state-of-the-art recovery performance while being the fastest among existing matrix learning methods.
arXiv Detail & Related papers (2020-08-14T18:47:58Z)
Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model. We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled Gradient Descent [34.0533596121548]
Low-rank matrix estimation is a problem that finds numerous applications in signal processing, machine learning and imaging science. We show that ScaledGD converges at a rate independent of the condition number of the low-rank matrix. Our analysis is also applicable to general loss functions that are restricted strongly convex and smooth over low-rank matrices.
arXiv Detail & Related papers (2020-05-18T17:17:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.