Related papers: LSR-Adapt: Ultra-Efficient Parameter Tuning with Matrix Low Separation Rank Kernel Adaptation

LSR-Adapt: Ultra-Efficient Parameter Tuning with Matrix Low Separation Rank Kernel Adaptation

URL: http://arxiv.org/abs/2502.13568v1
Date: Wed, 19 Feb 2025 09:20:47 GMT
Title: LSR-Adapt: Ultra-Efficient Parameter Tuning with Matrix Low Separation Rank Kernel Adaptation
Authors: Xin Li, Anand Sarwate,
Abstract summary: Low rank based adaptation has become increasingly challenging due to the sheer scale of modern large language models.<n>We propose an effective kernelization to further reduce the number of parameters required for adaptation tasks.<n>We achieve state-of-the-art performance with even higher accuracy with almost half the number of parameters as compared to conventional low rank based methods.
Score: 3.9426000822656224
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Imposing an effective structural assumption on neural network weight matrices has been the major paradigm for designing Parameter-Efficient Fine-Tuning (PEFT) systems for adapting modern large pre-trained models to various downstream tasks. However, low rank based adaptation has become increasingly challenging due to the sheer scale of modern large language models. In this paper, we propose an effective kernelization to further reduce the number of parameters required for adaptation tasks. Specifically, from the classical idea in numerical analysis regarding matrix Low-Separation-Rank (LSR) representations, we develop a kernel using this representation for the low rank adapter matrices of the linear layers from large networks, named the Low Separation Rank Adaptation (LSR-Adapt) kernel. With the ultra-efficient kernel representation of the low rank adapter matrices, we manage to achieve state-of-the-art performance with even higher accuracy with almost half the number of parameters as compared to conventional low rank based methods. This structural assumption also opens the door to further GPU-side optimizations due to the highly parallelizable nature of Kronecker computations.

Related papers

MaRI: Accelerating Ranking Model Inference via Structural Re-parameterization in Large Scale Recommendation System [24.4139949756995]
We propose MaRI, a novel Matrix Re- Parametersized Inference framework.<n>It serves as a complementary approach to existing techniques while accelerating ranking model inference without any accuracy loss.<n>MaRI is motivated by the observation that user-side computation is redundant in feature fusion matrix multiplication.
arXiv Detail & Related papers (2026-02-26T15:19:43Z)
ODELoRA: Training Low-Rank Adaptation by Solving Ordinary Differential Equations [54.886931928255564]
Low-rank adaptation (LoRA) has emerged as a widely adopted parameter-efficient fine-tuning method in deep transfer learning.<n>We propose a novel continuous-time optimization dynamic for LoRA factor matrices in the form of an ordinary differential equation (ODE)<n>We show that ODELoRA achieves stable feature learning, a property that is crucial for training deep neural networks at different scales of problem dimensionality.
arXiv Detail & Related papers (2026-02-07T10:19:36Z)
TeRA: Vector-based Random Tensor Network for High-Rank Adaptation of Large Language Models [6.968486021891596]
We propose a vector-based random underlinetextbfTensor network for high-underlinetextbfRank underlinetextbfAdaptation (TeRA)<n>This is achieved by parameterizing the tensorized weight update matrix as a Tucker-like tensor network (TN)<n>Experiments demonstrate that TeRA matches or even outperforms high-rank adapters, while requiring a trainable parameter count similar to vector-based methods.
arXiv Detail & Related papers (2025-09-03T11:46:24Z)
Singular Value Decomposition on Kronecker Adaptation for Large Language Model [0.8747606955991707]
Large pre-trained Transformer models achieve state-of-the-art results across diverse language and reasoning tasks.<n>Full fine-tuning incurs substantial storage, memory, and computational overhead.<n>We propose SoKA, a novel PEFT strategy that combines Kronecker-product tensor factorization with SVD-driven initialization and dynamic rank selection.
arXiv Detail & Related papers (2025-06-18T08:28:53Z)
OSoRA: Output-Dimension and Singular-Value Initialized Low-Rank Adaptation [9.048461365342204]
We present OSoRA, a novel PEFT method for Large Language Models (LLMs)<n>OSoRA substantially reduces computational resource requirements by minimizing the number of trainable parameters during fine-tuning.<n> Comprehensive evaluations across mathematical reasoning, common sense reasoning, and other benchmarks demonstrate that OSoRA achieves comparable or superior performance to state-of-the-art methods.
arXiv Detail & Related papers (2025-05-20T13:34:06Z)
Nonconvex Linear System Identification with Minimal State Representation [34.203983563629144]
Low-order linear System IDent (SysID) addresses the challenge of estimating the parameters of a linear dynamical system from finite samples of inputs with minimal state observations.
arXiv Detail & Related papers (2025-04-26T04:11:02Z)
Expanding Sparse Tuning for Low Memory Usage [103.43560327427647]
We propose a method named SNELL (Sparse tuning with kerNELized LoRA) for sparse tuning with low memory usage. To achieve low memory usage, SNELL decomposes the tunable matrix for sparsification into two learnable low-rank matrices. A competition-based sparsification mechanism is further proposed to avoid the storage of tunable weight indexes.
arXiv Detail & Related papers (2024-11-04T04:58:20Z)
LoKO: Low-Rank Kalman Optimizer for Online Fine-Tuning of Large Models [21.889177019111525]
Training large models with millions or even billions of parameters from scratch incurs substantial computational costs. We use Low-Rank Adaptation (LoRA) to adapt only a reduced number of parameters to specific tasks with gradient-baseds. We propose robust approaches that work well across a vast range of well-established computer vision and language models.
arXiv Detail & Related papers (2024-10-15T12:41:31Z)
LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Efficient Fine Tuning (PEFT) method.<n>We propose a higher-order Candecomp/Parafac (CP) decomposition, enabling a more compact and flexible representation.<n>Our method can achieve a reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z)
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation [12.07880147193174]
We show that by leveraging the inherent low-dimensional structures of data and compressible dynamics within the model parameters, we can reap the benefits of over parameterization without the computational burdens. We demonstrate the effectiveness of this approach for deep low-rank matrix completion as well as fine-tuning language models.
arXiv Detail & Related papers (2024-06-06T14:29:49Z)
Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models [73.88009808326387]
We propose a novel spectrum-aware adaptation framework for generative models. Our method adjusts both singular values and their basis vectors of pretrained weights. We introduce Spectral Ortho Decomposition Adaptation (SODA), which balances computational efficiency and representation capacity.
arXiv Detail & Related papers (2024-05-31T17:43:35Z)
Exact Gauss-Newton Optimization for Training Deep Neural Networks [0.0]
We present EGN, a second-order optimization algorithm that combines the generalized Gauss-Newton (GN) Hessian approximation with low-rank linear algebra to compute the descent direction. We show how improvements such as line search, adaptive regularization, and momentum can be seamlessly added to EGN to further accelerate the algorithm.
arXiv Detail & Related papers (2024-05-23T10:21:05Z)
Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures. This work investigates the potential of network pruning for super-resolution iteration to take advantage of off-the-shelf network designs and reduce the underlying computational overhead. We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly network at each and tweaking unimportant weights with a small amount proportional to the magnitude scale on-the-fly.
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
Spectral Tensor Train Parameterization of Deep Learning Layers [136.4761580842396]
We study low-rank parameterizations of weight matrices with embedded spectral properties in the Deep Learning context. We show the effects of neural network compression in the classification setting and both compression and improved stability training in the generative adversarial training setting.
arXiv Detail & Related papers (2021-03-07T00:15:44Z)
Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms. We also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
arXiv Detail & Related papers (2020-02-20T15:43:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.