ODELoRA: Training Low-Rank Adaptation by Solving Ordinary Differential Equations
- URL: http://arxiv.org/abs/2602.07479v1
- Date: Sat, 07 Feb 2026 10:19:36 GMT
- Title: ODELoRA: Training Low-Rank Adaptation by Solving Ordinary Differential Equations
- Authors: Yihang Gao, Vincent Y. F. Tan
- Abstract summary: Low-rank adaptation (LoRA) has emerged as a widely adopted parameter-efficient fine-tuning method in deep transfer learning. We propose a novel continuous-time optimization dynamic for LoRA factor matrices in the form of an ordinary differential equation (ODE). We show that ODELoRA achieves stable feature learning, a property that is crucial for training deep neural networks at different scales of problem dimensionality.
- Score: 54.886931928255564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Low-rank adaptation (LoRA) has emerged as a widely adopted parameter-efficient fine-tuning method in deep transfer learning, due to its reduced number of trainable parameters and lower memory requirements enabled by Burer-Monteiro factorization on adaptation matrices. However, classical LoRA training methods treat the low-rank factor matrices individually and optimize them using standard gradient-based algorithms. Such decoupled optimization schemes are theoretically and empirically suboptimal, as they fail to fully exploit the intrinsic structure of the LoRA parameterization. In this work, we propose a novel continuous-time optimization dynamic for LoRA factor matrices in the form of an ordinary differential equation (ODE) that emulates the gradient flow of full fine-tuning on the balanced manifold. We term this approach ODELoRA. To faithfully track the trajectories of ODELoRA, we adopt well-established and theoretically grounded time-discretization schemes, including Euler and Runge-Kutta methods. Our framework provides a unified ODE-based perspective for understanding and designing LoRA training algorithms. We establish linear convergence of the proposed method for certain discretization schemes under strongly convex objectives and mild conditions, and further extend our analysis to the matrix sensing setting. Moreover, we show that ODELoRA achieves stable feature learning, a property that is crucial for training deep neural networks at different scales of problem dimensionality. Empirical results on matrix sensing tasks confirm the derived linear convergence behavior, and experiments on training physics-informed neural networks further demonstrate the superiority of ODELoRA over existing baselines, especially in terms of training stability.
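The abstract does not spell out the paper's exact ODE, so the sketch below only illustrates the core idea it describes: write the training dynamic for the two LoRA factors as a continuous-time vector field and track it with a classical time-discretization scheme (Euler or Runge-Kutta). As a stand-in for the paper's balanced-manifold flow, it uses the plain factor gradient flow on a toy matrix-sensing loss f(B, A) = 0.5 * ||BA - W*||_F^2; the target `W_star`, step size `h`, and step counts are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2
# Toy matrix-sensing target: a random rank-r matrix.
W_star = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))

def loss(B, A):
    return 0.5 * np.linalg.norm(B @ A - W_star) ** 2

def vector_field(B, A):
    """Negative gradients of f(B, A) w.r.t. the two LoRA factors."""
    R = B @ A - W_star
    return -R @ A.T, -B.T @ R

def euler_step(B, A, h):
    # Forward Euler: one vector-field evaluation per step.
    dB, dA = vector_field(B, A)
    return B + h * dB, A + h * dA

def rk4_step(B, A, h):
    # Classical 4th-order Runge-Kutta: four evaluations, higher-order
    # tracking of the continuous trajectory.
    k1B, k1A = vector_field(B, A)
    k2B, k2A = vector_field(B + 0.5 * h * k1B, A + 0.5 * h * k1A)
    k3B, k3A = vector_field(B + 0.5 * h * k2B, A + 0.5 * h * k2A)
    k4B, k4A = vector_field(B + h * k3B, A + h * k3A)
    return (B + (h / 6) * (k1B + 2 * k2B + 2 * k3B + k4B),
            A + (h / 6) * (k1A + 2 * k2A + 2 * k3A + k4A))

B = 0.1 * rng.standard_normal((d, r))
A = 0.1 * rng.standard_normal((r, d))
h, steps = 0.05, 1000
losses = [loss(B, A)]
for _ in range(steps):
    B, A = rk4_step(B, A, h)  # swap in euler_step to compare schemes
    losses.append(loss(B, A))
print(f"initial loss {losses[0]:.3e} -> final loss {losses[-1]:.3e}")
```

Note that forward Euler applied to the factor gradient flow is exactly gradient descent on the factors; the ODE perspective makes it natural to substitute higher-order integrators such as RK4 for the same vector field.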
Related papers
- LoRA-DA: Data-Aware Initialization for Low-Rank Adaptation via Asymptotic Analysis [33.708800231646606]
We establish a theoretical framework for data-aware LoRA. We develop an efficient algorithm, LoRA-DA, which estimates the terms in the optimization problem from a small set of target-domain samples. Additional studies show faster, more stable convergence, robustness across ranks, and only a small overhead for LoRA-DA.
arXiv Detail & Related papers (2025-10-28T15:55:36Z)
- Deep Equilibrium Models for Poisson Imaging Inverse Problems via Mirror Descent [7.248102801711294]
Deep Equilibrium Models (DEQs) are implicit neural networks with fixed points. We introduce a novel DEQ formulation based on Mirror Descent defined in terms of a tailored non-Euclidean geometry. We propose computational strategies that enable both efficient training and fully parameter-free inference.
arXiv Detail & Related papers (2025-07-15T16:33:01Z)
- Automatic Rank Determination for Low-Rank Adaptation via Submodular Function Maximization [56.78271181959529]
SubLoRA is a rank determination method for Low-Rank Adaptation (LoRA) based on submodular function maximization. Our method combines solid theoretical foundations, second-order accuracy, and practical computational efficiency.
arXiv Detail & Related papers (2025-07-02T15:56:40Z)
- DiffoRA: Enabling Parameter-Efficient Fine-Tuning via Differential Module Selection [32.369133126167085]
Low-Rank Adaptation (LoRA) has gained popularity for its streamlined design by incorporating low-rank matrices into existing pre-trained models. We propose DiffoRA, which enables adaptive adoption of the low-rank decomposition matrices.
arXiv Detail & Related papers (2025-02-13T02:41:34Z)
- Training Deep Learning Models with Norm-Constrained LMOs [56.00317694850397]
We propose a new family of algorithms that uses the linear minimization oracle (LMO) to adapt to the geometry of the problem. We demonstrate significant speedups on nanoGPT training using our algorithm, Scion, without any reliance on Adam.
arXiv Detail & Related papers (2025-02-11T13:10:34Z)
- SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning [73.93639228235622]
Continual learning with foundation models has emerged as a promising paradigm to exploit abundant knowledge acquired during pre-training for tackling sequential tasks. Existing prompt-based and Low-Rank Adaptation-based (LoRA-based) methods often require expanding a prompt/LoRA pool or retaining samples of previous tasks. We propose Scalable Decoupled LoRA (SD-LoRA) for class incremental learning, which continually separates the learning of the magnitude and direction of LoRA components without rehearsal.
arXiv Detail & Related papers (2025-01-22T20:00:41Z)
- Conformal Symplectic Optimization for Stable Reinforcement Learning [21.491621524500736]
By utilizing relativistic kinetic energy, RAD incorporates principles from special relativity and limits parameter updates to below a finite speed, effectively mitigating abnormal influences. Notably, RAD achieves up to a 155.1% performance improvement, showcasing its efficacy in training on Atari games.
arXiv Detail & Related papers (2024-12-03T09:07:31Z)
- Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation [58.288682735160585]
Low-Rank Adaptation (LoRA) is a popular technique for finetuning models.
LoRA often underperforms compared to full-parameter fine-tuning.
We present a framework that rigorously analyzes the adaptation rates of LoRA methods.
arXiv Detail & Related papers (2024-10-10T18:51:53Z)
- Large-Scale OD Matrix Estimation with A Deep Learning Method [70.78575952309023]
The proposed method integrates deep learning and numerical optimization algorithms to infer matrix structure and guide numerical optimization.
Tests on a large-scale synthetic dataset demonstrate the strong generalization performance of our method.
arXiv Detail & Related papers (2023-10-09T14:30:06Z)
- Enhancing training of physics-informed neural networks using domain-decomposition based preconditioning strategies [1.8434042562191815]
We introduce additive and multiplicative preconditioning strategies for the widely used L-BFGS.
We demonstrate that both additive and multiplicative preconditioners significantly improve the convergence of the standard L-BFGS.
arXiv Detail & Related papers (2023-06-30T13:35:09Z)
- Pushing the Envelope of Rotation Averaging for Visual SLAM [69.7375052440794]
We propose a novel optimization backbone for visual SLAM systems.
We leverage rotation averaging to improve the accuracy, efficiency, and robustness of conventional monocular SLAM systems.
Our approach runs up to 10x faster with comparable accuracy against the state of the art on public benchmarks.
arXiv Detail & Related papers (2020-11-02T18:02:26Z)
- An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the orthogonal group O(d). This nested system of two flows improves the stability and effectiveness of training and provably solves the vanishing and exploding gradient problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.