FRoD: Full-Rank Efficient Fine-Tuning with Rotational Degrees for Fast Convergence
- URL: http://arxiv.org/abs/2512.23485v1
- Date: Mon, 29 Dec 2025 14:13:45 GMT
- Title: FRoD: Full-Rank Efficient Fine-Tuning with Rotational Degrees for Fast Convergence
- Authors: Guoan Wan, Tianyu Chen, Fangzheng Feng, Haoyi Zhou, Runhua Xu,
- Abstract summary: We propose FRoD, a novel fine-tuning method that combines hierarchical joint decomposition with rotational degrees of freedom. On 20 benchmarks spanning vision, reasoning, and language understanding, FRoD matches full model fine-tuning in accuracy, while using only 1.72% of trainable parameters under identical training budgets.
- Score: 20.138989330054955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parameter-efficient fine-tuning (PEFT) methods have emerged as a practical solution for adapting large foundation models to downstream tasks, reducing computational and memory costs by updating only a small subset of parameters. Among them, approaches like LoRA aim to strike a balance between efficiency and expressiveness, but often suffer from slow convergence and limited adaptation capacity due to their inherent low-rank constraints. This trade-off hampers the ability of PEFT methods to capture complex patterns needed for diverse tasks. To address these challenges, we propose FRoD, a novel fine-tuning method that combines hierarchical joint decomposition with rotational degrees of freedom. By extracting a globally shared basis across layers and injecting sparse, learnable perturbations into scaling factors for flexible full-rank updates, FRoD enhances expressiveness and efficiency, leading to faster and more robust convergence. On 20 benchmarks spanning vision, reasoning, and language understanding, FRoD matches full model fine-tuning in accuracy, while using only 1.72% of trainable parameters under identical training budgets.
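To make the mechanism described in the abstract concrete, below is a minimal PyTorch sketch of a layer that rebuilds its weight from a fixed, shared basis and scaling factors and trains only a sparse perturbation of those scaling factors, so the merged update is not rank-constrained. The class name, the SVD-style factorization, the random sparsity mask, and the hyperparameters are illustrative assumptions rather than the authors' implementation, and the rotational degrees of freedom mentioned in the title are not modeled here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from typing import Optional

class FRoDLinearSketch(nn.Module):
    """Illustrative sketch only: the weight is rebuilt from a shared basis (U, V)
    and scaling factors s, with a sparse, learnable perturbation `delta` of those
    scaling factors. Factorization, mask, and names are assumptions, not the
    paper's code; the rotational component is omitted."""

    def __init__(self, U: torch.Tensor, V: torch.Tensor, s: torch.Tensor,
                 bias: Optional[torch.Tensor] = None, sparsity: float = 0.05):
        super().__init__()
        # Frozen factors, e.g. from a joint decomposition of pretrained weights:
        # U: (out_dim, r), V: (in_dim, r), s: (r,), so that W0 ~= U @ diag(s) @ V.T
        self.register_buffer("U", U)
        self.register_buffer("V", V)
        self.register_buffer("s", s)
        self.register_buffer("bias", bias)
        # Sparse, learnable perturbation of the scaling factors.
        self.delta = nn.Parameter(torch.zeros_like(s))
        self.register_buffer("mask", (torch.rand_like(s) < sparsity).float())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Effective weight W' = U @ diag(s + mask * delta) @ V.T: its rank follows s
        # (not a low-rank constraint), yet only masked entries of delta get gradients.
        scale = self.s + self.mask * self.delta
        weight = (self.U * scale) @ self.V.T
        return F.linear(x, weight, self.bias)

# Hypothetical usage: decompose a pretrained weight, then train only `delta`.
W0 = torch.randn(768, 768)
U, s, Vh = torch.linalg.svd(W0, full_matrices=False)
layer = FRoDLinearSketch(U, Vh.T, s)
out = layer(torch.randn(4, 768))
```

Under these assumptions, the per-layer trainable parameters reduce to the entries of `delta` (only the masked fraction of which receives nonzero gradients), which is consistent with the abstract's emphasis on full-rank updates at a small trainable-parameter budget.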
Related papers
- Train Less, Infer Faster: Efficient Model Finetuning and Compression via Structured Sparsity [21.090365337326414]
Finetuning foundation language models (LMs) with billions of parameters is often impractical due to high computational costs, memory requirements, and the risk of overfitting. We propose a scheme for effective finetuning via sparsification using training gates, which requires minimal trainable parameters. Empirical results show it outperforms recent finetuning baselines in efficiency and performance.
arXiv Detail & Related papers (2026-02-09T20:20:29Z) - High-Rank Structured Modulation for Parameter-Efficient Fine-Tuning [57.85676271833619]
Low-rank Adaptation (LoRA) uses a low-rank update method to simulate full parameter fine-tuning. We present SMoA, a high-rank Structured MOdulation Adapter that uses fewer trainable parameters while maintaining a higher rank.
arXiv Detail & Related papers (2026-01-12T13:06:17Z) - SDAR: A Synergistic Diffusion-AutoRegression Paradigm for Scalable Sequence Generation [62.14510717860079]
We propose a Synergistic Diffusion-Autoregression paradigm that unifies the training efficiency of autoregressive models with the parallel inference capability of diffusion. SDAR performs a lightweight paradigm conversion that transforms a well-trained autoregressive (AR) model into a blockwise diffusion model through brief, data-efficient adaptation. Building on this insight, SDAR achieves efficient AR-to-diffusion conversion with minimal cost, preserving AR-level performance while enabling parallel generation.
arXiv Detail & Related papers (2025-10-07T17:29:28Z) - Adaptive Deadline and Batch Layered Synchronized Federated Learning [66.93447103966439]
Federated learning (FL) enables collaborative model training across distributed edge devices while preserving data privacy, and typically operates in a round-based synchronous manner. We propose ADEL-FL, a novel framework that jointly optimizes per-round deadlines and user-specific batch sizes for layer-wise aggregation.
arXiv Detail & Related papers (2025-05-29T19:59:18Z) - Communication-Efficient Wireless Federated Fine-Tuning for Large-Scale AI Models [13.742950928229078]
Low-Rank Adaptation (LoRA) addresses these issues by training compact, low-rank matrices instead of fully fine-tuning large models. This paper introduces a wireless federated LoRA fine-tuning framework that optimizes both learning performance and communication efficiency.
arXiv Detail & Related papers (2025-05-01T06:15:38Z) - DeLoRA: Decoupling Angles and Strength in Low-rank Adaptation [44.99833362998488]
Decoupled Low-rank Adaptation (DeLoRA) is a novel finetuning method that normalizes and scales learnable low-rank matrices. We show that DeLoRA matches or surpasses the performance of competing PEFT methods, while exhibiting stronger robustness.
arXiv Detail & Related papers (2025-03-23T22:00:56Z) - Bilevel ZOFO: Bridging Parameter-Efficient and Zeroth-Order Techniques for Efficient LLM Fine-Tuning and Meta-Training [44.89297451402362]
Fine-tuning pre-trained Large Language Models (LLMs) for downstream tasks presents significant computational challenges. We propose Bilevel-ZOFO, a bilevel optimization method that couples fast, local FO-PEFT adaptation at the inner level with stable, memory-efficient ZO updates of the full backbone at the outer level. We show that Bilevel-ZOFO significantly outperforms existing ZO and FO-PEFT methods, achieving 2-4 times faster training while maintaining similar memory efficiency.
arXiv Detail & Related papers (2025-02-05T20:47:44Z) - Transformed Low-rank Adaptation via Tensor Decomposition and Its Applications to Text-to-image Models [32.68721299475496]
Low-Rank Adaptation (LoRA) and its variants have gained significant attention due to their effectiveness. We propose a new PEFT method that combines two classes of adaptations, namely, transform and residual adaptations. Experiments are conducted on fine-tuning Stable Diffusion models in subject-driven and controllable generation.
arXiv Detail & Related papers (2025-01-15T11:10:37Z) - ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by Kronecker product to Aggregate Low Rank Experts. Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z) - Balancing LoRA Performance and Efficiency with Simple Shard Sharing [8.827921242078883]
Optimal Shard Sharing Integration in LoRA (Fossils) is a novel PEFT approach that addresses this trade-off through a simple shard-sharing mechanism. Fossils significantly outperforms standard LoRA and its prominent variants in both model performance metrics and computational efficiency.
arXiv Detail & Related papers (2024-09-19T10:26:42Z) - Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models [54.02863371927658]
Large Language Models (LLMs) have become indispensable in numerous real-world applications. Ferret is the first first-order method with shared randomness to enable scalable full-parameter tuning of LLMs. Ferret achieves high computational efficiency, reduced communication overhead, and fast convergence.
arXiv Detail & Related papers (2024-09-10T07:28:13Z) - SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, using a minimal number of late pre-trained layers alleviates the peak memory demand.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)