FLoRG: Federated Fine-tuning with Low-rank Gram Matrices and Procrustes Alignment
- URL: http://arxiv.org/abs/2602.17095v1
- Date: Thu, 19 Feb 2026 05:35:23 GMT
- Title: FLoRG: Federated Fine-tuning with Low-rank Gram Matrices and Procrustes Alignment
- Authors: Chuiyang Meng, Ming Tang, Vincent W. S. Wong
- Abstract summary: We propose FLoRG, a federated fine-tuning framework which employs a single low-rank matrix for fine-tuning. We show that FLoRG outperforms five state-of-the-art baseline schemes in downstream task accuracy and can reduce the communication overhead by up to 2041$\times$.
- Score: 19.973768722251393
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Parameter-efficient fine-tuning techniques such as low-rank adaptation (LoRA) enable large language models (LLMs) to adapt to downstream tasks efficiently. Federated learning (FL) further facilitates this process by enabling collaborative fine-tuning across distributed clients without sharing private data. However, the use of two separate low-rank matrices in LoRA for federated fine-tuning introduces two types of challenges. The first challenge arises from the error induced by separately aggregating those two low-rank matrices. The second challenge occurs even when the product of the two low-rank matrices is aggregated: the server needs to recover the factors via matrix decomposition, which is non-unique and can introduce decomposition drift. To tackle the aforementioned challenges, we propose FLoRG, a federated fine-tuning framework which employs a single low-rank matrix for fine-tuning and aggregates its Gram matrix (i.e., the matrix of inner products of its column vectors), eliminating the aggregation error while also reducing the communication overhead. FLoRG minimizes the decomposition drift by introducing a Procrustes alignment approach which aligns the decomposed matrix between consecutive fine-tuning rounds for consistent updates. We theoretically analyze the convergence of FLoRG and prove that adopting the Procrustes alignment results in a tighter convergence bound. Experimental results across multiple LLM fine-tuning benchmarks demonstrate that FLoRG outperforms five state-of-the-art baseline schemes in downstream task accuracy and can reduce the communication overhead by up to 2041$\times$.
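The two building blocks named in the abstract (Gram-matrix aggregation and orthogonal Procrustes alignment) can be sketched in a few lines of NumPy. This is a minimal illustration of the generic linear algebra, not the paper's actual protocol: the client matrix `B`, the rank `r`, and the eigendecomposition-based factoring step are assumptions made for the example. Note how any orthogonal rotation of a factor leaves the Gram matrix unchanged, which is exactly the non-uniqueness ("decomposition drift") that Procrustes alignment suppresses.

```python
import numpy as np

def aggregate_grams(client_mats):
    """Average the clients' Gram matrices G_i = B_i^T B_i.
    For a d x r low-rank matrix B_i, its Gram matrix is only r x r,
    so shipping G_i instead of B_i shrinks traffic from d*r to r*r."""
    return np.mean([B.T @ B for B in client_mats], axis=0)

def factor_gram(G):
    """Recover an r x r factor R with R^T R = G via eigendecomposition.
    Any orthogonal Q yields another valid factor: (Q R)^T (Q R) = G,
    so the decomposition is non-unique."""
    w, V = np.linalg.eigh(G)
    w = np.clip(w, 0.0, None)           # guard against tiny negative eigenvalues
    return np.diag(np.sqrt(w)) @ V.T    # R satisfies R.T @ R == G

def procrustes_align(R_new, R_prev):
    """Orthogonal Procrustes: pick the orthogonal Q minimizing
    ||Q R_new - R_prev||_F, via the SVD of R_prev R_new^T.
    Rotating by Q keeps the Gram matrix of R_new intact."""
    U, _, Vt = np.linalg.svd(R_prev @ R_new.T)
    return (U @ Vt) @ R_new
```

Because the alignment is an orthogonal rotation, the aligned factor still reproduces the aggregated Gram matrix exactly while staying as close as possible to the previous round's factor, which is the consistency property the abstract attributes to the Procrustes step.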
Related papers
- PT$^2$-LLM: Post-Training Ternarization for Large Language Models [52.4629647715623]
Large Language Models (LLMs) have shown impressive capabilities across diverse tasks, but their large memory and compute demands hinder deployment. We propose PT$^2$-LLM, a post-training ternarization framework tailored for LLMs. At its core is an Asymmetric Ternary Quantizer equipped with a two-stage refinement pipeline.
arXiv Detail & Related papers (2025-09-27T03:01:48Z) - Row-Column Hybrid Grouping for Fault-Resilient Multi-Bit Weight Representation on IMC Arrays [8.430588029181136]
This paper addresses the computational unreliability caused by stuck-at faults (SAFs) and the high compilation overhead of fault-mitigation algorithms such as Fault-Free (FF). We first propose a novel multi-bit weight representation technique, termed row-column hybrid grouping, which generalizes conventional column grouping by introducing redundancy across both rows and columns. Second, we design a compiler that reformulates the fault-aware weight decomposition problem as an integer linear programming (ILP) task, enabling fast and scalable compilation through off-the-shelf solvers.
arXiv Detail & Related papers (2025-08-21T16:05:44Z) - QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation [52.024845354511555]
We propose QR-LoRA, a novel fine-tuning framework leveraging QR decomposition for structured parameter updates. Our key insight is that the Q matrix naturally minimizes interference between different visual features. Experiments demonstrate that QR-LoRA achieves superior disentanglement in content-style fusion tasks.
arXiv Detail & Related papers (2025-07-07T01:31:01Z) - Automatic Rank Determination for Low-Rank Adaptation via Submodular Function Maximization [56.78271181959529]
SubLoRA is a rank determination method for Low-Rank Adaptation (LoRA) based on submodular function maximization. Our method combines solid theoretical foundations, second-order accuracy, and practical computational efficiency.
arXiv Detail & Related papers (2025-07-02T15:56:40Z) - BOLT: Block-Orthonormal Lanczos for Trace estimation of matrix functions [2.4578723416255754]
In many large-scale applications, the matrices involved are too large to store or access in full, making a single mat-vec product infeasible. We introduce Subblock SLQ, a variant of BOLT that operates only on small principal submatrices. We provide theoretical guarantees and demonstrate strong empirical performance across a range of high-dimensional settings.
arXiv Detail & Related papers (2025-05-18T08:04:05Z) - LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular parameter-efficient fine-tuning (PEFT) method. We propose a higher-order Candecomp/Parafac (CP) decomposition, enabling a more compact and flexible representation. Our method can achieve a reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z) - CURLoRA: Stable LLM Continual Fine-Tuning and Catastrophic Forgetting Mitigation [0.0]
CURLoRA is a novel approach to fine-tuning large language models.
It mitigates catastrophic forgetting and maintains model stability and performance across tasks while significantly reducing the number of trainable parameters.
arXiv Detail & Related papers (2024-08-26T18:42:59Z) - Generalized Low-Rank Matrix Completion Model with Overlapping Group Error Representation [3.457484690890009]
The low-rank matrix completion (LRMC) technique has achieved remarkable results in low-level vision tasks.
LRMC rests on the underlying assumption that real-world matrix data is low-rank.
However, real matrix data often does not satisfy the strict low-rank property, which presents serious challenges for the above-mentioned matrix recovery methods.
arXiv Detail & Related papers (2024-07-11T14:01:57Z) - Mode-wise Principal Subspace Pursuit and Matrix Spiked Covariance Model [13.082805815235975]
We introduce a novel framework called Mode-wise Principal Subspace Pursuit (MOP-UP) to extract hidden variations in both the row and column dimensions for matrix data.
The effectiveness and practical merits of the proposed framework are demonstrated through experiments on both simulated and real datasets.
arXiv Detail & Related papers (2023-07-02T13:59:47Z) - Orthogonal Nonnegative Matrix Factorization with Sparsity Constraints [0.0]
This article presents a novel approach to solving the sparsity-constrained Orthogonal Nonnegative Matrix Factorization (SCONMF) problem. By reformulating SCONMF as a capacity-constrained facility-location problem, the proposed method naturally integrates non-negativity, orthogonality, and sparsity constraints. Specifically, our approach integrates a control-barrier function (CBF) based framework used for dynamic optimal control design problems with a maximum-entropy-principle-based framework used for facility location problems to enforce these constraints while ensuring robust factorization.
arXiv Detail & Related papers (2022-10-06T04:30:59Z) - Semi-Supervised Subspace Clustering via Tensor Low-Rank Representation [64.49871502193477]
We propose a novel semi-supervised subspace clustering method, which is able to simultaneously augment the initial supervisory information and construct a discriminative affinity matrix.
Comprehensive experimental results on six commonly-used benchmark datasets demonstrate the superiority of our method over state-of-the-art methods.
arXiv Detail & Related papers (2022-05-21T01:47:17Z) - Solving Weakly Supervised Regression Problem Using Low-Rank Manifold Regularization [77.34726150561087]
We solve a weakly supervised regression problem.
By "weakly" we mean that the labels are known for some training points, unknown for others, and uncertain for the rest due to the presence of random noise or other reasons such as a lack of resources.
In the numerical section, we apply the suggested method to artificial and real datasets using Monte Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z) - Multi-Objective Matrix Normalization for Fine-grained Visual Recognition [153.49014114484424]
Bilinear pooling achieves great success in fine-grained visual recognition (FGVC).
Recent methods have shown that the matrix power normalization can stabilize the second-order information in bilinear features.
We propose an efficient Multi-Objective Matrix Normalization (MOMN) method that can normalize a bilinear representation with respect to multiple objectives simultaneously.
arXiv Detail & Related papers (2020-03-30T08:40:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.