Fugu-MT Paper Translation (Abstract): BoRA: Towards More Expressive Low-Rank Adaptation with Block Diversity

Paper abstract: BoRA: Towards More Expressive Low-Rank Adaptation with Block Diversity

arxiv url: http://arxiv.org/abs/2508.06953v1
Date: Sat, 09 Aug 2025 11:58:39 GMT
Status: Translation complete
Last updated in system: 2025-08-12 21:23:28.626112
Title: BoRA: Towards More Expressive Low-Rank Adaptation with Block Diversity
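Title (reference translation): BoRA: Aiming for More Expressive Low-Rank Adaptation with Block Diversity
Authors: Shiwei Li, Xiandi Luo, Haozhao Wang, Xing Tang, Ziqiang Cui, Dugang Liu, Yuhua Li, Xiuqiang He, Ruixuan Li
Abstract summary: Low-rank adaptation (LoRA) is a parameter-efficient fine-tuning (PEFT) method widely used in large language models. This paper proposes Block Diversified Low-Rank Adaptation (BoRA), which improves the rank of LoRA weights with a small number of additional parameters.
Reference score (independently computed attention score): 23.25105718896569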
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Low-rank adaptation (LoRA) is a parameter-efficient fine-tuning (PEFT) method widely used in large language models (LLMs). It approximates the update of a pretrained weight matrix $W\in\mathbb{R}^{m\times n}$ by the product of two low-rank matrices, $BA$, where $A \in\mathbb{R}^{r\times n}$ and $B\in\mathbb{R}^{m\times r} (r\ll\min\{m,n\})$. Increasing the dimension $r$ can raise the rank of LoRA weights (i.e., $BA$), which typically improves fine-tuning performance but also significantly increases the number of trainable parameters. In this paper, we propose Block Diversified Low-Rank Adaptation (BoRA), which improves the rank of LoRA weights with a small number of additional parameters. Specifically, BoRA treats the product $BA$ as a block matrix multiplication, where $A$ and $B$ are partitioned into $b$ blocks along the columns and rows, respectively (i.e., $A=[A_1,\dots,A_b]$ and $B=[B_1,\dots,B_b]^\top$). Consequently, the product $BA$ becomes the concatenation of the block products $B_iA_j$ for $i,j\in[b]$. To enhance the diversity of different block products, BoRA introduces a unique diagonal matrix $\Sigma_{i,j} \in \mathbb{R}^{r\times r}$ for each block multiplication, resulting in $B_i \Sigma_{i,j} A_j$. By leveraging these block-wise diagonal matrices, BoRA increases the rank of LoRA weights by a factor of $b$ while only requiring $b^2r$ additional parameters. Extensive experiments across multiple datasets and models demonstrate the superiority of BoRA, and ablation studies further validate its scalability.
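As a worked illustration of the block view described in the abstract, the case $b=2$ can be written out explicitly. This is only a sketch: the symbol $\Delta W_{\text{BoRA}}$, the assumption that $b$ divides both $m$ and $n$, and the concrete sizes in the parameter count are illustrative choices, not taken from the paper.

```latex
% b = 2: A is split along its columns, B along its rows.
% A_j \in \mathbb{R}^{r \times n/2}, \quad B_i \in \mathbb{R}^{m/2 \times r}
BA =
\begin{bmatrix} B_1 \\ B_2 \end{bmatrix}
\begin{bmatrix} A_1 & A_2 \end{bmatrix}
=
\begin{bmatrix}
  B_1 A_1 & B_1 A_2 \\
  B_2 A_1 & B_2 A_2
\end{bmatrix},
\qquad
\Delta W_{\text{BoRA}} =
\begin{bmatrix}
  B_1 \Sigma_{1,1} A_1 & B_1 \Sigma_{1,2} A_2 \\
  B_2 \Sigma_{2,1} A_1 & B_2 \Sigma_{2,2} A_2
\end{bmatrix}.

% Parameter count: LoRA trains r(m+n) entries; each diagonal
% \Sigma_{i,j} adds r entries, and there are b^2 of them.
\underbrace{r(m+n)}_{\text{LoRA}} + \underbrace{b^2 r}_{\text{extra}}

% Hypothetical sizes m = n = 4096, r = 8, b = 4:
% r(m+n) = 8 \cdot 8192 = 65536, \qquad b^2 r = 16 \cdot 8 = 128.
```

With these hypothetical sizes the diagonal matrices add 128 parameters on top of 65,536, roughly 0.2% more, while the abstract states that the attainable rank grows by a factor of $b$.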
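Abstract (reference translation): Low-rank adaptation (LoRA) is a parameter-efficient fine-tuning (PEFT) method widely used in large language models (LLMs). It approximates the update of a pretrained weight matrix $W\in\mathbb{R}^{m\times n}$ by the product $BA$ of two low-rank matrices, where $A \in\mathbb{R}^{r\times n}$ and $B\in\mathbb{R}^{m\times r} (r\ll\min\{m,n\})$. Increasing the dimension $r$ raises the rank of the LoRA weights (i.e., $BA$), which usually improves fine-tuning performance but also substantially increases the number of trainable parameters. This paper proposes Block Diversified Low-Rank Adaptation (BoRA), which improves the rank of the LoRA weights with only a few additional parameters. Specifically, BoRA treats the product $BA$ as a block matrix multiplication, in which $A$ and $B$ are partitioned into $b$ blocks along the columns and rows, respectively (i.e., $A=[A_1,\dots,A_b]$ and $B=[B_1,\dots,B_b]^\top$). The product $BA$ is therefore the concatenation of the block products $B_iA_j$ for $i,j\in[b]$. To increase the diversity of the block products, BoRA introduces a distinct diagonal matrix $\Sigma_{i,j} \in \mathbb{R}^{r\times r}$ for each block multiplication, yielding $B_i \Sigma_{i,j} A_j$. Using these block-wise diagonal matrices, BoRA increases the rank of the LoRA weights by a factor of $b$ while requiring only $b^2r$ additional parameters. Extensive experiments across multiple datasets and models demonstrate BoRA's superiority, and ablation studies further validate its scalability.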
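The construction in the abstract can be checked numerically on a toy example. The following is a minimal NumPy sketch, not the authors' implementation: the function name `bora_delta`, the storage of each diagonal $\Sigma_{i,j}$ as a length-$r$ vector in an array of shape `(b, b, r)`, the requirement that $b$ divides $m$ and $n$, and the tiny random sizes are all assumptions made for illustration.

```python
import numpy as np


def bora_delta(B, A, sigma, b):
    """Assemble a BoRA-style update block by block (illustrative sketch).

    B: (m, r), A: (r, n), sigma: (b, b, r), where sigma[i, j] holds the
    diagonal of Sigma_{i,j}. Block (i, j) of the result equals
    B_i @ diag(sigma[i, j]) @ A_j.
    """
    m, r = B.shape
    r_a, n = A.shape
    if r_a != r:
        raise ValueError("inner dimensions of B and A must match")
    if m % b or n % b:
        raise ValueError("this sketch assumes b divides both m and n")
    if sigma.shape != (b, b, r):
        raise ValueError("sigma must have shape (b, b, r)")

    # Split B along rows into B_1..B_b, each of shape (m/b, r).
    B_blocks = B.reshape(b, m // b, r)
    # Split A along columns into A_1..A_b, each of shape (r, n/b).
    A_blocks = A.reshape(r, b, n // b).transpose(1, 0, 2)
    # blocks[i, p, j, q] = sum_k B_i[p, k] * sigma[i, j, k] * A_j[k, q]
    blocks = np.einsum("ipk,ijk,jkq->ipjq", B_blocks, sigma, A_blocks)
    # Merge (block row, row in block) and (block column, column in block).
    return blocks.reshape(m, n)


rng = np.random.default_rng(0)
m, n, r, b = 16, 16, 2, 4  # toy sizes chosen only for this example
B = rng.standard_normal((m, r))
A = rng.standard_normal((r, n))
sigma = rng.standard_normal((b, b, r))

lora = B @ A
bora = bora_delta(B, A, sigma, b)

# With every Sigma_{i,j} set to the identity, the result is plain LoRA.
same_as_lora = np.allclose(bora_delta(B, A, np.ones((b, b, r)), b), lora)

rank_lora = np.linalg.matrix_rank(lora)
rank_bora = np.linalg.matrix_rank(bora)
extra_params = sigma.size  # b * b * r additional trainable values

print("identity sigma reproduces LoRA:", same_as_lora)
print("rank of BA:", rank_lora, "| rank with block diagonals:", rank_bora)
print("upper bound b * r:", b * r, "| extra parameters:", extra_params)
```

Storing only the diagonals keeps the extra parameter count at $b^2r$, matching the figure in the abstract. The rank printed for the BoRA-style matrix is bounded above by $b \cdot r$, and random diagonals are used here only to show that the bound on plain $BA$ (at most $r$) no longer applies.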

Related papers