Fugu-MT Paper Translation (Abstract): StelLA: Subspace Learning in Low-rank Adaptation using Stiefel Manifold

Paper overview: StelLA: Subspace Learning in Low-rank Adaptation using Stiefel Manifold

arxiv url: http://arxiv.org/abs/2510.01938v1
Date: Thu, 02 Oct 2025 11:59:13 GMT
Status: Translation complete
Last updated in system: 2025-10-03 16:59:21.120495
Title: StelLA: Subspace Learning in Low-rank Adaptation using Stiefel Manifold
Title (reference translation): StelLA: Subspace Learning in Low-Rank Adaptation Using the Stiefel Manifold
Authors: Zhizhong Li, Sina Sajadmanesh, Jingtao Li, Lingjuan Lyu
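Abstract summary: Low-rank adaptation (LoRA) has been widely adopted as a parameter-efficient technique for fine-tuning large-scale pre-trained models. The paper proposes a geometry-aware extension of LoRA that uses a three-factor decomposition $U\!SV^\top$.
Reference score (independently computed attention score): 51.93627542334909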
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Low-rank adaptation (LoRA) has been widely adopted as a parameter-efficient technique for fine-tuning large-scale pre-trained models. However, it still lags behind full fine-tuning in performance, partly due to its insufficient exploitation of the geometric structure underlying low-rank manifolds. In this paper, we propose a geometry-aware extension of LoRA that uses a three-factor decomposition $U\!SV^\top$. Analogous to the structure of singular value decomposition (SVD), it separates the adapter's input and output subspaces, $V$ and $U$, from the scaling factor $S$. Our method constrains $U$ and $V$ to lie on the Stiefel manifold, ensuring their orthonormality throughout the training. To optimize on the Stiefel manifold, we employ a flexible and modular geometric optimization design that converts any Euclidean optimizer to a Riemannian one. It enables efficient subspace learning while remaining compatible with existing fine-tuning pipelines. Empirical results across a wide range of downstream tasks, including commonsense reasoning, math and code generation, image classification, and image generation, demonstrate the superior performance of our approach against the recent state-of-the-art variants of LoRA. Code is available at https://github.com/SonyResearch/stella.
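Abstract (reference translation): Low-rank adaptation (LoRA) has been widely adopted as a parameter-efficient method for fine-tuning large-scale pre-trained models. However, it still lags behind full fine-tuning in performance, partly because it does not sufficiently exploit the geometric structure underlying low-rank manifolds. This paper proposes a geometry-aware extension of LoRA that uses the three-factor decomposition $U\!SV^\top$. Analogous to the structure of the singular value decomposition (SVD), it separates the adapter's input and output subspaces, $V$ and $U$, from the scaling factor $S$. The method constrains $U$ and $V$ to lie on the Stiefel manifold, which guarantees their orthonormality throughout training. To optimize on the Stiefel manifold, it uses a flexible, modular geometric optimization design that converts any Euclidean optimizer into a Riemannian one. This enables efficient subspace learning while remaining compatible with existing fine-tuning pipelines. Experimental results across a wide range of downstream tasks, including commonsense reasoning, math and code generation, image classification, and image generation, show that the approach outperforms recent state-of-the-art LoRA variants. Code is available at https://github.com/SonyResearch/stella.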
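The abstract describes the idea only at a high level, so the following is a minimal NumPy sketch of what a three-factor adapter $U S V^\top$ with orthonormal $U$ and $V$ could look like under one textbook Riemannian update: project the Euclidean gradient onto the tangent space, take a step, and retract with QR. The helper names (`random_stiefel`, `project_tangent`, `retract_qr`, `riemannian_sgd_step`), the toy least-squares loss, the dimensions, and the learning rate are all assumptions made for illustration. This is not the authors' implementation, which the abstract says can wrap any Euclidean optimizer.

```python
import numpy as np


def random_stiefel(n, r, rng):
    """Return an n x r matrix with orthonormal columns (a Stiefel point)."""
    q, _ = np.linalg.qr(rng.standard_normal((n, r)))
    return q


def project_tangent(x, g):
    """Project a Euclidean gradient g onto the tangent space at x.

    Uses the embedded-metric formula g - x * sym(x^T g).
    """
    xtg = x.T @ g
    return g - x @ ((xtg + xtg.T) / 2.0)


def retract_qr(y):
    """Map y back onto the Stiefel manifold via the Q factor of a QR decomposition."""
    q, r = np.linalg.qr(y)
    # Flip column signs so diag(r) is non-negative, making the result unique.
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs


def riemannian_sgd_step(u, s, v, target, lr):
    """One plain gradient step on the toy loss 0.5 * ||U S V^T - target||_F^2."""
    err = u @ s @ v.T - target
    grad_u = err @ v @ s.T
    grad_v = err.T @ u @ s
    grad_s = u.T @ err @ v
    # U and V move along the manifold; S is unconstrained and updated as usual.
    u_new = retract_qr(u - lr * project_tangent(u, grad_u))
    v_new = retract_qr(v - lr * project_tangent(v, grad_v))
    s_new = s - lr * grad_s
    return u_new, s_new, v_new


rng = np.random.default_rng(0)
d_out, d_in, rank = 8, 6, 2          # hypothetical layer and adapter sizes
u = random_stiefel(d_out, rank, rng)
v = random_stiefel(d_in, rank, rng)
s = np.eye(rank)
target = rng.standard_normal((d_out, d_in))  # toy stand-in for a desired weight update

for _ in range(50):
    u, s, v = riemannian_sgd_step(u, s, v, target, lr=0.05)

# Orthonormality should hold after every step, up to floating-point error.
print(np.allclose(u.T @ u, np.eye(rank)), np.allclose(v.T @ v, np.eye(rank)))
```

The retraction is what keeps $U$ and $V$ orthonormal after each update. Other retractions, such as the polar decomposition, are equally valid choices, and the abstract does not say which one the paper uses.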


Related papers