Fugu-MT Paper Translation (Summary): Queryable LoRA: Instruction-Regularized Routing Over Shared Low-Rank Update Atoms

Paper summary: Queryable LoRA: Instruction-Regularized Routing Over Shared Low-Rank Update Atoms

arxiv url: http://arxiv.org/abs/2605.08423v1
Date: Fri, 08 May 2026 19:32:43 GMT
Status: Translation complete
Last updated in system: 2026-05-12 23:28:49.632938
Title: Queryable LoRA: Instruction-Regularized Routing Over Shared Low-Rank Update Atoms
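Title (reference translation): Queryable LoRA: Instruction-Regularized Routing Over Shared Low-Rank Update Atoms
Authors: Omatharv Bharat Vaidya, Connor T. Jerzak, Nhat Ho, Chandrajit Bajaj
Abstract summary: We present a data-adaptive method for parameter-efficient fine-tuning of large neural networks. Our approach replaces a purely layer-local adapter with a shared queryable memory of low-rank update atoms.
Reference score (independently computed attention measure): 41.71279032037811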
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present a data-adaptive method for parameter-efficient fine-tuning of large neural networks. Standard low-rank adaptation methods improve efficiency by restricting each layer update to a fixed low-rank form, but this static parameterization can be too rigid when the appropriate correction depends on the input and on the evolving depth-wise computation of the network. Our approach replaces a purely layer-local adapter with a shared queryable memory of low-rank update atoms. For each block of layers, the model forms a query from the current low-rank state and a running summary of previous blocks, uses this query to retrieve a content-dependent combination of shared update components via attention, and applies the resulting routed operator within the low-rank bottleneck. In this way, the method retains the efficiency and scalability of low-rank adaptation while allowing the effective update to vary across inputs and to share reusable structure across layers. The resulting architecture provides a principled middle ground between static LoRA-style updates and fully generated parameter updates: it remains compact and parameter-efficient while supporting dynamic, context-sensitive adaptation. Further, we incorporate instruction-regularization by augmenting routing logits with a language-induced prior over update atoms, thereby biasing the selection of low-rank transformations toward semantically relevant directions without generating unconstrained parameter updates. Experiments on noisy non-linear regression tasks and LLM fine-tuning suggest that this queryable update-memory formulation can improve final test performance and training stability compared to standard low-rank adaptation, while using a comparable number of trainable parameters.
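Abstract (reference translation): We present a data-adaptive method for parameter-efficient fine-tuning of large neural networks. Standard low-rank adaptation methods improve efficiency by restricting each layer's update to a fixed low-rank form, but this static parameterization can be too rigid when the appropriate correction depends on the input and on the network's depth-wise computation. Our approach replaces a purely layer-local adapter with a shared queryable memory of low-rank update atoms. For each block of layers, the model forms a query from the current low-rank state and a running summary of previous blocks, uses this query to retrieve a content-dependent combination of shared update components via attention, and applies the resulting routed operator within the low-rank bottleneck. In this way, the method retains the efficiency and scalability of low-rank adaptation while allowing the effective update to vary across inputs and to share reusable structure across layers. The resulting architecture provides a principled middle ground between static LoRA-style updates and fully generated parameter updates. Further, instruction-regularization is incorporated by augmenting the routing logits with a language-induced prior over update atoms, biasing the selection of low-rank transformations toward semantically relevant directions without generating unconstrained parameter updates. Experiments on noisy non-linear regression tasks and LLM fine-tuning suggest that this queryable update-memory formulation can improve final test performance and training stability compared to standard low-rank adaptation, while using a comparable number of trainable parameters.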
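
The routing step described in the abstract can be sketched roughly as follows, for a single input vector in plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `routed_update`, the key matrix `keys`, the query projection `W_q`, the square r x r shape of each atom, the scaled dot-product scoring, and the way the running summary is passed in as a given vector are all hypothetical choices that the abstract does not specify.

```python
import math


def matvec(v, M):
    """Row vector v (length n) times matrix M (n x k), returned as a list of length k."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def routed_update(x, A, B, atoms, keys, W_q, summary, prior_logits=None):
    """Hypothetical routed low-rank update for one input vector.

    x            : input, length d_in
    A            : down-projection, d_in x r
    B            : up-projection, r x d_out
    atoms        : m shared update atoms, each r x r (assumed shape)
    keys         : one key per atom, m x d_k (assumed)
    W_q          : query projection, 2r x d_k (assumed)
    summary      : running summary of previous blocks, length r (taken as given)
    prior_logits : optional language-induced prior over atoms, length m
    """
    h = matvec(x, A)                          # current low-rank state
    q = matvec(h + summary, W_q)              # list concatenation [h, summary], then project
    scale = math.sqrt(len(q))
    logits = [sum(qi * ki for qi, ki in zip(q, key)) / scale for key in keys]
    if prior_logits is not None:
        # Instruction regularization: add the prior to the routing logits.
        logits = [l + p for l, p in zip(logits, prior_logits)]
    w = softmax(logits)                       # attention weights over atoms
    r = len(h)
    # Content-dependent combination of the shared atoms.
    mixed = [[sum(w[m] * atoms[m][i][j] for m in range(len(atoms)))
              for j in range(r)] for i in range(r)]
    h_routed = matvec(h, mixed)               # routed operator inside the bottleneck
    delta = matvec(h_routed, B)               # update contribution, length d_out
    return delta, w
```

In this sketch, if every atom is the identity matrix, the routed update reduces to the static low-rank product of `x`, `A`, and `B`, which is one way to see the formulation as a generalization of a fixed LoRA-style update. A large entry in `prior_logits` pushes nearly all routing weight onto the corresponding atom.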

Related papers