Fugu-MT Paper Translation (Abstract): Efficient Hyper-Parameter Search for LoRA via Language-aided Bayesian Optimization

Paper abstract: Efficient Hyper-Parameter Search for LoRA via Language-aided Bayesian Optimization

arxiv url: http://arxiv.org/abs/2602.11171v1
Date: Mon, 19 Jan 2026 08:48:03 GMT
Status: Translation complete
Last updated in system: 2026-02-15 14:54:53.721803
Title: Efficient Hyper-Parameter Search for LoRA via Language-aided Bayesian Optimization
Authors: Baek Seong-Eun, Lee Jung-Mok, Kim Sung-Bin, Tae-Hyun Oh
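Abstract summary: Fine-tuning Large Language Models (LLMs) with Low-Rank Adaptation (LoRA) enables resource-efficient personalization or specialization. This paper proposes a framework that integrates the domain knowledge of pre-trained LLMs into Bayesian Optimization.
Reference score (independently computed attention score): 27.47526031899076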
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Fine-tuning Large Language Models (LLMs) with Low-Rank Adaptation (LoRA) enables resource-efficient personalization or specialization, but it comes at the expense of additional hyperparameter tuning. Although LoRA makes fine-tuning efficient, it is highly sensitive to the choice of hyperparameters, and exhaustive hyperparameter search is still computationally very demanding. To address these challenges, we propose a framework that integrates the domain knowledge of pre-trained LLMs into Bayesian Optimization (BO) to efficiently search for LoRA hyperparameters. To leverage the informed knowledge of LLMs, we repurpose LLMs as a discrete-to-continuous mapping to link the hyperparameters and their domain knowledge with a continuous vector space, where BO is conducted. We design and control the mapping by language prompting, where we provide a domain-aware textual prompt describing the relationships among hyperparameters and their respective roles; thereby, we explicitly inject domain knowledge about LoRA into the LLM in natural language. Also, we model the residual information that is hard to describe linguistically in the prompt with an additional learnable token. This helps BO sample more high-performing hyperparameters. In addition, by leveraging the observed strong correlation between the performance obtained from full and subset training datasets in LoRA training regimes, we introduce proxy training and evaluation with a data subset. This further increases the efficiency of our method. We demonstrate that the hyperparameters found by our method in only about 30 iterations achieve more than 20% performance improvement over standard hyperparameters found from about 45,000 combinations.
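
Illustrative sketch (not the authors' implementation): the search loop described in the abstract can be pictured roughly as below. Everything named here is an assumption made for illustration. `embed` is a hypothetical stand-in for the LLM prompt-to-vector mapping (it only log-scales the values), `toy_score` is a synthetic objective standing in for LoRA training and evaluation on a data subset, and the surrogate is a small Gaussian process with an upper-confidence-bound acquisition, chosen for brevity rather than taken from the paper.

```python
import math
import random
from itertools import product

def embed(cfg):
    # ASSUMPTION: stand-in for the LLM mapping from a discrete configuration
    # to a continuous vector; here it is just a log-scaling of the values.
    rank, alpha, lr = cfg
    return (math.log2(rank), math.log2(alpha), math.log10(lr))

def toy_score(cfg):
    # ASSUMPTION: synthetic objective replacing proxy training on a subset.
    r, a, l = embed(cfg)
    return -((r - 4) ** 2 + (a - 5) ** 2 + (l + 4) ** 2)

def rbf(a, b, length=2.0):
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)

def cholesky(A):
    # Lower-triangular L with A = L L^T (A must be positive definite).
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    # Forward substitution: solves L x = b.
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def solve_upper(L, b):
    # Back substitution: solves L^T x = b.
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(X, y, x_new, noise=1e-3):
    # Posterior mean and standard deviation of a constant-mean GP at x_new.
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    mean_y = sum(y) / len(y)
    w = solve_upper(L, solve_lower(L, [v - mean_y for v in y]))
    k = [rbf(a, x_new) for a in X]
    mu = mean_y + sum(ki * wi for ki, wi in zip(k, w))
    v = solve_lower(L, k)
    var = max(rbf(x_new, x_new) - sum(vi * vi for vi in v), 0.0)
    return mu, math.sqrt(var)

def bayes_opt(candidates, objective, n_init=3, n_iter=12, kappa=2.0, seed=0):
    rng = random.Random(seed)
    observed = {c: objective(c) for c in rng.sample(candidates, n_init)}
    for _ in range(n_iter):
        X = [embed(c) for c in observed]
        y = list(observed.values())

        def ucb(c):
            mu, sigma = gp_posterior(X, y, embed(c))
            return mu + kappa * sigma

        # Evaluate the unobserved candidate with the highest acquisition value.
        nxt = max((c for c in candidates if c not in observed), key=ucb)
        observed[nxt] = objective(nxt)
    return max(observed, key=observed.get), observed

# Hypothetical search grid over (rank, alpha, learning rate).
candidates = list(product([4, 8, 16, 32, 64], [8, 16, 32, 64], [1e-5, 1e-4, 1e-3]))
best, observed = bayes_opt(candidates, toy_score)
print(best, observed[best])
```

In this toy setting only 15 of the 60 configurations are evaluated. In the setting the abstract describes, each evaluation would be a LoRA training run on a data subset, and the vectors would come from an LLM prompted with a description of the hyperparameters.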

Related papers