Fugu-MT Paper Translation (Abstract): Bayesian Active Learning with Gaussian Processes Guided by LLM Relevance Scoring for Dense Passage Retrieval

Paper overview: Bayesian Active Learning with Gaussian Processes Guided by LLM Relevance Scoring for Dense Passage Retrieval

arxiv url: http://arxiv.org/abs/2604.17906v1
Date: Mon, 20 Apr 2026 07:32:56 GMT
Status: Translation complete
Last updated in system: 2026-04-21 21:52:52.749042
Title: Bayesian Active Learning with Gaussian Processes Guided by LLM Relevance Scoring for Dense Passage Retrieval
Authors: Junyoung Kim, Anton Korikov, Jiazhou Liang, Justin Cui, Yifan Simon Liu, Qianfeng Wen, Mark Zhao, Scott Sanner
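Abstract summary: BAGEL is a novel framework that propagates sparse relevance signals across the embedding space to guide global exploration. It iteratively selects passages for scoring by strategically balancing the exploitation of high-confidence regions with the exploration of uncertain areas.
Reference score (independently computed attention metric): 20.193565383791213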
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: While Large Language Models (LLMs) exhibit exceptional zero-shot relevance modeling, their high computational cost necessitates framing passage retrieval as a budget-constrained global optimization problem. Existing approaches passively rely on first-stage dense retrievers, which leads to two limitations: (1) failing to retrieve relevant passages in semantically distinct clusters, and (2) failing to propagate relevance signals to the broader corpus. To address these limitations, we propose Bayesian Active Learning with Gaussian Processes guided by LLM relevance scoring (BAGEL), a novel framework that propagates sparse LLM relevance signals across the embedding space to guide global exploration. BAGEL models the multimodal relevance distribution across the entire embedding space with a query-specific Gaussian Process (GP) based on LLM relevance scores. Subsequently, it iteratively selects passages for scoring by strategically balancing the exploitation of high-confidence regions with the exploration of uncertain areas. Extensive experiments across four benchmark datasets and two LLM backbones demonstrate that BAGEL effectively explores and captures complex relevance distributions and outperforms LLM reranking methods under the same LLM budget on all four datasets.
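
The selection loop described in the abstract can be illustrated with a minimal sketch. The abstract does not state the kernel, the acquisition function, or how the initial passages are chosen, so everything below is an assumption made for illustration: a zero-mean GP with an RBF kernel, an upper-confidence-bound (UCB) rule `mean + beta * std` to trade off exploitation and exploration, and a toy scoring function standing in for the LLM. The names `rbf`, `solve`, `gp_posterior`, `active_retrieve`, and `toy_llm_score` are hypothetical and are not taken from the paper.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two embedding vectors (assumed kernel)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy of A
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gp_posterior(train_x, train_y, test_xs, noise=1e-3):
    """Posterior (mean, variance) of a zero-mean GP at each test point."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(train_x)]
         for i, a in enumerate(train_x)]
    alpha = solve(K, train_y)  # K^{-1} y
    out = []
    for x in test_xs:
        k_star = [rbf(a, x) for a in train_x]
        mean = sum(k * a for k, a in zip(k_star, alpha))
        v = solve(K, k_star)  # K^{-1} k_*
        var = max(1.0 - sum(k * w for k, w in zip(k_star, v)), 0.0)
        out.append((mean, var))
    return out

def active_retrieve(embeddings, score_fn, seed_ids, budget, beta=1.0):
    """Spend `budget` scoring calls, picking each next passage by UCB (assumed rule)."""
    budget = min(budget, len(embeddings))
    scored = {i: score_fn(i) for i in seed_ids[:budget]}
    while len(scored) < budget:
        ids = list(scored)
        candidates = [j for j in range(len(embeddings)) if j not in scored]
        post = gp_posterior([embeddings[i] for i in ids],
                            [scored[i] for i in ids],
                            [embeddings[j] for j in candidates])
        # High mean favours exploitation; high variance favours exploration.
        ucb = [mean + beta * math.sqrt(var) for mean, var in post]
        pick = candidates[max(range(len(candidates)), key=ucb.__getitem__)]
        scored[pick] = score_fn(pick)
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)

# Toy corpus: two relevant clusters far apart in a 2-D "embedding space".
embeddings = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (3.0, 3.0),
              (3.1, 2.9), (1.5, 1.5), (-2.0, 1.0)]
relevant_centers = [(0.0, 0.0), (3.0, 3.0)]

def toy_llm_score(i):
    """Hypothetical stand-in for an LLM relevance call; returns a score in (0, 1]."""
    return max(rbf(embeddings[i], c, length_scale=0.5) for c in relevant_centers)

ranking = active_retrieve(embeddings, toy_llm_score, seed_ids=[0], budget=4)
print(ranking)
```

Here `seed_ids` plays the role of a first-stage retriever's top result, and `budget` caps the number of scoring calls. In a real setting the embeddings would come from a dense encoder and `score_fn` would wrap an actual LLM call; the pure-Python linear solver is used only to keep the sketch dependency-free and would not scale to a real corpus.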

