Fugu-MT Paper Translation (Abstract): LLM-guided Hierarchical Retrieval

Paper overview: LLM-guided Hierarchical Retrieval

arxiv url: http://arxiv.org/abs/2510.13217v1
Date: Wed, 15 Oct 2025 07:05:17 GMT
Status: Translation complete
Last updated in system: 2025-10-16 20:13:28.537107
Title: LLM-guided Hierarchical Retrieval
Authors: Nilesh Gupta, Wei-Cheng Chang, Ngot Bui, Cho-Jui Hsieh, Inderjit S. Dhillon
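Abstract summary: LATTICE is a hierarchical retrieval framework that enables an LLM to reason over and navigate large corpora with logarithmic search complexity. A central challenge in LLM-guided search is that the model's relevance judgments are noisy, context-dependent, and unaware of the hierarchy. The framework achieves state-of-the-art zero-shot performance on the reasoning-intensive BRIGHT benchmark.
Reference score (attention score computed independently by this site): 54.73080745446999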
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Modern IR systems are increasingly tasked with answering complex, multi-faceted queries that require deep reasoning rather than simple keyword or semantic matching. While LLM-based IR has shown great promise, the prevailing retrieve-then-rerank paradigm inherits the limitations of embedding-based retrieval; parametric generative approaches are difficult to update with new information; and long-context methods that place the entire corpus in context are computationally infeasible for large document collections. To address these challenges, we introduce LATTICE, a hierarchical retrieval framework that enables an LLM to reason over and navigate large corpora with logarithmic search complexity by imposing a semantic tree structure on the corpus. Our approach consists of two stages: (1) an offline phase that organizes the corpus into a semantic hierarchy via either a bottom-up agglomerative strategy or a top-down divisive strategy using multi-level summaries and (2) an online traversal phase where a search LLM navigates this tree. A central challenge in such LLM-guided search is that the model's relevance judgments are noisy, context-dependent, and unaware of the hierarchy, making cross-branch and cross-level comparisons difficult. To overcome this, we propose a traversal algorithm that estimates calibrated latent relevance scores from local LLM outputs and aggregates them into a global path relevance metric. Our training-free framework achieves state-of-the-art zero-shot performance on the reasoning-intensive BRIGHT benchmark, demonstrating up to 9% improvement in Recall@100 and 5% in nDCG@10 over the next best zero-shot baseline. Furthermore, compared to the fine-tuned SOTA method DIVER-v2, LATTICE attains comparable results on BRIGHT subsets that use a static corpus for evaluation.
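The online traversal phase described in the abstract can be illustrated with a small sketch. The abstract does not give the scoring formula or the search procedure, so everything below is an assumption made for illustration: the `Node` class, the `traverse` function, the smoothing weight `alpha`, the rule that blends a parent's path relevance with a child's local score, and the word-overlap `overlap_score` that stands in for the search LLM's relevance judgment. It is not the authors' algorithm; it only shows how local scores might be aggregated into a path-level score that lets nodes from different branches and levels share one priority queue.

```python
import heapq
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    """Hypothetical tree node: internal nodes hold summaries, leaves hold documents."""
    text: str
    children: List["Node"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def overlap_score(query: str, text: str) -> float:
    """Toy stand-in for an LLM relevance judgment: fraction of query words found in text."""
    query_words = set(re.findall(r"[a-z]+", query.lower()))
    text_words = set(re.findall(r"[a-z]+", text.lower()))
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def traverse(
    root: Node,
    query: str,
    score_fn: Callable[[str, str], float],
    alpha: float = 0.5,        # assumed weight on the parent's path relevance
    max_expansions: int = 10,  # budget of internal nodes to expand
    top_k: int = 1,            # stop once this many leaves have been reached
) -> Tuple[List[Node], int]:
    """Best-first traversal; returns (leaves sorted by path relevance, expansions used)."""
    counter = 0  # tie-breaker so heapq never has to compare Node objects
    frontier = [(-1.0, counter, root)]  # max-heap via negation; root relevance is 1.0
    found: List[Tuple[float, Node]] = []
    expansions = 0
    while frontier:
        neg_rel, _, node = heapq.heappop(frontier)
        path_rel = -neg_rel
        if node.is_leaf:
            found.append((path_rel, node))
            if len(found) >= top_k:
                break
            continue
        if expansions >= max_expansions:
            continue  # budget exhausted: skip remaining internal nodes
        expansions += 1
        for child in node.children:
            local = score_fn(query, child.text)
            # Assumed aggregation: blend parent path relevance with the local score.
            child_rel = alpha * path_rel + (1 - alpha) * local
            counter += 1
            heapq.heappush(frontier, (-child_rel, counter, child))
    found.sort(key=lambda pair: pair[0], reverse=True)
    return [leaf for _, leaf in found], expansions


tree = Node("all topics", [
    Node("animals: cats dogs", [
        Node("cats purr when content"),
        Node("dogs bark at strangers"),
    ]),
    Node("programming: python rust", [
        Node("python uses indentation for blocks"),
        Node("rust enforces ownership rules"),
    ]),
])

results, expansions = traverse(tree, "python indentation", overlap_score)
print(results[0].text, expansions)
```

In this toy run the search expands only the root and the "programming" node before reaching the matching leaf, so the number of expansions grows with tree depth rather than corpus size. Because a child's score can exceed its parent's under this blending rule, the first leaves reached are not guaranteed to be the globally best ones; a real system would need the calibration step the abstract mentions.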
