Fugu-MT Paper Translation (Abstract): Improving the Efficiency and Effectiveness of LLM Knowledge Distillation for Conversational Search

Paper abstract: Improving the Efficiency and Effectiveness of LLM Knowledge Distillation for Conversational Search

arxiv url: http://arxiv.org/abs/2606.04650v1
Date: Wed, 03 Jun 2026 09:22:22 GMT
Status: Translation complete
System update date: 2026-06-04 20:44:18.654922
Title: Improving the Efficiency and Effectiveness of LLM Knowledge Distillation for Conversational Search
Title (reference translation): Improving the Efficiency and Effectiveness of LLM Knowledge Distillation in Conversational Search
Authors: Stan Fris, Jan Hutter, Jan Henrik Bertrand, Simon Lupart, Mohammad Aliannejadi
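Abstract summary: Conversational Search (CS) considers retrieval of relevant documents based on conversational context. Recent work applies the Kullback-Leibler Divergence (KLD) for distillation, relaxing the alignment with the teacher signal. This work investigates several aspects of KLD-based distillation for conversational search.
Reference score (independently computed attention score): 13.420218766285133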
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Conversational Search (CS) considers retrieval of relevant documents based on conversational context. Large Language Models (LLMs) have significantly enhanced CS by enabling effective query rewriting. However, employing LLMs during inference poses efficiency challenges. A method to balance effectiveness and efficiency is the use of knowledge distillation from LLM-based query rewriting. Recent work applies the Kullback-Leibler Divergence (KLD) for distillation, relaxing the alignment with the teacher signal compared to previous methods. Despite these gains, several aspects of KLD-based distillation for conversational search remain understudied, and we investigate them in this work. Prior work in related fields suggests that adding a contrastive loss to the KLD objective can improve performance; we confirm this and observe significant gains in precision-oriented ranking metrics. We also find that contrastive sampling strategies for the KLD loss have a non-trivial impact and must be chosen carefully. Although theory suggests that more samples improve the KLD estimate, experiments show diminishing returns on the number of used samples. Finally, we address the phenomenon of decreased sparsity in longer conversations, which limits computational efficiency across sparse retrieval methods. We find that the representations from the model distilled with the KLD loss can be strongly regularized with a regularization loss, substantially improving sparsity and inference efficiency without significantly harming retrieval effectiveness. We achieve a $2\times$ decrease in FLOPS on TopiOCQA with negligible loss in effectiveness, corresponding to a $\leq 2\%$ drop in Recall@100. Our results provide insights into distillation objectives for learned sparse conversational retrievers and offer practical guidelines for improving effectiveness and efficiency in first-stage retrieval.
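Abstract (reference translation): Conversational Search (CS) considers the retrieval of relevant documents based on conversational context. Large Language Models (LLMs) have greatly strengthened CS by enabling effective query rewriting. However, using LLMs at inference time raises efficiency challenges. One way to balance effectiveness and efficiency is knowledge distillation from LLM-based query rewriting. Recent work applies the Kullback-Leibler Divergence (KLD) to distillation, relaxing the alignment with the teacher signal compared with earlier methods. Despite these gains, several aspects of KLD-based distillation for conversational search remain understudied, and this work investigates them. Prior work in related fields suggests that adding a contrastive loss to the KLD objective improves performance; we confirm this and observe significant gains in precision-oriented ranking metrics. We also find that contrastive sampling strategies for the KLD loss have a non-trivial impact and must be chosen carefully. Although theory suggests that more samples improve the KLD estimate, experiments show diminishing returns as the number of samples grows. Finally, we address the decrease in sparsity in longer conversations, which limits computational efficiency across sparse retrieval methods. Representations from the model distilled with the KLD loss can be strongly regularized with a regularization loss, substantially improving sparsity and inference efficiency without significantly harming retrieval effectiveness. We achieve a $2\times$ decrease in FLOPS on TopiOCQA with negligible loss in effectiveness, corresponding to a $\leq 2\%$ drop in Recall@100. Our results provide insights into distillation objectives for learned sparse conversational retrievers and offer practical guidelines for improving the effectiveness and efficiency of first-stage retrieval.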
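
Illustrative sketch (not from the paper): the abstract combines three ingredients, a KLD distillation term, a contrastive term, and a sparsity regularizer. The Python below shows one common way such terms can be written, under several assumptions that the abstract does not confirm: the KLD is taken as KL(teacher || student) over softmax-normalized scores for the same candidate documents, the contrastive term is an InfoNCE-style loss with the positive document at index 0, and the regularizer is the FLOPS-style penalty commonly used for learned sparse retrievers. All function names and the weights alpha and lam are hypothetical.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def kld_loss(teacher_scores, student_scores):
    """KL(teacher || student) over the same candidates (assumed direction)."""
    log_p = log_softmax(teacher_scores)
    log_q = log_softmax(student_scores)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def contrastive_loss(student_scores, positive_index=0):
    """InfoNCE-style loss: negative log-probability of the positive document."""
    return -log_softmax(student_scores)[positive_index]


def flops_regularizer(batch_vectors):
    """FLOPS-style penalty: sum over dimensions of the squared mean
    absolute activation across the batch (assumed regularizer form)."""
    n = len(batch_vectors)
    dims = len(batch_vectors[0])
    return sum(
        (sum(abs(vec[j]) for vec in batch_vectors) / n) ** 2
        for j in range(dims)
    )


def combined_loss(teacher_scores, student_scores, batch_vectors,
                  alpha=1.0, lam=0.01):
    """Weighted sum of the three terms; alpha and lam are hypothetical weights."""
    return (
        kld_loss(teacher_scores, student_scores)
        + alpha * contrastive_loss(student_scores)
        + lam * flops_regularizer(batch_vectors)
    )


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]   # toy teacher scores (rewritten-query side)
    student = [2.5, 1.5, 0.2]   # toy student scores (conversation side)
    vectors = [[0.0, 1.2, 0.0, 0.3], [0.5, 0.0, 0.0, 0.1]]  # toy sparse vectors
    print(combined_loss(teacher, student, vectors))
```

Raising lam in this sketch pushes the sparse representations toward fewer active dimensions, which is the mechanism by which a regularization loss can trade a small amount of effectiveness for lower retrieval cost.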

Related papers