Fugu-MT paper translation (summary): LLM-as-an-Investigator: Evidence-First Reasoning for Robust Interactive Problem Diagnosis

Paper summary: LLM-as-an-Investigator: Evidence-First Reasoning for Robust Interactive Problem Diagnosis

arxiv url: http://arxiv.org/abs/2606.13220v1
Date: Thu, 11 Jun 2026 11:37:07 GMT
Status: Translation complete
Last updated in system: 2026-06-12 15:55:27.758095
Title: LLM-as-an-Investigator: Evidence-First Reasoning for Robust Interactive Problem Diagnosis
Title (reference translation): LLM-as-an-Investigator: Evidence-First Reasoning for Robust Interactive Problem Diagnosis
Authors: Fabrizio Marozzo, Pietro Liò
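Abstract summary: This paper introduces LLM-as-an-Investigator, an evidence-first agentic AI methodology for robust problem diagnosis. The approach is implemented through a Solution Investigator Agent, which estimates the ambiguity of an initial problem description. The results show that the proposed approach identifies the problem more accurately than direct prompting and reasoning-only baselines.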
Reference score (independently computed attention score): 13.258011377627822
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) are increasingly used as interactive assistants for technical problem solving. However, when users provide incomplete descriptions or plausible but unverified explanations, LLMs may prematurely align with these assumptions and propose solutions before collecting sufficient evidence. We refer to this behavior as user-driven sycophancy: the tendency of an LLM to reinforce a user-provided hypothesis instead of testing alternative explanations. This paper introduces LLM-as-an-Investigator, an evidence-first agentic AI methodology for robust problem diagnosis. The approach is implemented through a Solution Investigator Agent, which estimates the ambiguity of an initial problem description, generates candidate hypotheses, asks targeted clarification questions, and updates hypothesis probabilities after each answer. Rather than producing an immediate response, the agent continues the investigation until the evidence makes one candidate explanation stronger than the alternatives. To evaluate the approach, we build a benchmark from solved technical forum threads in mechanical, electrical, and hydraulic domains. We use a three-agent evaluation pipeline in which a Problem-Solution Extractor Agent converts solved threads into structured cases, a Ground-Truth Evaluator Agent simulates the user while hiding the known solution, and the tested assistant attempts to recover the solution through dialogue. The experiments compare standard assistants, reasoning-oriented LLMs, and the proposed investigator-based model across LLM backbones. In addition to diagnostic accuracy, we analyze how standard assistants follow misleading user hypotheses in diagnostic cases. The results show that the proposed approach identifies the problem more accurately than direct prompting and reasoning-only baselines, while its evidence-first protocol helps reduce user-induced conversational bias.
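Abstract (reference translation): Large language models (LLMs) are increasingly used as interactive assistants for technical problem solving. However, when users provide incomplete descriptions or plausible but unverified explanations, LLMs may align with these assumptions too early and propose solutions before collecting sufficient evidence. We call this behavior user-driven sycophancy: the tendency of an LLM to reinforce a user-provided hypothesis instead of testing alternative explanations. This paper introduces LLM-as-an-Investigator, an evidence-first agentic AI methodology for robust problem diagnosis. The approach is implemented through a Solution Investigator Agent, which estimates the ambiguity of the initial problem description, generates candidate hypotheses, asks targeted clarification questions, and updates the hypothesis probabilities after each answer. Rather than producing an immediate response, the agent continues the investigation until the evidence makes one candidate explanation stronger than the others. To evaluate the approach, we build a benchmark from solved technical forum threads in the mechanical, electrical, and hydraulic domains. We use a three-agent evaluation pipeline in which a Problem-Solution Extractor Agent converts solved threads into structured cases, a Ground-Truth Evaluator Agent simulates the user while hiding the known solution, and the tested assistant tries to recover the solution through dialogue. The experiments compare standard assistants, reasoning-oriented LLMs, and the proposed investigator-based model across LLM backbones. In addition to diagnostic accuracy, we analyze how standard assistants follow misleading user hypotheses in diagnostic cases. The results show that the proposed approach identifies the problem more accurately than direct prompting and reasoning-only baselines, and that its evidence-first protocol helps reduce user-induced conversational bias.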
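
The abstract describes an agent that estimates ambiguity, keeps a set of candidate hypotheses, and updates their probabilities after each answer until one explanation clearly leads. The abstract does not say how these steps are computed, so the following is only a minimal sketch under stated assumptions: ambiguity is taken to be normalized Shannon entropy, the update is a plain Bayes rule, and the stopping test is a probability margin between the top two hypotheses. All function names, the `margin` parameter, the hypotheses, and the numbers are hypothetical illustrations, not the authors' implementation.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


def ambiguity(probs):
    """Assumed ambiguity measure: Shannon entropy divided by its maximum.

    Returns a value in [0, 1]; 1 means every hypothesis is equally likely.
    """
    if len(probs) < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)
    return entropy / math.log(len(probs))


def update(probs, likelihoods):
    """Assumed update rule: posterior is proportional to prior * P(answer | hypothesis)."""
    return normalize({h: probs[h] * likelihoods[h] for h in probs})


def investigate(priors, evidence, margin=0.5):
    """Consume answers one at a time until the leader beats the runner-up by `margin`.

    `evidence` is a list of dicts, one per answered question, mapping each
    hypothesis to the (assumed known) likelihood of the observed answer.
    """
    probs = normalize(priors)
    asked = 0
    for likelihoods in evidence:
        ranked = sorted(probs.values(), reverse=True)
        if len(ranked) < 2 or ranked[0] - ranked[1] >= margin:
            break  # evidence already favors one explanation strongly enough
        probs = update(probs, likelihoods)
        asked += 1
    best = max(probs, key=probs.get)
    return best, probs, asked


# Hypothetical hydraulic case: the user suspects the pump, so it starts with
# the highest prior, but the answers point to the filter instead.
priors = {"worn_pump": 0.5, "clogged_filter": 0.25, "air_in_lines": 0.25}
evidence = [
    {"worn_pump": 0.2, "clogged_filter": 0.8, "air_in_lines": 0.4},
    {"worn_pump": 0.1, "clogged_filter": 0.9, "air_in_lines": 0.2},
    {"worn_pump": 0.5, "clogged_filter": 0.5, "air_in_lines": 0.5},
]
best, probs, asked = investigate(priors, evidence)
print(best, asked, round(ambiguity(probs), 3))
```

In this toy run the user's initial guess is overturned after two answers and the third question is never asked, which is the evidence-first behavior the abstract contrasts with following the user's hypothesis directly.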

Related papers