Fugu-MT paper translation (abstract): UIS-Digger: Towards Comprehensive Research Agent Systems for Real-world Unindexed Information Seeking

Paper overview: UIS-Digger: Towards Comprehensive Research Agent Systems for Real-world Unindexed Information Seeking

arxiv url: http://arxiv.org/abs/2603.08117v2
Date: Wed, 11 Mar 2026 12:51:49 GMT
Status: translation complete
Last updated in system: 2026-03-12 14:12:44.150054
Title: UIS-Digger: Towards Comprehensive Research Agent Systems for Real-world Unindexed Information Seeking
Title (reference translation): UIS-Digger: Towards Comprehensive Research Agent Systems for Real-world Unindexed Information Seeking
Authors: Chang Liu, Chuqiao Kuang, Tianyi Zhuang, Yuxin Cheng, Huichi Zhou, Xiaoguang Li, Lifeng Shang,
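Abstract summary: Unindexed Information Seeking (UIS) is the setting in which vital information is not captured by search engine crawlers. The paper introduces UIS-QA, a UIS benchmark comprising 110 expert-annotated QA pairs, and proposes UIS-Digger, a novel multi-agent framework that incorporates dual-mode browsing and enables simultaneous webpage searching and file parsing.
Reference score (independently computed attention score): 34.4829016432132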
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Recent advancements in LLM-based information-seeking agents have achieved record-breaking performance on established benchmarks. However, these agents remain heavily reliant on search-engine-indexed knowledge, leaving a critical blind spot: Unindexed Information Seeking (UIS). This paper identifies and explores the UIS problem, where vital information is not captured by search engine crawlers, such as overlooked content, dynamic webpages, and embedded files. Despite its significance, UIS remains an underexplored challenge. To address this gap, we introduce UIS-QA, the first dedicated UIS benchmark, comprising 110 expert-annotated QA pairs. Notably, even state-of-the-art agents experience a drastic performance drop on UIS-QA (e.g., from 70.90 on GAIA and 46.70 on BrowseComp-zh to 24.55 on UIS-QA), underscoring the severity of the problem. To mitigate this, we propose UIS-Digger, a novel multi-agent framework that incorporates dual-mode browsing and enables simultaneous webpage searching and file parsing. With a relatively small ~30B-parameter backbone LLM optimized using SFT and RFT training strategies, UIS-Digger sets a strong baseline at 27.27%, outperforming systems integrating sophisticated LLMs such as O3 and GPT-4.1. This demonstrates the importance of proactive interaction with unindexed sources for effective and comprehensive information-seeking. Our work not only uncovers a fundamental limitation in current agent evaluation paradigms but also provides the first toolkit for advancing UIS research, defining a new and promising direction for robust information-seeking systems.
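Abstract (reference translation): Recent advances in LLM-based information-seeking agents have achieved record-breaking performance on established benchmarks. However, these agents rely heavily on knowledge indexed by search engines, which leaves a critical blind spot: Unindexed Information Seeking (UIS). This paper identifies and examines the UIS problem, in which vital information is not collected by search engine crawlers, such as overlooked content, dynamic webpages, and embedded files. Despite its importance, UIS remains an underexplored challenge. To address this gap, we introduce UIS-QA, the first UIS benchmark, comprising 110 expert-annotated QA pairs. Notably, even state-of-the-art agents suffer a drastic performance drop on UIS-QA (e.g., from 70.90 on GAIA and 46.70 on BrowseComp-zh to 24.55 on UIS-QA), which underscores the severity of the problem. To mitigate this, we propose UIS-Digger, a novel multi-agent framework that incorporates dual-mode browsing and enables simultaneous webpage searching and file parsing. With a relatively small ~30B-parameter backbone LLM optimized using SFT and RFT training strategies, UIS-Digger sets a strong baseline at 27.27%, outperforming systems that integrate sophisticated LLMs such as O3 and GPT-4.1. This demonstrates the importance of proactive interaction with unindexed sources for effective and comprehensive information seeking. Our work not only reveals a fundamental limitation in current agent evaluation paradigms but also provides the first toolkit for advancing UIS research.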

Related papers