Fugu-MT paper translation (abstract): Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning

Paper overview: Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning

arxiv url: http://arxiv.org/abs/2604.12282v1
Date: Tue, 14 Apr 2026 04:47:21 GMT
Status: Translation complete
Last updated in system: 2026-04-15 19:11:32.245013
Title: Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning
Authors: Houxing Ren, Mingjie Zhan, Zimu Lu, Ke Wang, Yunqiao Yang, Haotian Hou, Hongsheng Li
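Abstract summary: Spreadsheets are central to real-world applications such as enterprise reporting, auditing, and scientific data management. Existing large language model based approaches typically treat tables as plain text, overlooking critical layout cues and visual semantics. This paper proposes a two-stage multi-agent framework for spreadsheet understanding that adopts a step-by-step reading and reasoning paradigm.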
Reference score (independently computed attention score): 43.91509663025854
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Spreadsheets are central to real-world applications such as enterprise reporting, auditing, and scientific data management. Despite their ubiquity, existing large language model based approaches typically treat tables as plain text, overlooking critical layout cues and visual semantics. Moreover, real-world spreadsheets are often massive in scale, exceeding the input length that LLMs can efficiently process. To address these challenges, we propose SpreadsheetAgent, a two-stage multi-agent framework for spreadsheet understanding that adopts a step-by-step reading and reasoning paradigm. Instead of loading the entire spreadsheet at once, SpreadsheetAgent incrementally interprets localized regions through multiple modalities, including code execution results, images, and LaTeX tables. The method first constructs a structural sketch and row/column summaries, and then performs task-driven reasoning over this intermediate representation in the Solving Stage. To further enhance reliability, we design a verification module that validates extracted structures via targeted inspections, reducing error propagation and ensuring trustworthy inputs for downstream reasoning. Extensive experiments on two spreadsheet datasets demonstrate the effectiveness of our approach. With GPT-OSS-120B, SpreadsheetAgent achieves 38.16% on Spreadsheet Bench, outperforming the ChatGPT Agent baseline (35.27%) by 2.89 absolute points. These results highlight the potential of SpreadsheetAgent to advance robust and scalable spreadsheet understanding in real-world applications. Code is available at https://github.com/renhouxing/SpreadsheetAgent.git.
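The abstract describes reading a spreadsheet region by region, building row/column summaries, verifying them through targeted inspection, and only then reasoning over the result. The following is a minimal Python sketch of that general idea, not the authors' implementation. Every name here (iter_regions, summarize_columns, verify_column, solve_column_total) is hypothetical, the sheet is modelled as a plain list of rows, and the multi-modal inputs (images, LaTeX tables) and LLM agents of the actual system are left out.

```python
from typing import Any, Dict, Iterator, List, Tuple

Grid = List[List[Any]]  # assumption: a sheet is a list of rows of cell values


def is_number(value: Any) -> bool:
    """True for int/float cells; bool is excluded because it subclasses int."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def iter_regions(grid: Grid, rows_per_region: int) -> Iterator[Tuple[int, Grid]]:
    """Yield (start_row, rows) windows so the sheet is read in localized pieces."""
    for start in range(0, len(grid), rows_per_region):
        yield start, grid[start:start + rows_per_region]


def summarize_columns(grid: Grid, rows_per_region: int = 2) -> Dict[int, Dict[str, Any]]:
    """First stage (hypothetical): accumulate per-column summaries region by region."""
    summary: Dict[int, Dict[str, Any]] = {}
    for _, rows in iter_regions(grid, rows_per_region):
        for row in rows:
            for col, value in enumerate(row):
                stats = summary.setdefault(col, {"numeric": 0, "text": 0, "total": 0.0})
                if is_number(value):
                    stats["numeric"] += 1
                    stats["total"] += value
                elif value is not None:
                    stats["text"] += 1
    return summary


def verify_column(grid: Grid, summary: Dict[int, Dict[str, Any]], col: int) -> bool:
    """Targeted inspection: recount one column directly and compare with the summary."""
    numeric = sum(1 for row in grid if col < len(row) and is_number(row[col]))
    return numeric == summary.get(col, {}).get("numeric", 0)


def solve_column_total(grid: Grid, col: int) -> float:
    """Second stage (hypothetical): answer a task from the verified summary only."""
    summary = summarize_columns(grid)
    if not verify_column(grid, summary, col):
        raise ValueError(f"summary for column {col} failed verification")
    return summary[col]["total"]


sheet = [
    ["Region", "Sales"],
    ["North", 120],
    ["South", 80],
    ["East", 50],
]
print(solve_column_total(sheet, 1))
```

The point of the split is that the solving step never touches raw cells: it consumes only the intermediate summary, and the verification step is what guards against an incorrect summary propagating downstream.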
