Fugu-MT Paper Translation (Abstract): Code as Agent Harness

Paper abstract: Code as Agent Harness

arxiv url: http://arxiv.org/abs/2605.18747v1
Date: Mon, 18 May 2026 17:59:03 GMT
Status: Translation complete
Last updated in system: 2026-05-19 17:57:50.229821
Title: Code as Agent Harness
Title (Japanese reference translation): エージェントハーネスとしてのコード
Authors: Xuying Ning, Katherine Tieu, Dongqi Fu, Tianxin Wei, Zihao Li, Yuanchen Bei, Jiaru Zou, Mengting Ai, Zhining Liu, Ting-Wei Li, Lingjie Chen, Yanjun Zhao, Ke Yang, Bingxuan Li, Cheng Qian, Gaotang Li, Xiao Lin, Zhichen Zeng, Ruizhong Qiu, Sirui Chen, Yifan Sun, Xiyuan Yang, Ruida Wang, Rui Pan, Chenyuan Yang, Dylan Zhang, Liri Fang, Zikun Cui, Yang Cao, Pan Chen, Dorothy Sun, Ren Chen, Mahesh Srinivasan, Nipun Mathur, Yinglong Xia, Hong Li, Hong Yan, Pan Lu, Lingming Zhang, Tong Zhang, Hanghang Tong, Jingrui He
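Abstract summary: In emerging agentic systems, code is no longer only a target output. It increasingly serves as an operational substrate for agent reasoning, acting, environment modeling, and execution-based verification. This survey provides a unified roadmap toward executable, verifiable, and stateful AI agent systems.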
Reference score (independently computed attention measure): 107.31925305395957
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent large language models (LLMs) have demonstrated strong capabilities in understanding and generating code, from competitive programming to repository-level software engineering. In emerging agentic systems, code is no longer only a target output. It increasingly serves as an operational substrate for agent reasoning, acting, environment modeling, and execution-based verification. We frame this shift through the lens of agent harnesses and introduce code as agent harness: a unified view that centers code as the basis for agent infrastructure. To systematically study this perspective, we organize the survey around three connected layers. First, we study the harness interface, where code connects agents to reasoning, action, and environment modeling. Second, we examine harness mechanisms: planning, memory, and tool use for long-horizon execution, together with feedback-driven control and optimization that make harness reliable and adaptive. Third, we discuss scaling the harness from single-agent systems to multi-agent settings, where shared code artifacts support multi-agent coordination, review, and verification. Across these layers, we summarize representative methods and practical applications of code as agent harness, spanning coding assistants, GUI/OS automation, embodied agents, scientific discovery, personalization and recommendation, DevOps, and enterprise workflows. We further outline open challenges for harness engineering, including evaluation beyond final task success, verification under incomplete feedback, regression-free harness improvement, consistent shared state across multiple agents, human oversight for safety-critical actions, and extensions to multimodal environments. By centering code as the harness of agentic AI, this survey provides a unified roadmap toward executable, verifiable, and stateful AI agent systems.

