Fugu-MT Paper Translation (Abstract): Lita: Light Agent Uncovers the Agentic Coding Capabilities of LLMs

Paper overview: Lita: Light Agent Uncovers the Agentic Coding Capabilities of LLMs

arxiv url: http://arxiv.org/abs/2509.25873v1
Date: Tue, 30 Sep 2025 07:07:32 GMT
Status: Translation complete
Last updated in system: 2025-10-01 17:09:04.462966
Title: Lita: Light Agent Uncovers the Agentic Coding Capabilities of LLMs
Title (reference translation): A Light Agent Uncovers the Agentic Coding Capabilities of LLMs
Authors: Hankun Dai, Maoquan Wang, Mengnan Qi, Yikai Zhang, Zijian Jin, Yongqiang Yao, Yufan Huang, Shengyu Fu, Elsie Nallipogu
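Abstract summary: We introduce Lita, which operationalizes liteness, a principle of minimizing manual design while retaining the essential elements of a fully autonomous agent. In experiments on Aider Polyglot and SWE-Bench with frontier models, Lita achieves competitive or superior performance compared with workflow-based and agentic baselines.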
Reference score (independently computed attention metric): 8.104616255794323
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) are increasingly being applied to programming tasks, ranging from single-turn code completion to autonomous agents. Current code agent designs frequently depend on complex, hand-crafted workflows and tool sets. However, this reliance on elaborate scaffolding presents several challenges: agent performance becomes overly dependent on prompt tuning and custom design choices, heavy human intervention obscures a model's true underlying capabilities, and intricate pipelines are costly to build and maintain. Furthermore, optimizing complex task prompts increases the risk of data leakage. Currently, when introducing new models, LLM providers like OpenAI and Anthropic often publish benchmark scores to demonstrate their models' coding proficiency, but keep their proprietary evaluation frameworks confidential. To address these limitations, we introduce Lita (Lite Agent), which operationalizes liteness, a principle of minimizing manual design while retaining the essential elements of a fully autonomous agent. Lita enables a more faithful and unified evaluation without elaborate scaffolding. Experiments on the Aider Polyglot and SWE-Bench with frontier models demonstrate that Lita achieves competitive or superior performance compared to workflow-based and agentic baselines. Crucially, Lita also consumes fewer tokens and requires significantly less design effort. Our results suggest that Lita is sufficient to reveal the underlying coding competence of modern LLMs. Finally, we propose the Agent Complexity Law: the performance gap between agents of varying complexity, from simple to sophisticated designs, will shrink as the core model improves, ultimately converging to a negligible difference.
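Abstract (reference translation): Large language models (LLMs) are increasingly applied to programming tasks, from single-turn code completion to autonomous agents. Current code agent designs often depend on complex, hand-crafted workflows and tool sets. Agent performance becomes overly dependent on prompt tuning and custom design choices, heavy human intervention obscures a model's true underlying capabilities, and complex pipelines are costly to build and maintain. In addition, optimizing complex task prompts raises the risk of data leakage. Currently, when introducing new models, LLM providers such as OpenAI and Anthropic often publish benchmark scores that demonstrate their models' coding proficiency but keep their proprietary evaluation frameworks confidential. To address these limitations, we introduce Lita (Lite Agent), which minimizes manual design while retaining the essential elements of a fully autonomous agent. Lita enables a more faithful and unified evaluation without elaborate scaffolding. In experiments on Aider Polyglot and SWE-Bench with frontier models, Lita achieves competitive or superior performance compared with workflow-based and agentic baselines. Importantly, Lita consumes fewer tokens and requires significantly less design effort. These results suggest that Lita is sufficient to reveal the coding competence of modern LLMs. Finally, we propose the Agent Complexity Law: the performance gap between agents of varying complexity, from simple to sophisticated designs, shrinks as the core model improves and ultimately converges to a negligible difference.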

Related papers