Fugu-MT paper translation (abstract): From User Interface to Agent Interface: Efficiency Optimization of UI Representations for LLM Agents

Paper overview: From User Interface to Agent Interface: Efficiency Optimization of UI Representations for LLM Agents

arxiv url: http://arxiv.org/abs/2512.13438v1
Date: Mon, 15 Dec 2025 15:34:06 GMT
Status: translation complete
Last updated in system: 2025-12-16 17:54:56.725549
Title: From User Interface to Agent Interface: Efficiency Optimization of UI Representations for LLM Agents
Title (reference translation): From User Interface to Agent Interface: Efficiency Optimization of UI Representations for LLM Agents
Authors: Dezhi Ran, Zhi Gong, Yuzhe Guo, Mengzhou Wu, Yuan Cao, Haochuan Lu, Hengyu Zhang, Xia Zeng, Gang Cao, Liangchao Yao, Yuetang Deng, Wei Yang, Tao Xie
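Abstract summary: Large language model (LLM) agents show great potential for automated UI navigation, such as automated UI testing and AI assistants, but their efficiency has been largely overlooked. The paper introduces UIFormer, the first automated optimization framework that synthesizes UI transformation programs through constraint-based optimization.
Reference score (independently computed attention score): 21.811753076804944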
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While Large Language Model (LLM) agents show great potential for automated UI navigation such as automated UI testing and AI assistants, their efficiency has been largely overlooked. Our motivating study reveals that inefficient UI representation creates a critical performance bottleneck. However, UI representation optimization, formulated as the task of automatically generating programs that transform UI representations, faces two unique challenges. First, the lack of Boolean oracles, which traditional program synthesis uses to decisively validate semantic correctness, poses a fundamental challenge to co-optimization of token efficiency and completeness. Second, the need to process large, complex UI trees as input while generating long, compositional transformation programs, making the search space vast and error-prone. Toward addressing the preceding limitations, we present UIFormer, the first automated optimization framework that synthesizes UI transformation programs by conducting constraint-based optimization with structured decomposition of the complex synthesis task. First, UIFormer restricts the program space using a domain-specific language (DSL) that captures UI-specific operations. Second, UIFormer conducts LLM-based iterative refinement with correctness and efficiency rewards, providing guidance for achieving the efficiency-completeness co-optimization. UIFormer operates as a lightweight plugin that applies transformation programs for seamless integration with existing LLM agents, requiring minimal modifications to their core logic. Evaluations across three UI navigation benchmarks spanning Android and Web platforms with five LLMs demonstrate that UIFormer achieves 48.7% to 55.8% token reduction with minimal runtime overhead while maintaining or improving agent performance. Real-world industry deployment at WeChat further validates the practical impact of UIFormer.
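Abstract (reference translation): Large language model (LLM) agents show great potential for automated UI navigation, such as automated UI testing and AI assistants, but their efficiency has been largely overlooked. Our motivating study shows that inefficient UI representations create a critical performance bottleneck. However, UI representation optimization, formulated as the task of automatically generating programs that transform UI representations, faces two unique challenges. First, the lack of Boolean oracles, which traditional program synthesis uses to decisively validate semantic correctness, poses a fundamental challenge to co-optimizing token efficiency and completeness. Second, large, complex UI trees must be processed as input while long, compositional transformation programs are generated, which makes the search space vast and error-prone. To address these limitations, we propose UIFormer, the first automated optimization framework that synthesizes UI transformation programs by performing constraint-based optimization with structured decomposition of the complex synthesis task. First, UIFormer restricts the program space using a domain-specific language (DSL) that captures UI-specific operations. Second, UIFormer performs LLM-based iterative refinement with correctness and efficiency rewards, providing guidance toward efficiency-completeness co-optimization. UIFormer operates as a lightweight plugin that applies transformation programs for seamless integration with existing LLM agents, requiring only minimal modifications to their core logic. Evaluations on three UI navigation benchmarks spanning Android and Web platforms with five LLMs show that UIFormer achieves 48.7% to 55.8% token reduction with minimal runtime overhead while maintaining or improving agent performance. Real-world industry deployment at WeChat further validates the practical impact of UIFormer.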
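
To make the idea of a UI transformation program concrete, here is a minimal Python sketch. It is not UIFormer's DSL or algorithm, neither of which this page describes in detail. The tree layout, the field names (`class`, `text`, `clickable`, `visible`), and the `prune` rule are all hypothetical, and serialized character count stands in as a rough proxy for token count.

```python
import json

# Hypothetical toy UI tree; the field names are illustrative assumptions,
# not taken from the paper or from any real UI framework dump.
UI_TREE = {
    "class": "FrameLayout", "text": "", "clickable": False, "visible": True,
    "children": [
        {"class": "LinearLayout", "text": "", "clickable": False, "visible": True,
         "children": [
             {"class": "Button", "text": "Send", "clickable": True,
              "visible": True, "children": []},
         ]},
        {"class": "TextView", "text": "", "clickable": False,
         "visible": False, "children": []},
        {"class": "TextView", "text": "Chats", "clickable": False,
         "visible": True, "children": []},
    ],
}


def prune(node):
    """Return the list of compact nodes that replaces `node` (may be empty).

    Assumed rule set: drop invisible subtrees, hoist the children of
    wrappers that carry neither text nor a click action, and omit
    default-valued fields from the nodes that remain.
    """
    if not node["visible"]:
        return []
    children = [kept for child in node["children"] for kept in prune(child)]
    if not node["text"] and not node["clickable"]:
        return children  # uninformative wrapper: keep only its children
    compact = {"class": node["class"]}
    if node["text"]:
        compact["text"] = node["text"]
    if node["clickable"]:
        compact["clickable"] = True
    if children:
        compact["children"] = children
    return [compact]


def size(obj):
    """Serialized length in characters, a crude stand-in for token count."""
    return len(json.dumps(obj, separators=(",", ":")))


compact = prune(UI_TREE)
reduction = 1 - size(compact) / size(UI_TREE)
print(compact)
print(f"size reduction: {reduction:.1%}")
```

This sketch covers only the efficiency side. It has no check that the compact tree still holds everything an agent needs, which is the completeness side of the trade-off the abstract describes.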

Related papers