Fugu-MT Paper Translation (Abstract): Log2Plan: An Adaptive GUI Automation Framework Integrated with Task Mining Approach

Paper overview: Log2Plan: An Adaptive GUI Automation Framework Integrated with Task Mining Approach

arxiv url: http://arxiv.org/abs/2509.22137v1
Date: Fri, 26 Sep 2025 09:56:44 GMT
Status: Translation complete
Last updated in system: 2025-09-29 20:57:54.35221
Title: Log2Plan: An Adaptive GUI Automation Framework Integrated with Task Mining Approach
Title (reference translation): Log2Plan: An Adaptive GUI Automation Framework Integrating a Task Mining Approach
Authors: Seoyoung Lee, Seonbin Yoon, Seongbeen Lee, Hyesoo Kim, Joo Yong Sim
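Abstract summary: Existing VLM-based planner-executor agents suffer from brittle generalization, high latency, and limited long-horizon coherence. Log2Plan addresses these limitations by combining a structured two-level planning framework with a task mining approach over user behavior logs. The authors evaluated Log2Plan on 200 real-world tasks and report significant improvements in task success rate and execution time.
Reference score (independently computed attention measure): 1.7970227672578558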
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: GUI task automation streamlines repetitive tasks, but existing LLM or VLM-based planner-executor agents suffer from brittle generalization, high latency, and limited long-horizon coherence. Their reliance on single-shot reasoning or static plans makes them fragile under UI changes or complex tasks. Log2Plan addresses these limitations by combining a structured two-level planning framework with a task mining approach over user behavior logs, enabling robust and adaptable GUI automation. Log2Plan constructs high-level plans by mapping user commands to a structured task dictionary, enabling consistent and generalizable automation. To support personalization and reuse, it employs a task mining approach from user behavior logs that identifies user-specific patterns. These high-level plans are then grounded into low-level action sequences by interpreting real-time GUI context, ensuring robust execution across varying interfaces. We evaluated Log2Plan on 200 real-world tasks, demonstrating significant improvements in task success rate and execution time. Notably, it maintains over 60.0% success rate even on long-horizon task sequences, highlighting its robustness in complex, multi-step workflows.
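Abstract (reference translation): GUI task automation streamlines repetitive tasks, but existing LLM- or VLM-based planner-executor agents suffer from brittle generalization, high latency, and limited long-horizon coherence. Because they rely on single-shot reasoning or static plans, they are fragile under UI changes and complex tasks. Log2Plan addresses these limitations by combining a structured two-level planning framework with a task mining approach over user behavior logs, achieving robust and adaptable GUI automation. Log2Plan builds high-level plans by mapping user commands to a structured task dictionary, which enables consistent and generalizable automation. To support personalization and reuse, it applies task mining to user behavior logs to identify user-specific patterns. These high-level plans are then grounded into low-level action sequences by interpreting the real-time GUI context, which ensures robust execution across varying interfaces. The authors evaluated Log2Plan on 200 real-world tasks and report significant improvements in task success rate and execution time. Notably, it maintains a success rate above 60.0% even on long-horizon task sequences, highlighting its robustness in complex, multi-step workflows.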
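
The two-level flow described in the abstract (mine recurring patterns from behavior logs, map a command to a task dictionary entry, then ground each step against the current GUI state) can be illustrated with a toy sketch. This is not the authors' implementation: the dictionary contents, the keyword matcher, the n-gram miner, and every function name below are hypothetical stand-ins chosen only to make the idea concrete.

```python
from collections import Counter

# Hypothetical task dictionary: task name -> ordered high-level steps.
TASK_DICTIONARY = {
    "export_report": ["open_app", "open_file", "export_pdf"],
    "send_email": ["open_mail", "compose", "send"],
}


def mine_frequent_sequences(log, length=3, min_count=2):
    """Count contiguous event n-grams in a behavior log and keep frequent ones.

    A simple stand-in for task mining; real miners are far more elaborate.
    """
    windows = (tuple(log[i:i + length]) for i in range(len(log) - length + 1))
    counts = Counter(windows)
    return {seq: n for seq, n in counts.items() if n >= min_count}


def plan_high_level(command, dictionary):
    """Map a user command to a dictionary entry by keyword overlap (toy matcher)."""
    words = set(command.lower().split())

    def overlap(name):
        return len(words & set(name.split("_")))

    best = max(dictionary, key=overlap)
    if overlap(best) == 0:
        raise KeyError(f"no task matches command: {command!r}")
    return list(dictionary[best])


def ground(step, gui_context):
    """Resolve one high-level step to a low-level action using the GUI state.

    gui_context maps step names to on-screen coordinates (an assumption);
    a step whose target is not visible yet yields a wait action.
    """
    target = gui_context.get(step)
    if target is None:
        return ("wait", step)
    return ("click", target)


if __name__ == "__main__":
    log = ["open_app", "open_file", "export_pdf", "open_mail",
           "open_app", "open_file", "export_pdf"]
    print(mine_frequent_sequences(log))
    plan = plan_high_level("export the report", TASK_DICTIONARY)
    screen = {"open_app": (10, 20), "open_file": (30, 40)}
    print([ground(step, screen) for step in plan])
```

Keeping the plan as symbolic step names and resolving coordinates only at execution time is what lets the same high-level plan survive a UI layout change in this sketch; only `gui_context` needs to differ between runs.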


Related papers