Fugu-MT Paper Translation (Abstract): GUI-AC: Enhancing Continual Learning in GUI Agents

Paper overview: GUI-AC: Enhancing Continual Learning in GUI Agents

arxiv url: http://arxiv.org/abs/2606.10522v1
Date: Tue, 09 Jun 2026 07:52:10 GMT
Status: Translation complete
Last updated in system: 2026-06-10 15:40:58.374204
Title: GUI-AC: Enhancing Continual Learning in GUI Agents
Title (reference translation): GUI-AC: Promoting Continual Learning in GUI Agents
Authors: Can Lin, Tao Feng, Hangjie Yuan, Dan Zhang, Yifan Zhu, Zhonghong Ou,
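Abstract summary: Reinforcement fine-tuning (RFT) exhibits pronounced instability in its grounding capability. The paper proposes GUI-AC, a method that improves the continual learning capability of GUI agents.
Reference score (independently computed attention score): 20.919614781710468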
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Graphical User Interfaces (GUIs) serve as the dominant medium for human-computer interaction, yet building GUI agents that generalize across the vast diversity of real-world interface environments, with the same flexibility and robustness that humans naturally exhibit, remains unsolved. Notably, GUI data are inherently non-stationary: the continual emergence of previously unseen interface instances (e.g., novel domains and resolutions) induces persistent distribution shifts, significantly impeding the continual learning of existing GUI agents. Reinforcement fine-tuning (RFT) has attracted considerable attention as a promising approach. Nevertheless, RFT exhibits pronounced instability in its grounding capability, manifested as sharp reward discontinuities and high-variance oscillations. The imbalanced distribution of rollout outcomes introduces substantial noise into advantage estimation, leading to policy overconfidence. The fixed clipping bound suppresses the increase in policy probabilities needed to adapt to new distributions, leading to a collapse in exploration capacity. To address these challenges, we propose GUI-AC, a method that enhances the continual learning capability of GUI agents. GUI-AC introduces grounding certainty to support two core mechanisms: (i) Adaptive Advantage, which down-weights noisy advantage estimates to prevent policy overconfidence; and (ii) Dynamic Clipping, which relaxes the clipping bound to broaden the exploration range. Extensive experiments show that these mechanisms jointly improve performance, enabling our method to surpass state-of-the-art baselines. Code is available anonymously at https://anonymous.4open.science/r/GUI-AC.
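Abstract (reference translation): Graphical user interfaces (GUIs) are the dominant medium for human-computer interaction, yet building GUI agents that generalize across the wide diversity of real-world interface environments, with the flexibility and robustness that humans naturally exhibit, remains unsolved. In particular, GUI data are inherently non-stationary: the continual emergence of previously unseen interface instances (e.g., new domains and resolutions) induces persistent distribution shifts and significantly impedes the continual learning of existing GUI agents. Reinforcement fine-tuning (RFT) has attracted attention as a promising approach. Nevertheless, RFT shows pronounced instability in its grounding capability, which appears as sharp reward discontinuities and high-variance oscillations. The imbalanced distribution of rollout outcomes introduces considerable noise into advantage estimation, leading to policy overconfidence. The fixed clipping bound suppresses the increase in policy probabilities needed to adapt to new distributions, leading to a collapse of exploration capacity. To address these challenges, the paper proposes GUI-AC, which enhances the continual learning capability of GUI agents. GUI-AC introduces grounding certainty to support two core mechanisms: (i) Adaptive Advantage, which down-weights noisy advantage estimates to prevent policy overconfidence; and (ii) Dynamic Clipping, which relaxes the clipping bound to encourage exploration. Experiments show that these mechanisms jointly improve performance and that the method surpasses state-of-the-art baselines. The code is available at https://anonymous.4open.science/r/GUI-AC.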
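
Illustrative sketch (not from the paper): the abstract names the two mechanisms but gives no formulas, so the Python below only shows where a certainty signal could enter a PPO/GRPO-style clipped objective. Everything specific is an assumption made for illustration: the group-normalized advantage baseline, a scalar `certainty` in [0, 1], the linear scaling in `adaptive_advantages`, the rule in `dynamic_clip_bounds` that widens only the upper bound as certainty drops, and all function names and default values. The authors' actual definitions may differ.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-6):
    """Group-normalized advantages (assumed GRPO-style baseline)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def adaptive_advantages(advantages, certainty):
    """Hypothetical down-weighting: scale advantages by certainty in [0, 1]."""
    if not 0.0 <= certainty <= 1.0:
        raise ValueError("certainty must lie in [0, 1]")
    return [certainty * a for a in advantages]


def dynamic_clip_bounds(certainty, eps_low=0.2, eps_base=0.2, eps_extra=0.1):
    """Hypothetical relaxation: widen only the upper bound when certainty is low."""
    lower = 1.0 - eps_low
    upper = 1.0 + eps_base + eps_extra * (1.0 - certainty)
    return lower, upper


def clipped_surrogate(ratio, advantage, lower, upper):
    """PPO-style clipped objective for one sample, with asymmetric bounds."""
    clipped_ratio = min(max(ratio, lower), upper)
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # One rollout group with an imbalanced outcome: a single success.
    rewards = [1.0, 0.0, 0.0, 0.0]
    certainty = 0.5  # made-up value for the demonstration
    advs = adaptive_advantages(group_advantages(rewards), certainty)
    lower, upper = dynamic_clip_bounds(certainty)
    for adv in advs:
        print(clipped_surrogate(1.5, adv, lower, upper))
```

With a fixed upper bound of 1.2, a probability ratio of 1.5 on a positive-advantage sample is cut off at 1.2. Under the assumed rule the cutoff rises toward 1.3 as certainty falls, which is one possible reading of "relaxes the clipping bound".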

Related papers