Related papers: SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

URL: http://arxiv.org/abs/2508.04700v2
Date: Tue, 12 Aug 2025 15:11:53 GMT
Title: SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience
Authors: Zeyi Sun, Ziyu Liu, Yuhang Zang, Yuhang Cao, Xiaoyi Dong, Tong Wu, Dahua Lin, Jiaqi Wang
Abstract summary: We propose SEAgent, an agentic self-evolving framework enabling computer-use agents to evolve through interactions with unfamiliar software. We validate the effectiveness of SEAgent across five novel software environments within OS-World. Our approach achieves a significant improvement of 23.2% in success rate, from 11.3% to 34.5%, over a competitive open-source CUA.
Score: 71.82719117238307
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Repurposing large vision-language models (LVLMs) as computer use agents (CUAs) has led to substantial breakthroughs, primarily driven by human-labeled data. However, these models often struggle with novel and specialized software, particularly in scenarios lacking human annotations. To address this challenge, we propose SEAgent, an agentic self-evolving framework enabling CUAs to autonomously evolve through interactions with unfamiliar software. Specifically, SEAgent empowers computer-use agents to autonomously master novel software environments via experiential learning, where agents explore new software, learn through iterative trial-and-error, and progressively tackle auto-generated tasks organized from simple to complex. To achieve this goal, we design a World State Model for step-wise trajectory assessment, along with a Curriculum Generator that generates increasingly diverse and challenging tasks. The agent's policy is updated through experiential learning, comprised of adversarial imitation of failure actions and Group Relative Policy Optimization (GRPO) on successful ones. Furthermore, we introduce a specialist-to-generalist training strategy that integrates individual experiential insights from specialist agents, facilitating the development of a stronger generalist CUA capable of continuous autonomous evolution. This unified agent ultimately achieves performance surpassing ensembles of individual specialist agents on their specialized software. We validate the effectiveness of SEAgent across five novel software environments within OS-World. Our approach achieves a significant improvement of 23.2% in success rate, from 11.3% to 34.5%, over a competitive open-source CUA, i.e., UI-TARS.
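
The abstract names Group Relative Policy Optimization (GRPO) as the update applied to successful actions. As a rough illustration only, the sketch below shows the group-relative advantage normalization commonly associated with GRPO: each rollout's reward is standardized against the other rollouts sampled for the same task. The function name, the `eps` stabilizer, the use of the population standard deviation, and the example rewards are assumptions made here, not details taken from the SEAgent paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize each reward against its own sampling group.

    Hypothetical helper: `rewards` holds one scalar reward per rollout
    of the same task; `eps` avoids division by zero when all rewards
    in the group are identical.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four rollouts of one task (1.0 = success, 0.0 = failure).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Rollouts that beat their group's average receive positive advantages and the rest receive negative ones, so no separate learned value function is needed to form a baseline.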

Related papers

EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience [44.734653745434834]
We introduce EvoCUA, a native computer use agentic model. Unlike static imitation, EvoCUA integrates data generation and policy optimization into a self-sustaining evolutionary cycle. EvoCUA significantly outperforms the previous best open-source model, OpenCUA-72B.
arXiv Detail & Related papers (2026-01-22T11:36:43Z)
Robust Agents in Open-Ended Worlds [4.199586801784625]
In this thesis, we harness methodologies from open-endedness and multi-agent learning to train and evaluate robust AI agents. We begin by introducing MiniHack, a sandbox framework for creating diverse environments through procedural content generation. We then present Maestro, a novel approach for generating adversarial curricula that progressively enhance the robustness and generality of RL agents in two-player zero-sum games.
arXiv Detail & Related papers (2025-12-09T00:30:33Z)
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning [84.70211451226835]
Large Language Model (LLM) Agents are constrained by a dependency on human-curated data. We introduce Agent0, a fully autonomous framework that evolves high-performing agents without external data. Agent0 substantially boosts reasoning capabilities, improving the Qwen3-8B-Base model by 18% on mathematical reasoning and 24% on general reasoning benchmarks.
arXiv Detail & Related papers (2025-11-20T05:01:57Z)
Agentsway -- Software Development Methodology for AI Agents-based Teams [4.226647687395254]
"Agentsway" is a novel software development framework designed for ecosystems where AI agents operate as first-class collaborators. The framework defines distinct roles for planning, prompting, coding, testing, and fine-tuning agents. Agentsway represents a foundational step toward the next generation of AI-native, self-improving software development methodologies.
arXiv Detail & Related papers (2025-10-26T11:58:42Z)
From Agentification to Self-Evolving Agentic AI for Wireless Networks: Concepts, Approaches, and Future Research Directions [70.72279728350763]
Self-evolving agentic artificial intelligence (AI) offers a new paradigm for future wireless systems. Unlike static AI models, self-evolving agents embed an autonomous evolution cycle that updates models and tools in response to environmental dynamics. This paper presents a comprehensive overview of self-evolving agentic AI, highlighting its layered architecture, life cycle, and key techniques.
arXiv Detail & Related papers (2025-10-07T05:45:25Z)
LIMI: Less is More for Agency [49.63355240818081]
LIMI (Less Is More for Intelligent Agency) demonstrates that agency follows radically different development principles. We show that sophisticated agentic intelligence can emerge from minimal but strategically curated demonstrations of autonomous behavior. Our findings establish the Agency Efficiency Principle: machine autonomy emerges not from data abundance but from strategic curation of high-quality agentic demonstrations.
arXiv Detail & Related papers (2025-09-22T10:59:32Z)
Scaling Agents via Continual Pre-training [80.97989245493326]
We propose incorporating Agentic Continual Pre-training (Agentic CPT) into the deep research agents training pipeline to build powerful agentic foundational models. We evaluate our AgentFounder-30B on 10 benchmarks and achieve state-of-the-art performance while retaining strong tool-use ability.
arXiv Detail & Related papers (2025-09-16T17:57:19Z)
SEA: Self-Evolution Agent with Step-wise Reward for Computer Use [6.056153018209402]
We propose the Self-Evolution Agent (SEA) for computer use; to develop this agent, we propose creative methods in data generation, reinforcement learning, and model enhancement. Building on these innovations in data generation, training strategy, and enhancement, we obtain the Self-Evolution Agent (SEA) for computer use with only 7B parameters.
arXiv Detail & Related papers (2025-08-06T02:57:22Z)
UI-Evol: Automatic Knowledge Evolving for Computer Use Agents [19.978272700123004]
We propose UI-Evol, a plug-and-play module for autonomous GUI knowledge evolution. UI-Evol consists of two stages: a Retrace Stage that extracts faithful objective action sequences from actual agent-environment interactions, and a Critique Stage that refines existing knowledge by comparing these sequences against external references. Our results demonstrate that UI-Evol not only significantly boosts task performance but also addresses a previously overlooked issue of high behavioral standard deviation in computer use agents.
arXiv Detail & Related papers (2025-05-28T04:32:05Z)
From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process. We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
Agent S: An Open Agentic Framework that Uses Computers Like a Human [31.16046798529319]
We present Agent S, an open agentic framework that enables autonomous interaction with computers through a Graphical User Interface (GUI). Agent S aims to address three key challenges in automating computer tasks: acquiring domain-specific knowledge, planning over long task horizons, and handling dynamic, non-uniform interfaces.
arXiv Detail & Related papers (2024-10-10T17:43:51Z)
Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents [44.34340798542]
Large Language Models (LLMs) have shown remarkable capabilities in natural language tasks requiring complex reasoning. Traditional supervised pre-training on static datasets falls short in enabling autonomous agent capabilities. We propose a framework that combines guided Monte Carlo Tree Search (MCTS) with a self-critique mechanism and iterative fine-tuning on agent interactions.
arXiv Detail & Related papers (2024-08-13T20:52:13Z)
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering [79.07755560048388]
SWE-agent is a system that enables LM agents to autonomously use computers to solve software engineering tasks. SWE-agent's custom agent-computer interface (ACI) significantly enhances an agent's ability to create and edit code files, navigate entire repositories, and execute tests and other programs. We evaluate SWE-agent on SWE-bench and HumanEvalFix, achieving state-of-the-art performance on both with pass@1 rates of 12.5% and 87.7%, respectively.
arXiv Detail & Related papers (2024-05-06T17:41:33Z)
Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution [92.84441068115517]
Investigate-Consolidate-Exploit (ICE) is a novel strategy for enhancing the adaptability and flexibility of AI agents. ICE promotes the transfer of knowledge between tasks for genuine self-evolution. Our experiments on the XAgent framework demonstrate ICE's effectiveness, reducing API calls by as much as 80%.
arXiv Detail & Related papers (2024-01-25T07:47:49Z)
Experiential Co-Learning of Software-Developing Agents [83.34027623428096]
Large language models (LLMs) have brought significant changes to various domains, especially in software development. We introduce Experiential Co-Learning, a novel LLM-agent learning framework. Experiments demonstrate that the framework enables agents to tackle unseen software-developing tasks more effectively.
arXiv Detail & Related papers (2023-12-28T13:50:42Z)
ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents. ProAgent can analyze the present state, and infer the intentions of teammates from observations. ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.