Related papers: SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills

URL: http://arxiv.org/abs/2504.07079v1
Date: Wed, 09 Apr 2025 17:51:50 GMT
Title: SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills
Authors: Boyuan Zheng, Michael Y. Fatemi, Xiaolong Jin, Zora Zhiruo Wang, Apurva Gandhi, Yueqi Song, Yu Gu, Jayanth Srinivasa, Gaowen Liu, Graham Neubig, Yu Su
Abstract summary: We introduce SkillWeaver, a skill-centric framework enabling web agents to self-improve by autonomously synthesizing reusable skills as APIs. Given a new website, the agent autonomously discovers skills, executes them for practice, and distills practice experiences into robust APIs. Experiments on WebArena and real-world websites demonstrate the efficacy of SkillWeaver, achieving relative success rate improvements of 31.8% and 39.8%, respectively.
Score: 48.05057798832005
License: http://creativecommons.org/licenses/by/4.0/
Abstract: To survive and thrive in complex environments, humans have evolved sophisticated self-improvement mechanisms through environment exploration, hierarchical abstraction of experiences into reusable skills, and collaborative construction of an ever-growing skill repertoire. Despite recent advancements, autonomous web agents still lack crucial self-improvement capabilities, struggling with procedural knowledge abstraction, skill refinement, and skill composition. In this work, we introduce SkillWeaver, a skill-centric framework enabling agents to self-improve by autonomously synthesizing reusable skills as APIs. Given a new website, the agent autonomously discovers skills, executes them for practice, and distills practice experiences into robust APIs. Iterative exploration continually expands a library of lightweight, plug-and-play APIs, significantly enhancing the agent's capabilities. Experiments on WebArena and real-world websites demonstrate the efficacy of SkillWeaver, achieving relative success rate improvements of 31.8% and 39.8%, respectively. Additionally, APIs synthesized by strong agents substantially enhance weaker agents through transferable skills, yielding improvements of up to 54.3% on WebArena. These results demonstrate the effectiveness of honing diverse website interactions into APIs, which can be seamlessly shared among various web agents.
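The "relative success rate improvement" figures in the abstract are gains measured against a baseline success rate, not percentage-point differences. A minimal sketch of that arithmetic follows; the function name and the example success rates are hypothetical illustrations and are not taken from the paper.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the gain of `improved` over `baseline`, as a percentage of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline success rate must be positive")
    return (improved - baseline) / baseline * 100.0


# Hypothetical success rates, for illustration only (not results from the paper).
baseline_rate = 0.20
improved_rate = 0.25
gain = relative_improvement(baseline_rate, improved_rate)
print(f"{gain:.1f}% relative improvement")
```

In this made-up case, a 5-point absolute gain on a 20% baseline is a 25% relative improvement, which is why relative figures can look much larger than the underlying change in success rate.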

Related papers

Contextual Experience Replay for Self-Improvement of Language Agents [47.51006612841945]
We propose Contextual Experience Replay (CER) to enable efficient self-improvement for language agents. CER accumulates and synthesizes past experiences into a dynamic memory buffer. We evaluate CER on the challenging WebArena and VisualWebArena benchmarks.
arXiv Detail & Related papers (2025-06-07T07:47:35Z)
Rethinking Agent Design: From Top-Down Workflows to Bottom-Up Skill Evolution [34.66260172204154]
We introduce a bottom-up agent paradigm that mirrors the human learning process. Agents acquire competence through a trial-and-reasoning mechanism: exploring, reflecting on outcomes, and abstracting skills over time. We evaluate this paradigm in Slay the Spire and Civilization V, where agents perceive through raw visual inputs and act via mouse outputs, the same as human players.
arXiv Detail & Related papers (2025-05-23T09:38:55Z)
WebEvolver: Enhancing Web Agent Self-Improvement with Coevolving World Model [55.276852838877346]
Self-evolving agents are trained on trajectories sampled autonomously based on their own policies. We propose a novel framework that introduces a co-evolving World Model LLM. This world model predicts the next observation based on the current observation and action within the web environment.
arXiv Detail & Related papers (2025-04-23T02:54:31Z)
Inducing Programmatic Skills for Agentic Tasks [53.964176411616]
We propose agent skill induction (ASI) to allow agents to adapt themselves by inducing, verifying, and utilizing program-based skills on the fly. We show that ASI outperforms the static baseline agent and its text-skill counterpart by 23.5% and 11.3% in success rate.
arXiv Detail & Related papers (2025-04-09T12:25:37Z)
An Illusion of Progress? Assessing the Current State of Web Agents [49.76769323750729]
We conduct a comprehensive and rigorous assessment of the current state of web agents. Results depict a very different picture of the competency of current agents, suggesting over-optimism in previously reported results. We introduce Online-Mind2Web, an online evaluation benchmark consisting of 300 diverse and realistic tasks spanning 136 websites.
arXiv Detail & Related papers (2025-04-02T05:51:29Z)
Skill Expansion and Composition in Parameter Space [17.016614374151747]
Parametric Skill Expansion and Composition (PSEC) is a new framework designed to iteratively evolve the agents' capabilities. PSEC exhibits superior capacity to leverage prior knowledge to efficiently tackle new challenges.
arXiv Detail & Related papers (2025-02-09T15:22:38Z)
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization [66.22117723598872]
We introduce an open-source framework designed to facilitate the development of multimodal web agents. We first train the base model with imitation learning to gain the basic abilities. We then let the agent explore the open web and collect feedback on its trajectories.
arXiv Detail & Related papers (2024-10-25T15:01:27Z)
Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning [39.991887534269445]
Disentangled Unsupervised Skill Discovery (DUSDi) is a method for learning disentangled skills that can be efficiently reused to solve downstream tasks. DUSDi decomposes skills into disentangled components, where each skill component only affects one factor of the state space. DUSDi successfully learns disentangled skills, and significantly outperforms previous skill discovery methods when it comes to applying the learned skills to solve downstream tasks.
arXiv Detail & Related papers (2024-10-15T04:13:20Z)
Never-Ending Behavior-Cloning Agent for Robotic Manipulation [38.756955029068294]
NBAgent is a language-conditioned Never-ending Behavior-cloning Agent. It learns observation knowledge of novel 3D scene semantics and robot manipulation skills from skill-shared and skill-specific attributes.
arXiv Detail & Related papers (2024-03-01T07:51:29Z)
WebArena: A Realistic Web Environment for Building Autonomous Agents [92.3291458543633]
We build an environment for language-guided agents that is highly realistic and reproducible. We focus on agents that perform tasks on the web, and create an environment with fully functional websites from four common domains. We release a set of benchmark tasks focusing on evaluating the functional correctness of task completions.
arXiv Detail & Related papers (2023-07-25T22:59:32Z)
Choreographer: Learning and Adapting Skills in Imagination [60.09911483010824]
We present Choreographer, a model-based agent that exploits its world model to learn and adapt skills in imagination. Our method decouples the exploration and skill learning processes, being able to discover skills in the latent state space of the model. Choreographer is able to learn skills both from offline data, and by collecting data simultaneously with an exploration policy.
arXiv Detail & Related papers (2022-11-23T23:31:14Z)