Many Episode Learning in a Modular Embodied Agent via End-to-End
Interaction
- URL: http://arxiv.org/abs/2204.08687v1
- Date: Tue, 19 Apr 2022 06:11:46 GMT
- Title: Many Episode Learning in a Modular Embodied Agent via End-to-End
Interaction
- Authors: Yuxuan Sun, Ethan Carlson, Rebecca Qian, Kavya Srinet, Arthur Szlam
- Abstract summary: We give a case study of an embodied machine-learning (ML) powered agent that improves itself via interactions with crowd-workers.
The agent consists of a set of modules, some of which are learned, and others heuristic.
We describe how the design of the agent works together with the design of multiple annotation interfaces.
- Score: 22.14911101362573
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work we give a case study of an embodied machine-learning (ML)
powered agent that improves itself via interactions with crowd-workers. The
agent consists of a set of modules, some of which are learned, and others
heuristic. While the agent is not "end-to-end" in the ML sense, end-to-end
interaction is a vital part of the agent's learning mechanism. We describe how
the design of the agent works together with the design of multiple annotation
interfaces to allow crowd-workers to assign credit to module errors from
end-to-end interactions, and to label data for individual modules. Over
multiple rounds of automated human-agent interaction, credit assignment, data
annotation, and model re-training and re-deployment, we demonstrate agent
improvement.
Related papers
- Agentic Web: Weaving the Next Web with AI Agents [109.13815627467514]
The emergence of AI agents powered by large language models (LLMs) marks a pivotal shift toward the Agentic Web.
In this paradigm, agents interact directly with one another to plan, coordinate, and execute complex tasks on behalf of users.
We present a structured framework for understanding and building the Agentic Web.
arXiv Detail & Related papers (2025-07-28T17:58:12Z)
- PersonaAgent: When Large Language Model Agents Meet Personalization at Test Time [87.99027488664282]
PersonaAgent is a framework designed to address versatile personalization tasks.
It integrates a personalized memory module and a personalized action module.
Test-time user-preference alignment strategy ensures real-time user preference alignment.
arXiv Detail & Related papers (2025-06-06T17:29:49Z)
- PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC [98.82146219495792]
In this paper, we propose a hierarchical agent framework named PC-Agent.
From the perception perspective, we devise an Active Perception Module (APM) to overcome the inadequate abilities of current MLLMs in perceiving screenshot content.
From the decision-making perspective, to handle complex user instructions and interdependent subtasks more effectively, we propose a hierarchical multi-agent collaboration architecture.
arXiv Detail & Related papers (2025-02-20T05:41:55Z)
- AutoGen Studio: A No-Code Developer Tool for Building and Debugging Multi-Agent Systems [31.113305753414913]
AUTOGEN STUDIO is a no-code developer tool for rapidly prototyping multi-agent systems.
It provides an intuitive drag-and-drop UI for agent specification, interactive evaluation, and a gallery of reusable agent components.
arXiv Detail & Related papers (2024-08-09T03:27:37Z)
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering [79.07755560048388]
SWE-agent is a system that enables LM agents to autonomously use computers to solve software engineering tasks.
SWE-agent's custom agent-computer interface (ACI) significantly enhances an agent's ability to create and edit code files, navigate entire repositories, and execute tests and other programs.
We evaluate SWE-agent on SWE-bench and HumanEvalFix, achieving state-of-the-art performance on both with a pass@1 rate of 12.5% and 87.7%, respectively.
arXiv Detail & Related papers (2024-05-06T17:41:33Z)
- Verco: Learning Coordinated Verbal Communication for Multi-agent Reinforcement Learning [42.27106057372819]
We propose a novel multi-agent reinforcement learning algorithm that embeds large language models into agents.
The framework has a message module and an action module.
Experiments conducted on the Overcooked game demonstrate our method significantly enhances the learning efficiency and performance of existing methods.
arXiv Detail & Related papers (2024-04-27T05:10:33Z)
- AgentScope: A Flexible yet Robust Multi-Agent Platform [66.64116117163755]
AgentScope is a developer-centric multi-agent platform with message exchange as its core communication mechanism.
The abundant syntactic tools, built-in agents and service functions, user-friendly interfaces for application demonstration and utility monitor, zero-code programming workstation, and automatic prompt tuning mechanism significantly lower the barriers to both development and deployment.
arXiv Detail & Related papers (2024-02-21T04:11:28Z)
- AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems [112.76941157194544]
We propose AgentCF for simulating user-item interactions in recommender systems through agent-based collaborative filtering.
We creatively consider not only users but also items as agents, and develop a collaborative learning approach that optimizes both kinds of agents together.
Overall, the optimized agents exhibit diverse interaction behaviors within our framework, including user-item, user-user, item-item, and collective interactions.
arXiv Detail & Related papers (2023-10-13T16:37:14Z)
- AutoAgents: A Framework for Automatic Agent Generation [27.74332323317923]
AutoAgents is an innovative framework that adaptively generates and coordinates multiple specialized agents to build an AI team according to different tasks.
Our experiments on various benchmarks demonstrate that AutoAgents generates more coherent and accurate solutions than the existing multi-agent methods.
arXiv Detail & Related papers (2023-09-29T14:46:30Z)
- MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794]
Diffusion models (DMs) have recently achieved huge success in various scenarios, including offline reinforcement learning.
We propose MADiff, a novel generative multi-agent learning framework to tackle this problem.
Our experiments show the superior performance of MADiff compared to baseline algorithms in a wide range of multi-agent learning tasks.
arXiv Detail & Related papers (2023-05-27T02:14:09Z)
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
arXiv Detail & Related papers (2021-07-10T03:49:41Z)
- BGC: Multi-Agent Group Belief with Graph Clustering [1.9949730506194252]
We propose a semi-communication method to enable agents to exchange information without communication.
Inspired by the neighborhood cognitive consistency, we propose a group-based module to divide adjacent agents into a small group and minimize in-group agents' beliefs.
Results reveal that the proposed method achieves a significant improvement in the SMAC benchmark.
arXiv Detail & Related papers (2020-08-20T07:07:20Z)
- Multi-Agent Interactions Modeling with Correlated Policies [53.38338964628494]
In this paper, we cast the multi-agent interactions modeling problem into a multi-agent imitation learning framework.
We develop a Decentralized Adversarial Imitation Learning algorithm with Correlated policies (CoDAIL).
Various experiments demonstrate that CoDAIL can better regenerate complex interactions close to the demonstrators.
arXiv Detail & Related papers (2020-01-04T17:31:53Z)