CGMI: Configurable General Multi-Agent Interaction Framework
- URL: http://arxiv.org/abs/2308.12503v2
- Date: Mon, 28 Aug 2023 05:44:39 GMT
- Title: CGMI: Configurable General Multi-Agent Interaction Framework
- Authors: Shi Jinxin, Zhao Jiabao, Wang Yilei, Wu Xingjiao, Li Jiawen, He Liang
- Abstract summary: We present the Configurable General Multi-Agent Interaction (CGMI) framework, designed to replicate human interactions in real-world scenarios.
We propose a tree-structured methodology for the assignment, detection, and maintenance of agent personality.
We have also integrated general agents to augment the virtual environment's realism.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Benefiting from the powerful capabilities of large language models (LLMs),
agents based on LLMs have shown the potential to address domain-specific tasks
and emulate human behaviors. However, the content generated by these agents
remains somewhat superficial, owing to their limited domain expertise and the
absence of an effective cognitive architecture. To address this, we present the
Configurable General Multi-Agent Interaction (CGMI) framework, designed to
replicate human interactions in real-world scenarios. Specifically, we propose
a tree-structured methodology for the assignment, detection, and maintenance of
agent personality. Additionally, we designed a cognitive architecture equipped
with a skill library based on the ACT* model, which contains memory,
reflection, and planning modules. We have also integrated general agents to
augment the virtual environment's realism. Using the CGMI framework, we
simulated numerous classroom interactions between teacher and students. The
experiments indicate that aspects such as the teaching methodology, curriculum,
and student performance closely mirror real classroom settings. We will open
source our work.
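The abstract describes the architecture only at a high level. As a minimal sketch of how the pieces could fit together, the hypothetical Python below pairs a tree-structured personality with memory, reflection, and planning modules; every class and method name is our own assumption, not CGMI's published API.

```python
# Hypothetical sketch of the design described in the abstract: a tree-structured
# personality plus memory, reflection, and planning modules. All names here are
# assumptions for illustration, not CGMI's actual API.
from dataclasses import dataclass, field

@dataclass
class PersonalityNode:
    """One node of the personality tree: a trait with optional sub-traits."""
    trait: str
    children: list["PersonalityNode"] = field(default_factory=list)

class CognitiveAgent:
    """Agent with a skill library and memory/reflection/planning modules."""
    def __init__(self, name: str, personality: PersonalityNode):
        self.name = name
        self.personality = personality
        self.memory: list[str] = []        # declarative memory of observed events
        self.skills: dict[str, str] = {}   # skill library: name -> procedure text

    def remember(self, event: str) -> None:
        self.memory.append(event)

    def reflect(self) -> str:
        # A real agent would summarize memory with an LLM; this is a stub.
        return f"{self.name} reflects on {len(self.memory)} remembered events."

    def plan(self, goal: str) -> list[str]:
        # Coarse plan conditioned on the root personality trait and memory.
        return [f"recall context relevant to '{goal}'",
                f"act consistently with trait '{self.personality.trait}'",
                "evaluate the outcome and update memory"]

# Usage: a teacher agent with a two-level personality tree.
traits = PersonalityNode("patient", [PersonalityNode("encouraging")])
teacher = CognitiveAgent("Teacher", traits)
teacher.remember("A student asked about fractions.")
print(teacher.reflect())
print(teacher.plan("explain fractions"))
```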
Related papers
- APT: Architectural Planning and Text-to-Blueprint Construction Using Large Language Models for Open-World Agents [8.479128275067742]
We present an advanced Large Language Model (LLM)-driven framework that enables autonomous agents to construct complex structures in Minecraft.
By employing chain-of-thought decomposition along with multimodal inputs, the framework generates detailed architectural layouts and blueprints.
Our agent incorporates both memory and reflection modules to facilitate lifelong learning, adaptive refinement, and error correction throughout the building process.
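As a rough illustration of the decomposition-plus-reflection loop this summary describes (not APT's actual code; the functions below are invented stubs), consider:

```python
# Illustrative sketch: decompose a build goal into ordered blueprint steps,
# with a reflection pass that re-queues steps whose results are missing.
def decompose(goal: str) -> list[str]:
    # A real system would prompt an LLM; here we hard-code one decomposition.
    return [f"lay foundation for {goal}",
            f"raise walls for {goal}",
            f"add roof to {goal}"]

def reflect(steps: list[str], built: set[str]) -> list[str]:
    # Error correction: keep only the steps that have not yet succeeded.
    return [s for s in steps if s not in built]

steps = decompose("watchtower")
built = {steps[0]}                 # suppose only the foundation succeeded
print(reflect(steps, built))       # the remaining steps to retry
```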
arXiv Detail & Related papers (2024-11-26T09:31:28Z)
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
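A minimal sketch of the frozen-extractor pattern this summary names, with a mean-pooling stub standing in for the pre-trained VLM (Flex itself uses real VLM features; everything below is an assumption):

```python
# Sketch of the frozen-extractor pattern: patch-wise features from a frozen
# backbone feed a small trainable policy head. The "backbone" here is a
# mean-pooling stand-in, not a real VLM.
import numpy as np

rng = np.random.default_rng(0)

def frozen_patch_features(image: np.ndarray, patch: int = 8) -> np.ndarray:
    """Stand-in for a frozen VLM: mean-pool each patch into one feature."""
    h, w = image.shape
    feats = [image[i:i + patch, j:j + patch].mean()
             for i in range(0, h, patch) for j in range(0, w, patch)]
    return np.array(feats)

class PolicyHead:
    """Tiny linear policy trained on top of the frozen features."""
    def __init__(self, n_features: int, n_actions: int = 4):
        self.W = rng.normal(0, 0.01, size=(n_actions, n_features))

    def act(self, feats: np.ndarray) -> int:
        return int(np.argmax(self.W @ feats))

image = rng.random((64, 64))           # fake camera frame
feats = frozen_patch_features(image)   # backbone weights never update
policy = PolicyHead(n_features=feats.size)
print("action:", policy.act(feats))
```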
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks [85.95607119635102]
WorkArena++ is a benchmark designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of LLM-based web agents.
arXiv Detail & Related papers (2024-07-07T07:15:49Z)
- Simulating Classroom Education with LLM-Empowered Agents [52.62324491261461]
SimClass is a multi-agent classroom simulation framework involving user participation.
We recognize representative class roles and introduce a novel class control mechanism for automatic classroom teaching.
We demonstrate that LLMs can simulate traditional classroom interaction patterns effectively while enhancing the user experience.
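The class control mechanism is not specified in this summary; a toy version might schedule turns across roles like this (all names are illustrative, not SimClass's API):

```python
# Illustrative class-control loop: a controller decides which classroom
# role acts at each turn.
roles = {
    "teacher": lambda topic: f"Teacher explains {topic}.",
    "assistant": lambda topic: f"Assistant posts notes on {topic}.",
    "student": lambda topic: f"Student asks a question about {topic}.",
}

def next_role(turn: int) -> str:
    # Fixed schedule; a real controller would condition on dialogue state.
    order = ["teacher", "student", "teacher", "assistant"]
    return order[turn % len(order)]

for turn in range(4):
    print(roles[next_role(turn)]("fractions"))
```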
arXiv Detail & Related papers (2024-06-27T14:51:07Z)
- MEIA: Multimodal Embodied Perception and Interaction in Unknown Environments [82.67236400004826]
We introduce the Multimodal Embodied Interactive Agent (MEIA), capable of translating high-level tasks expressed in natural language into a sequence of executable actions.
Its MEM module enables MEIA to generate executable action plans based on diverse requirements and the robot's capabilities.
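As a hedged sketch of what "plans constrained by the robot's capabilities" could mean in code (the task table and skill set below are invented for illustration, not MEIA's components):

```python
# Hypothetical planner sketch: map a high-level task to executable actions,
# filtered by what the robot can actually do.
CAPABILITIES = {"move_to", "grasp", "place"}   # assumed robot skill set

def plan(task: str) -> list[str]:
    # A real MEM module would use multimodal context; this is a lookup stub.
    candidates = {"serve water": ["move_to cup", "grasp cup",
                                  "move_to table", "place cup", "pour water"]}
    steps = candidates.get(task, [])
    # Keep only steps whose action verb is within the robot's capabilities.
    return [s for s in steps if s.split()[0] in CAPABILITIES]

print(plan("serve water"))   # "pour water" is dropped: not a known skill
```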
arXiv Detail & Related papers (2024-02-01T02:43:20Z)
- Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents [0.0]
We present a novel framework for enhancing the capabilities of large language models (LLMs) by leveraging the power of multi-agent systems.
Our framework introduces a collaborative environment where multiple intelligent agent components, each with distinctive attributes and roles, work together to handle complex tasks more efficiently and effectively.
arXiv Detail & Related papers (2023-06-05T23:55:37Z)
- Knowledge-enhanced Agents for Interactive Text Games [16.055119735473017]
We propose a knowledge-injection framework for improved functional grounding of agents in text-based games.
We consider two forms of domain knowledge that we inject into learning-based agents: memory of previous correct actions and affordances of relevant objects in the environment.
Our framework supports two representative model classes: reinforcement learning agents and language model agents.
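A minimal sketch of the two injections this summary names, with hand-written affordances and memory standing in for the paper's learned components:

```python
# Sketch: augment a text-game observation with memory of past correct
# actions and object affordances. All values are illustrative.
correct_memory: list[str] = []                 # past actions that worked
affordances = {"key": ["take", "unlock"],      # object -> usable verbs
               "door": ["open", "unlock"]}

def augment_observation(obs: str, visible: list[str]) -> str:
    """Prepend injected knowledge to the raw game observation."""
    hints = [f"{obj}: {'/'.join(affordances.get(obj, []))}" for obj in visible]
    memory = "; ".join(correct_memory[-3:]) or "none"
    return f"[affordances: {', '.join(hints)}] [recent wins: {memory}] {obs}"

correct_memory.append("unlock door with key")
print(augment_observation("You are in a hallway.", ["key", "door"]))
```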
arXiv Detail & Related papers (2023-05-08T23:31:39Z)
- ArK: Augmented Reality with Knowledge Interactive Emergent Ability [115.72679420999535]
We develop an infinite agent that learns to transfer knowledge memory from general foundation models to novel domains.
The heart of our approach is an emerging mechanism, dubbed Augmented Reality with Knowledge Inference Interaction (ArK).
We show that our ArK approach, combined with large foundation models, significantly improves the quality of generated 2D/3D scenes.
arXiv Detail & Related papers (2023-05-01T17:57:01Z)
- Sim-To-Real Transfer of Visual Grounding for Human-Aided Ambiguity Resolution [0.0]
We consider the task of visual grounding, where the agent segments an object from a crowded scene given a natural language description.
Modern holistic approaches to visual grounding usually ignore language structure and struggle to cover generic domains.
We introduce a fully decoupled modular framework for compositional visual grounding of entities, attributes, and spatial relations.
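As an illustration of the decoupled design (stub scoring functions replace the paper's learned modules), entity, attribute, and relation scores can be composed per object:

```python
# Sketch of a decoupled grounding pipeline: separate modules score entities,
# attributes, and spatial relations; the scores compose multiplicatively.
objects = [
    {"name": "mug", "color": "red", "x": 0.2},
    {"name": "mug", "color": "blue", "x": 0.8},
]

def entity_score(obj, noun):    return 1.0 if obj["name"] == noun else 0.0
def attribute_score(obj, attr): return 1.0 if obj["color"] == attr else 0.1
def relation_score(obj, side):  return obj["x"] if side == "right" else 1 - obj["x"]

def ground(noun: str, attr: str, side: str):
    scores = [entity_score(o, noun) * attribute_score(o, attr)
              * relation_score(o, side) for o in objects]
    return objects[scores.index(max(scores))]

print(ground("mug", "blue", "right"))   # picks the blue mug on the right
```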
arXiv Detail & Related papers (2022-05-24T14:12:32Z)
- Learning Multi-Objective Curricula for Deep Reinforcement Learning [55.27879754113767]
Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).
In this paper, we propose a unified automatic curriculum learning framework to create multi-objective but coherent curricula.
In addition to existing hand-designed curricula paradigms, we further design a flexible memory mechanism to learn an abstract curriculum.
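A toy scheduler hints at how a memory of outcomes can drive curriculum choice; the paper's abstract curriculum is learned, whereas this sketch merely stores history:

```python
# Sketch: pick the next training task from a small memory of past outcomes.
from collections import deque

class Curriculum:
    def __init__(self, tasks: list[str]):
        self.tasks = tasks
        self.memory = deque(maxlen=50)        # (task, success) history

    def success_rate(self, task: str) -> float:
        results = [ok for t, ok in self.memory if t == task]
        return sum(results) / len(results) if results else 0.0

    def next_task(self) -> str:
        # Prefer tasks the agent has not yet mastered (lowest success rate).
        return min(self.tasks, key=self.success_rate)

    def record(self, task: str, success: bool) -> None:
        self.memory.append((task, success))

cur = Curriculum(["reach", "push", "stack"])
cur.record("reach", True)
print(cur.next_task())   # "push": unseen tasks tie at 0.0; min keeps the first
```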
arXiv Detail & Related papers (2021-10-06T19:30:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.