Related papers: Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration

URL: http://arxiv.org/abs/2502.11882v5
Date: Wed, 28 May 2025 12:14:14 GMT
Title: Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration
Authors: Shao Zhang, Xihuai Wang, Wenhao Zhang, Chaoran Li, Junru Song, Tingyu Li, Lin Qiu, Xuezhi Cao, Xunliang Cai, Wen Yao, Weinan Zhang, Xinbing Wang, Ying Wen
Abstract summary: Large language models (LLMs) have excelled in turn-by-turn human-AI collaboration but struggle with simultaneous tasks requiring real-time interaction. We propose DPT-Agent, a novel language agent framework that integrates System 1 and System 2 for efficient real-time simultaneous human-AI collaboration.
Score: 46.7389974883868
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Agents built on large language models (LLMs) have excelled in turn-by-turn human-AI collaboration but struggle with simultaneous tasks requiring real-time interaction. Latency issues and the challenge of inferring variable human strategies hinder their ability to make autonomous decisions without explicit instructions. Through experiments with current independent System 1 and System 2 methods, we validate the necessity of using Dual Process Theory (DPT) in real-time tasks. We propose DPT-Agent, a novel language agent framework that integrates System 1 and System 2 for efficient real-time simultaneous human-AI collaboration. DPT-Agent's System 1 uses a Finite-state Machine (FSM) and code-as-policy for fast, intuitive, and controllable decision-making. DPT-Agent's System 2 integrates Theory of Mind (ToM) and asynchronous reflection to infer human intentions and perform reasoning-based autonomous decisions. We demonstrate the effectiveness of DPT-Agent through further experiments with rule-based agents and human collaborators, showing significant improvements over mainstream LLM-based frameworks. DPT-Agent can effectively help LLMs convert correct slow thinking and reasoning into executable actions, thereby improving performance. To the best of our knowledge, DPT-Agent is the first language agent framework that achieves successful real-time simultaneous human-AI collaboration autonomously. Code of DPT-Agent can be found in https://github.com/sjtu-marl/DPT-Agent.
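The split described in the abstract, a fast System 1 loop that keeps acting while a slow System 2 reflects asynchronously, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the DPT-Agent implementation: the toy cooking states, the names `default_policy`, `slow_reflection`, and `run`, and the rule standing in for an LLM call are all hypothetical.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, List, Optional

# Hypothetical FSM for a toy cooking task (states are not taken from the paper).
TRANSITIONS: Dict[str, str] = {
    "idle": "fetch", "fetch": "cook", "cook": "serve", "serve": "idle",
}

def default_policy(state: str) -> str:
    """Fast path (System 1 stand-in): map the current FSM state to an action."""
    return f"do_{state}"

def slow_reflection(history: List[str]) -> Callable[[str], str]:
    """Slow path (System 2 stand-in): a fixed rule replaces the LLM call here.

    Assumption: if the partner has cooked at least twice, the agent should
    serve instead of cooking. Returns a revised policy function.
    """
    partner_cooks = history.count("partner_cook") >= 2

    def revised(state: str) -> str:
        if partner_cooks and state == "cook":
            return "do_serve"
        return default_policy(state)

    return revised

def run(steps: int, partner_actions: List[str],
        executor: ThreadPoolExecutor) -> List[str]:
    """Act every step with the current policy; never block on reflection."""
    state = "idle"
    policy: Callable[[str], str] = default_policy
    pending: Optional[Future] = None
    actions: List[str] = []
    for t in range(steps):
        # Swap in the revised policy only once reflection has finished.
        if pending is not None and pending.done():
            policy = pending.result()
            pending = None
        # Start a new reflection in the background if none is in flight.
        if pending is None:
            pending = executor.submit(slow_reflection, partner_actions[: t + 1])
        actions.append(policy(state))
        state = TRANSITIONS[state]
    return actions
```

The point of the design is that the acting loop only polls the background result and keeps using the previous policy until a new one is ready, so slow reasoning never adds latency to an individual action. Which step picks up the revised policy depends on thread timing.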

Related papers

AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent [57.10083973844841]
AgentArk is a novel framework to distill multi-agent dynamics into the weights of a single model. We investigate three hierarchical distillation strategies across various models, tasks, scaling, and scenarios. By shifting the burden of computation from inference to training, the distilled models preserve the efficiency of one agent while exhibiting strong reasoning and self-correction performance of multiple agents.
arXiv Detail & Related papers (2026-02-03T19:18:28Z)
DLLM-Searcher: Adapting Diffusion Large Language Model for Search Agents [31.08047797205678]
Diffusion Large Language Models (dLLMs) have demonstrated unique efficiency advantages, enabled by their inherently parallel decoding mechanism and flexible generation paradigm. Despite the rapid advancement of Search Agents, their practical deployment is constrained by a fundamental limitation: the serial execution of multi-round reasoning, tool calling, and tool response waiting under the ReAct agent paradigm. In this paper, we propose an optimization framework for dLLM-based Search Agents.
arXiv Detail & Related papers (2026-02-03T09:12:08Z)
The Bitter Lesson of Diffusion Language Models for Agentic Workflows: A Comprehensive Reality Check [54.08619694620588]
We present a comprehensive evaluation of dLLMs across two distinct agentic paradigms: Embodied Agents and Tool-Calling Agents. Our results on Agentboard and BFCL reveal a "bitter lesson": current dLLMs fail to serve as reliable agentic backbones.
arXiv Detail & Related papers (2026-01-19T11:45:39Z)
AgentInit: Initializing LLM-based Multi-Agent Systems via Diversity and Expertise Orchestration for Effective and Efficient Collaboration [35.78052021610084]
We propose AgentInit, which aims to optimize the structure of agent teams. In addition to multi-round interactions and reflections between agents during agent generation, AgentInit incorporates a Natural Language to Format mechanism.
arXiv Detail & Related papers (2025-09-23T16:58:54Z)
DoubleAgents: Exploring Mechanisms of Building Trust with Proactive AI [29.777890680647186]
DoubleAgents is an agentic planning tool that embeds transparency and control through user intervention. A built-in respondent simulation generates realistic scenarios, allowing users to rehearse, refine policies, and calibrate their reliance.
arXiv Detail & Related papers (2025-09-16T03:43:13Z)
A Study on the MCP x A2A Framework for Enhancing Interoperability of LLM-based Autonomous Agents [0.0]
In modern AI systems, collaboration between autonomous agents and integration with external tools have become essential elements for building practical AI applications. This paper provides an in-depth technical analysis and implementation methodology of the open-source Agent-to-Agent (A2A) protocol developed by Google and the Model Context Protocol (MCP) introduced by Anthropic.
arXiv Detail & Related papers (2025-06-02T15:46:16Z)
R&D-Agent: An LLM-Agent Framework Towards Autonomous Data Science [70.1638335489284]
High-level machine learning engineering (MLE) tasks remain labor-intensive and iterative. We introduce R&D-Agent, a comprehensive, decoupled framework that formalizes the machine learning process. R&D-Agent decomposes MLE into two phases and six components, turning agent design for MLE into a principled, testable process.
arXiv Detail & Related papers (2025-05-20T06:07:00Z)
IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems [2.2810745411557316]
We introduce IntellAgent, a scalable, open-source framework to evaluate conversational AI systems. IntellAgent automates the creation of synthetic benchmarks by combining policy-driven graph modeling, realistic event generation, and interactive user-agent simulations. Our findings demonstrate that IntellAgent serves as an effective framework for advancing conversational AI by addressing challenges in bridging research and deployment.
arXiv Detail & Related papers (2025-01-19T14:58:35Z)
Two Heads Are Better Than One: Collaborative LLM Embodied Agents for Human-Robot Interaction [1.6574413179773757]
Large language models (LLMs) should be able to leverage their large breadth of understanding to interpret natural language commands. However, these models suffer from hallucinations, which may cause safety issues or deviations from the task. In this research, multiple collaborative AI systems were tested against a single independent AI agent to determine whether the success in other domains would translate into improved human-robot interaction performance.
arXiv Detail & Related papers (2024-11-23T02:47:12Z)
Agent-Oriented Planning in Multi-Agent Systems [54.429028104022066]
We propose AOP, a novel framework for agent-oriented planning in multi-agent systems. In this study, we identify three critical design principles of agent-oriented planning, including solvability, completeness, and non-redundancy. Extensive experiments demonstrate the advancement of AOP in solving real-world problems compared to both single-agent systems and existing planning strategies for multi-agent systems.
arXiv Detail & Related papers (2024-10-03T04:07:51Z)
Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface [38.76937539085164]
This paper presents a human-centered efficient agent planning method -- Interactive Speculative Planning. We aim at enhancing the efficiency of agent planning through both system design and human-AI interaction.
arXiv Detail & Related papers (2024-09-30T16:52:51Z)
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering [79.07755560048388]
SWE-agent is a system that enables LM agents to autonomously use computers to solve software engineering tasks. SWE-agent's custom agent-computer interface (ACI) significantly enhances an agent's ability to create and edit code files, navigate entire repositories, and execute tests and other programs. We evaluate SWE-agent on SWE-bench and HumanEvalFix, achieving state-of-the-art performance on both with pass@1 rates of 12.5% and 87.7%, respectively.
arXiv Detail & Related papers (2024-05-06T17:41:33Z)
Learning to Use Tools via Cooperative and Interactive Agents [58.77710337157665]
Tool learning empowers large language models (LLMs) as agents to use external tools and extend their utility. We propose ConAgents, a Cooperative and interactive Agents framework, which coordinates three specialized agents for tool selection, tool execution, and action calibration separately. Our experiments on three datasets show that the LLMs, when equipped with ConAgents, outperform baselines with substantial improvement.
arXiv Detail & Related papers (2024-03-05T15:08:16Z)
EHRAgent: Code Empowers Large Language Models for Few-shot Complex Tabular Reasoning on Electronic Health Records [47.5632532642591]
Large language models (LLMs) have demonstrated exceptional capabilities in planning and tool utilization. We propose EHRAgent, an LLM agent empowered with a code interface, to autonomously generate and execute code for multi-tabular reasoning.
arXiv Detail & Related papers (2024-01-13T18:09:05Z)
SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems [53.94772445896213]
Large Language Model (LLM)-based multi-agent systems have demonstrated promising performance in simulating human society. We propose SpeechAgents, a multi-modal LLM based multi-agent system designed for simulating human communication.
arXiv Detail & Related papers (2024-01-08T15:01:08Z)
XUAT-Copilot: Multi-Agent Collaborative System for Automated User Acceptance Testing with Large Language Model [9.05375318147931]
We propose an LLM-powered multi-agent collaborative system, named XUAT-Copilot, for automated UAT. The proposed system mainly consists of three LLM-based agents responsible for action planning, state checking and parameter selecting, respectively, and two additional modules for state sensing and case rewriting. In our experimental studies, the system achieves effectiveness close to that of human testers and significantly improves Pass@1 accuracy over a single-agent architecture.
arXiv Detail & Related papers (2024-01-05T08:24:30Z)
LLM-Powered Hierarchical Language Agent for Real-time Human-AI Coordination [28.22553394518179]
We propose a Hierarchical Language Agent (HLA) for human-AI coordination. HLA provides strong reasoning abilities while maintaining real-time execution. Human studies show that HLA outperforms other baseline agents, including slow-mind-only agents and fast-mind-only agents.
arXiv Detail & Related papers (2023-12-23T11:09:48Z)
ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents. ProAgent can analyze the present state, and infer the intentions of teammates from observations. ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794]
MADiff is a diffusion-based multi-agent learning framework. It works as both a decentralized policy and a centralized controller. Our experiments demonstrate that MADiff outperforms baseline algorithms across various multi-agent learning tasks.
arXiv Detail & Related papers (2023-05-27T02:14:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.