Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation
- URL: http://arxiv.org/abs/2508.05535v1
- Date: Thu, 07 Aug 2025 16:09:12 GMT
- Title: Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation
- Authors: Albert Yu, Chengshu Li, Luca Macesanu, Arnav Balaji, Ruchira Ray, Raymond Mooney, Roberto Martín-Martín
- Abstract summary: MICoBot handles the common scenario where both agents, using natural language, take initiative in formulating, accepting, or rejecting proposals on who can best complete different steps of a task. To handle diverse, task-directed dialog, MICoBot makes decisions at three levels: (1) a meta-planner considers human dialog to formulate and code a high-level collaboration strategy, (2) a planner optimally allocates the remaining steps to either agent based on the robot's capabilities, and (3) an action executor decides the low-level actions to perform or words to say to the human.
- Score: 8.446410154654467
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Effective robotic systems for long-horizon human-robot collaboration must adapt to a wide range of human partners, whose physical behavior, willingness to assist, and understanding of the robot's capabilities may change over time. This demands a tightly coupled communication loop that grants both agents the flexibility to propose, accept, or decline requests as they coordinate toward completing the task effectively. We apply a Mixed-Initiative dialog paradigm to Collaborative human-roBot teaming and propose MICoBot, a system that handles the common scenario where both agents, using natural language, take initiative in formulating, accepting, or rejecting proposals on who can best complete different steps of a task. To handle diverse, task-directed dialog, and find successful collaborative strategies that minimize human effort, MICoBot makes decisions at three levels: (1) a meta-planner considers human dialog to formulate and code a high-level collaboration strategy, (2) a planner optimally allocates the remaining steps to either agent based on the robot's capabilities (measured by a simulation-pretrained affordance model) and the human's estimated availability to help, and (3) an action executor decides the low-level actions to perform or words to say to the human. Our extensive evaluations in simulation and the real world -- on a physical robot with 18 unique human participants over 27 hours -- demonstrate the ability of our method to effectively collaborate with diverse human users, yielding significantly improved task success and user experience compared to a pure LLM baseline and other agent-allocation models. See additional videos and materials at https://robin-lab.cs.utexas.edu/MicoBot/.
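The three-level decision structure described in the abstract lends itself to a simple layered control loop. The sketch below is a minimal illustration of that split, not the authors' implementation: every class, function, and scoring rule (e.g., `robot_success_prob`, `human_availability`, the heuristic stub in `meta_plan`) is a hypothetical stand-in.

```python
# Hypothetical sketch of a meta-planner / planner / action-executor split.
from dataclasses import dataclass, field

@dataclass
class Step:
    name: str
    robot_success_prob: float  # stand-in for an affordance-model score
    done: bool = False

@dataclass
class DialogState:
    history: list = field(default_factory=list)
    human_availability: float = 0.5  # estimated willingness to help, in [0, 1]

def meta_plan(dialog: DialogState) -> dict:
    """Level 1: turn the dialog so far into a high-level strategy.
    A real system would query an LLM; this stub keys off one phrase."""
    human_declined = any("busy" in u.lower() for u in dialog.history)
    return {"prefer_robot": human_declined}

def allocate(strategy: dict, dialog: DialogState, steps: list) -> dict:
    """Level 2: assign each remaining step to 'robot' or 'human' by
    comparing robot capability against estimated human availability."""
    allocation = {}
    for step in steps:
        if step.done:
            continue
        robot_score = step.robot_success_prob
        if strategy["prefer_robot"]:
            robot_score += 0.1  # bias toward robot autonomy
        allocation[step.name] = ("robot" if robot_score >= dialog.human_availability
                                 else "human")
    return allocation

def act(allocation: dict, step: Step) -> str:
    """Level 3: emit a low-level action or an utterance to the human."""
    if allocation[step.name] == "robot":
        return f"[execute] {step.name}"
    return f"[say] Could you handle '{step.name}' for me?"

dialog = DialogState(history=["I'm a bit busy right now."], human_availability=0.3)
steps = [Step("open drawer", 0.9), Step("grasp mug", 0.1)]
allocation = allocate(meta_plan(dialog), dialog, steps)
for s in steps:
    print(act(allocation, s))
```

In MICoBot itself, the meta-planner is driven by an LLM and the capability scores come from a simulation-pretrained affordance model; the stubs above only mark where those components would plug in.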
Related papers
- Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration [51.452664740963066]
Collaborative Gym is a framework enabling asynchronous, tripartite interaction among agents, humans, and task environments. We instantiate Co-Gym with three representative tasks in both simulated and real-world conditions. Our findings reveal that collaborative agents consistently outperform their fully autonomous counterparts in task performance.
arXiv Detail & Related papers (2024-12-20T09:21:15Z)
- COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models [49.24666980374751]
COHERENT is a novel LLM-based task planning framework for the collaboration of heterogeneous multi-robot systems. A Proposal-Execution-Feedback-Adjustment mechanism is designed to decompose and assign actions to individual robots. The experimental results show that this work surpasses previous methods by a large margin in terms of success rate and execution efficiency.
arXiv Detail & Related papers (2024-09-23T15:53:41Z)
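COHERENT's Proposal-Execution-Feedback-Adjustment mechanism can be pictured as a small loop of proposal, execution, and reassignment. The following is a hedged sketch under assumed interfaces: the stubbed `propose`, `execute`, and `adjust` functions are illustrative, and the paper's actual planner is LLM-based.

```python
# Toy Proposal-Execution-Feedback-Adjustment (PEFA) loop; all names assumed.
import random

def propose(task: str, robots: list) -> dict:
    """A central planner would propose one subtask per robot (stubbed)."""
    return {robot: f"{task}: part {i}" for i, robot in enumerate(robots)}

def execute(robot: str, subtask: str) -> bool:
    """Each robot attempts its subtask and reports success (stubbed)."""
    return random.random() > 0.3

def adjust(assignment: dict, feedback: dict) -> dict:
    """Reassign failed subtasks to robots whose own subtasks succeeded."""
    free = [r for r, ok in feedback.items() if ok]
    for robot, ok in feedback.items():
        if not ok and free:
            assignment[free.pop()] = assignment.pop(robot)
    return assignment

assignment = propose("inspect warehouse", ["quadrotor", "arm", "quadruped"])
for round_idx in range(3):
    feedback = {r: execute(r, s) for r, s in assignment.items()}
    print(f"round {round_idx}: {feedback}")
    if all(feedback.values()):
        break
    assignment = adjust(assignment, feedback)
```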
- LIT: Large Language Model Driven Intention Tracking for Proactive Human-Robot Collaboration -- A Robot Sous-Chef Application [4.519544934630495]
Large Language Models (LLMs) and Vision Language Models (VLMs) enable robots to ground natural language prompts into control actions.
We propose Language-driven Intention Tracking (LIT) to model the human user's long-term behavior and to predict the next human intention to guide the robot for proactive collaboration.
arXiv Detail & Related papers (2024-06-19T19:18:40Z)
- Large Language Model-based Human-Agent Collaboration for Complex Task Solving [94.3914058341565]
We introduce the problem of Large Language Model (LLM)-based human-agent collaboration for complex task-solving.
We propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC.
This approach includes a policy model designed to determine the most opportune stages for human intervention within the task-solving process.
arXiv Detail & Related papers (2024-02-20T11:03:36Z)
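ReHAC's policy model can be thought of as a learned gate over human intervention. The sketch below is an illustrative assumption, not the paper's RL formulation: a linear-sigmoid policy over hand-picked features decides, per step, whether to hand control to the human.

```python
# Hypothetical intervention-gating policy; features and weights assumed.
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(size=3)  # policy weights; ReHAC would learn these with RL

def featurize(step: int, agent_confidence: float, past_failures: int) -> np.ndarray:
    return np.array([step / 10.0, agent_confidence, float(past_failures)])

def should_intervene(features: np.ndarray) -> bool:
    """Sigmoid policy: a high score requests human intervention."""
    score = 1.0 / (1.0 + np.exp(-theta @ features))
    return score > 0.5

for step in range(5):
    confidence = float(rng.uniform(0.2, 0.9))
    if should_intervene(featurize(step, confidence, past_failures=step // 2)):
        print(f"step {step}: ask the human to take over")
    else:
        print(f"step {step}: agent acts autonomously")
```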
- Coordination with Humans via Strategy Matching [5.072077366588174]
We present an algorithm for autonomously recognizing available task-completion strategies by observing human-human teams performing a collaborative task.
By transforming team actions into low-dimensional representations using hidden Markov models, we can identify strategies without prior knowledge.
Robot policies are learned on each of the identified strategies to construct a Mixture-of-Experts model that adapts to the task strategies of unseen human partners.
arXiv Detail & Related papers (2022-10-27T01:00:50Z)
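The strategy-recognition idea can be illustrated with a toy pipeline: embed observed action sequences into low-dimensional features, group them into strategies, and route a new partner to the expert policy trained for the closest strategy. For brevity this sketch substitutes transition-count features and nearest-centroid matching for the paper's hidden Markov models; all data and names are illustrative.

```python
# Toy strategy matching with a mixture-of-experts routing step.
import numpy as np

ACTIONS = ["reach", "hold", "pass"]

def transition_features(seq: list) -> np.ndarray:
    """Flatten normalized action-transition counts into a feature vector."""
    idx = {a: i for i, a in enumerate(ACTIONS)}
    counts = np.zeros((len(ACTIONS), len(ACTIONS)))
    for a, b in zip(seq, seq[1:]):
        counts[idx[a], idx[b]] += 1
    total = counts.sum()
    return (counts / total).ravel() if total else counts.ravel()

# Demonstrations from two distinct human-human team strategies.
demos = [
    ["reach", "pass", "reach", "pass"],            # strategy A: hand-off heavy
    ["reach", "pass", "reach", "pass", "reach"],
    ["hold", "hold", "reach", "hold"],             # strategy B: stabilizing
    ["hold", "reach", "hold", "hold"],
]
labels = np.array([0, 0, 1, 1])
feats = np.stack([transition_features(d) for d in demos])
centroids = np.stack([feats[labels == k].mean(axis=0) for k in (0, 1)])

experts = {0: "policy_handoff", 1: "policy_stabilize"}  # one policy per strategy

def select_expert(observed: list) -> str:
    """Route an unseen partner to the expert of the nearest strategy."""
    f = transition_features(observed)
    k = int(np.argmin(np.linalg.norm(centroids - f, axis=1)))
    return experts[k]

print(select_expert(["reach", "pass", "reach"]))  # -> policy_handoff
```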
- Intuitive and Efficient Human-robot Collaboration via Real-time Approximate Bayesian Inference [4.310882094628194]
Collaborative robots and end-to-end AI promise flexible automation of human tasks in factories and warehouses.
Humans and cobots will collaborate, helping each other.
For these collaborations to be effective and safe, robots need to model, predict, and exploit the human's intent.
arXiv Detail & Related papers (2022-05-17T23:04:44Z)
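The intent-modeling step can be sketched as Bayesian updating over a discrete set of candidate goals, where observed motion toward a goal raises its posterior probability. The goal set, likelihood model, and trajectory below are illustrative assumptions, not the paper's inference procedure.

```python
# Toy Bayesian goal inference from observed hand motion; model assumed.
import numpy as np

goals = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # candidate targets
posterior = np.ones(len(goals)) / len(goals)             # uniform prior

def update(posterior: np.ndarray, hand_pos: np.ndarray,
           hand_vel: np.ndarray, beta: float = 5.0) -> np.ndarray:
    """Motion heading toward a goal raises that goal's probability."""
    directions = goals - hand_pos
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    heading = hand_vel / (np.linalg.norm(hand_vel) + 1e-9)
    likelihood = np.exp(beta * directions @ heading)  # sharper when aligned
    post = posterior * likelihood
    return post / post.sum()

# A hand trajectory moving from the origin toward (1, 1).
for pos in np.linspace([0.0, 0.0], [0.6, 0.6], 4):
    posterior = update(posterior, pos, hand_vel=np.array([0.7, 0.7]))
print("posterior:", np.round(posterior, 3))
print("most likely goal:", goals[np.argmax(posterior)])
```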
- Co-GAIL: Learning Diverse Strategies for Human-Robot Collaboration [51.268988527778276]
We present a method for learning a human-robot collaboration policy from human-human collaboration demonstrations.
Our method co-optimizes a human policy and a robot policy in an interactive learning process.
arXiv Detail & Related papers (2021-08-13T03:14:43Z)
- Show Me What You Can Do: Capability Calibration on Reachable Workspace for Human-Robot Collaboration [83.4081612443128]
We show that a short calibration using REMP can effectively bridge the gap between what a non-expert user thinks a robot can reach and the ground truth.
We show that this calibration procedure not only results in better user perception, but also promotes more efficient human-robot collaborations.
arXiv Detail & Related papers (2021-03-06T09:14:30Z)
- Forming Human-Robot Cooperation for Tasks with General Goal using Evolutionary Value Learning [9.053709318841232]
In Human-Robot Cooperation (HRC), the robot cooperates with humans to accomplish the task together.
Existing approaches assume the human has a specific goal during the cooperation, and the robot infers and acts toward it.
We present the Evolutionary Value Learning (EVL) approach to model the dynamics of the goal specification process in HRC.
arXiv Detail & Related papers (2020-12-19T20:27:09Z)
- Joint Mind Modeling for Explanation Generation in Complex Human-Robot Collaborative Tasks [83.37025218216888]
We propose a novel explainable AI (XAI) framework for achieving human-like communication in human-robot collaborations.
The robot builds a hierarchical mind model of the human user and generates explanations of its own mind as a form of communication.
Results show that the generated explanations of our approach significantly improve the collaboration performance and user perception of the robot.
arXiv Detail & Related papers (2020-07-24T23:35:03Z)
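One way to picture the mind-modeling idea: the robot tracks what the human expects it to do next and volunteers an explanation whenever its own plan diverges from that expectation. The belief representation and trigger rule below are illustrative assumptions, not the paper's hierarchical model.

```python
# Toy explanation trigger from a (flat) model of the human's expectations.
from typing import Optional

robot_plan = ["fetch pot", "fill pot with water", "boil water", "add pasta"]

# The robot's estimate of what the human expects it to do at each step.
human_expectation = ["fetch pot", "add pasta", "boil water"]

def explain_if_diverged(step_idx: int, robot_action: str) -> Optional[str]:
    """Generate an explanation when the plan departs from expectations."""
    expected = (human_expectation[step_idx]
                if step_idx < len(human_expectation) else None)
    if expected is not None and robot_action != expected:
        return (f"You may expect me to '{expected}' now, but I'm doing "
                f"'{robot_action}' first because my plan requires it.")
    return None

for i, action in enumerate(robot_plan):
    message = explain_if_diverged(i, action)
    if message:
        print(f"[explain] {message}")
    print(f"[execute] {action}")
```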
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.