CoNav: A Benchmark for Human-Centered Collaborative Navigation
- URL: http://arxiv.org/abs/2406.02425v1
- Date: Tue, 4 Jun 2024 15:44:25 GMT
- Title: CoNav: A Benchmark for Human-Centered Collaborative Navigation
- Authors: Changhao Li, Xinyu Sun, Peihao Chen, Jugang Fan, Zixu Wang, Yanxia Liu, Jinhui Zhu, Chuang Gan, Mingkui Tan
- Abstract summary: We propose a collaborative navigation (CoNav) benchmark.
Our CoNav tackles the critical challenge of constructing a 3D navigation environment with realistic and diverse human activities.
We propose an intention-aware agent that reasons about both long-term and short-term human intentions.
- Score: 66.6268966718022
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-robot collaboration, in which the robot intelligently assists the human with the upcoming task, is an appealing objective. To achieve this goal, the agent needs a fundamental collaborative navigation ability: it should infer human intention by observing human activities and then navigate to the human's intended destination ahead of the human. However, this vital ability has not been well studied in previous literature. To fill this gap, we propose a collaborative navigation (CoNav) benchmark. CoNav tackles the critical challenge of constructing a 3D navigation environment with realistic and diverse human activities. To achieve this, we design a novel LLM-based humanoid animation generation framework that is conditioned on both text descriptions and environmental context. The generated humanoid trajectories obey the environmental context and can be easily integrated into popular simulators. We empirically find that existing navigation methods struggle on the CoNav task because they neglect the perception of human intention. To solve this problem, we propose an intention-aware agent that reasons about both long-term and short-term human intentions. The agent predicts navigation actions based on the predicted intention and panoramic observations. The emergent agent behaviors, including observing humans, avoiding collisions with humans, and navigating ahead of them, demonstrate the effectiveness of the proposed dataset and agent.
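As a rough illustration of the agent described in the abstract, the sketch below shows one plausible way to condition a navigation policy on predicted long-term and short-term human intentions alongside a panoramic observation feature. All module names, feature sizes, and the four-way action space are assumptions made for illustration; this is not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class IntentionAwareAgent(nn.Module):
    """Minimal sketch of an intention-aware navigation policy.

    Hypothetical design: two heads predict long-term (e.g. intended
    destination) and short-term (e.g. next movement) human intention
    from observed activity features; the policy conditions on the
    panoramic observation plus both intention embeddings.
    """

    def __init__(self, obs_dim=512, intent_dim=128, hidden_dim=256, num_actions=4):
        super().__init__()
        self.long_term_head = nn.Linear(obs_dim, intent_dim)   # long-term intention
        self.short_term_head = nn.Linear(obs_dim, intent_dim)  # short-term intention
        self.policy = nn.Sequential(
            nn.Linear(obs_dim + 2 * intent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_actions),
        )

    def forward(self, panoramic_feat):
        long_intent = self.long_term_head(panoramic_feat)
        short_intent = self.short_term_head(panoramic_feat)
        fused = torch.cat([panoramic_feat, long_intent, short_intent], dim=-1)
        # Logits over a hypothetical action space, e.g. {forward, left, right, stop}.
        return self.policy(fused)

# Usage: one panoramic feature vector -> greedy navigation action.
agent = IntentionAwareAgent()
logits = agent(torch.randn(1, 512))
action = logits.argmax(dim=-1)
```

The point of the sketch is only the fusion pattern: intention prediction is made an explicit intermediate output that the policy consumes, rather than leaving the policy to infer intention implicitly from raw observations.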
Related papers
- Dreaming to Assist: Learning to Align with Human Objectives for Shared Control in High-Speed Racing [10.947581892636629]
Tight coordination is required for effective human-robot teams in domains involving fast dynamics and tactical decisions.
We present Dream2Assist, a framework built around a rich world model that can infer human objectives and value functions.
We show that the combined human-robot team, when blending its actions with those of the human, outperforms the synthetic humans alone.
arXiv Detail & Related papers (2024-10-14T01:00:46Z)
- Aligning Robot Navigation Behaviors with Human Intentions and Preferences [2.9914612342004503]
This dissertation aims to answer the question: "How can we use machine learning methods to align the navigational behaviors of autonomous mobile robots with human intentions and preferences?"
First, this dissertation introduces a new approach to learning navigation behaviors by imitating human-provided demonstrations of the intended navigation task.
Second, this dissertation introduces two algorithms to enhance terrain-aware off-road navigation for mobile robots by learning visual terrain awareness in a self-supervised manner.
arXiv Detail & Related papers (2024-09-16T03:45:00Z)
- Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots [119.55240471433302]
Habitat 3.0 is a simulation platform for studying collaborative human-robot tasks in home environments.
It addresses challenges in modeling complex deformable bodies and diversity in appearance and motion.
Human-in-the-loop infrastructure enables real human interaction with simulated robots via mouse/keyboard or a VR interface.
arXiv Detail & Related papers (2023-10-19T17:29:17Z)
- Robots That Can See: Leveraging Human Pose for Trajectory Prediction [30.919756497223343]
We present a Transformer-based architecture to predict future human trajectories in human-centric environments.
The resulting model captures the inherent uncertainty in future human trajectory prediction.
We identify new agents with limited historical data as a major contributor to error and demonstrate the complementary nature of 3D skeletal poses in reducing prediction error.
arXiv Detail & Related papers (2023-09-29T13:02:56Z)
- Gesture2Path: Imitation Learning for Gesture-aware Navigation [54.570943577423094]
We present Gesture2Path, a novel social navigation approach that combines image-based imitation learning with model-predictive control.
We deploy our method on real robots and showcase its effectiveness in four gesture-navigation scenarios.
arXiv Detail & Related papers (2022-09-19T23:05:36Z)
- GIMO: Gaze-Informed Human Motion Prediction in Context [75.52839760700833]
We propose a large-scale human motion dataset that delivers high-quality body pose sequences, scene scans, and ego-centric views with eye gaze.
Our data collection is not tied to specific scenes, which further enriches the motion dynamics observed from our subjects.
To realize the full potential of gaze, we propose a novel network architecture that enables bidirectional communication between the gaze and motion branches.
arXiv Detail & Related papers (2022-04-20T13:17:39Z)
- Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation [9.456752543341464]
A key challenge on the path to developing agents that learn complex human-like behavior is the need to quickly and accurately quantify human-likeness.
We address this challenge through a novel automated Navigation Turing Test (ANTT) that learns to predict human judgments of human-likeness.
arXiv Detail & Related papers (2021-05-20T10:14:23Z)
- Active Visual Information Gathering for Vision-Language Navigation [115.40768457718325]
Vision-language navigation (VLN) is the task of having an agent carry out natural-language navigation instructions inside photo-realistic environments.
One of the key challenges in VLN is how to conduct robust navigation by mitigating the uncertainty caused by ambiguous instructions and insufficient observation of the environment.
This work draws inspiration from human navigation behavior and endows an agent with an active information gathering ability for a more intelligent VLN policy.
arXiv Detail & Related papers (2020-07-15T23:54:20Z)
- Visual Navigation Among Humans with Optimal Control as a Supervisor [72.5188978268463]
We propose an approach that combines learning-based perception with model-based optimal control to navigate among humans.
Our approach is enabled by our novel data-generation tool, HumANav.
We demonstrate that the learned navigation policies can anticipate and react to humans without explicitly predicting future human motion.
arXiv Detail & Related papers (2020-03-20T16:13:47Z)