Related papers: CoNav: A Benchmark for Human-Centered Collaborative Navigation

URL: http://arxiv.org/abs/2406.02425v1
Date: Tue, 4 Jun 2024 15:44:25 GMT
Title: CoNav: A Benchmark for Human-Centered Collaborative Navigation
Authors: Changhao Li, Xinyu Sun, Peihao Chen, Jugang Fan, Zixu Wang, Yanxia Liu, Jinhui Zhu, Chuang Gan, Mingkui Tan
Abstract summary: We propose a collaborative navigation (CoNav) benchmark. Our CoNav tackles the critical challenge of constructing a 3D navigation environment with realistic and diverse human activities. We propose an intention-aware agent for reasoning about both long-term and short-term human intention.
Score: 66.6268966718022
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Human-robot collaboration, in which the robot intelligently assists the human with the upcoming task, is an appealing objective. To achieve this goal, the agent needs to be equipped with a fundamental collaborative navigation ability, where the agent should reason about human intention by observing human activities and then navigate to the human's intended destination in advance of the human. However, this vital ability has not been well studied in previous literature. To fill this gap, we propose a collaborative navigation (CoNav) benchmark. Our CoNav tackles the critical challenge of constructing a 3D navigation environment with realistic and diverse human activities. To achieve this, we design a novel LLM-based humanoid animation generation framework, which is conditioned on both text descriptions and environmental context. The generated humanoid trajectory obeys the environmental context and can be easily integrated into popular simulators. We empirically find that the existing navigation methods struggle in the CoNav task since they neglect the perception of human intention. To solve this problem, we propose an intention-aware agent for reasoning about both long-term and short-term human intention. The agent predicts navigation action based on the predicted intention and panoramic observation. The emergent agent behavior, including observing humans, avoiding human collision, and navigation, reveals the efficiency of the proposed datasets and agents.

Related papers

UPTor: Unified 3D Human Pose Dynamics and Trajectory Prediction for Human-Robot Interaction [0.688204255655161]
We propose a technique to predict full-body pose and trajectory key-points in a global coordinate frame. We use an off-the-shelf 3D human pose estimation module, a graph attention network, and a compact, non-autoregressive transformer. In comparison to prior work, we show that our approach is compact, real-time, and accurate in predicting human navigation motion across all datasets.
arXiv Detail & Related papers (2025-05-20T19:57:25Z)
ForesightNav: Learning Scene Imagination for Efficient Exploration [57.49417653636244]
We propose ForesightNav, a novel exploration strategy inspired by human imagination and reasoning. Our approach equips robotic agents with the capability to predict contextual information, such as occupancy and semantic details, for unexplored regions. We validate our imagination-based approach using the Structured3D dataset, demonstrating accurate occupancy prediction and superior performance in anticipating unseen scene geometry.
arXiv Detail & Related papers (2025-04-22T17:38:38Z)
Dreaming to Assist: Learning to Align with Human Objectives for Shared Control in High-Speed Racing [10.947581892636629]
Tight coordination is required for effective human-robot teams in domains involving fast dynamics and tactical decisions. We present Dream2Assist, a framework that combines a rich world model able to infer human objectives and value functions. We show that the combined human-robot team, when blending its actions with those of the human, outperforms the synthetic humans alone.
arXiv Detail & Related papers (2024-10-14T01:00:46Z)
Aligning Robot Navigation Behaviors with Human Intentions and Preferences [2.9914612342004503]
This dissertation aims to answer the question: "How can we use machine learning methods to align the navigational behaviors of autonomous mobile robots with human intentions and preferences?" First, this dissertation introduces a new approach to learning navigation behaviors by imitating human-provided demonstrations of the intended navigation task. Second, this dissertation introduces two algorithms to enhance terrain-aware off-road navigation for mobile robots by learning visual terrain awareness in a self-supervised manner.
arXiv Detail & Related papers (2024-09-16T03:45:00Z)
Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots [119.55240471433302]
Habitat 3.0 is a simulation platform for studying collaborative human-robot tasks in home environments. It addresses challenges in modeling complex deformable bodies and diversity in appearance and motion. Human-in-the-loop infrastructure enables real human interaction with simulated robots via mouse/keyboard or a VR interface.
arXiv Detail & Related papers (2023-10-19T17:29:17Z)
Robots That Can See: Leveraging Human Pose for Trajectory Prediction [30.919756497223343]
We present a Transformer based architecture to predict human future trajectories in human-centric environments. The resulting model captures the inherent uncertainty for future human trajectory prediction. We identify new agents with limited historical data as a major contributor to error and demonstrate the complementary nature of 3D skeletal poses in reducing prediction error.
arXiv Detail & Related papers (2023-09-29T13:02:56Z)
Gesture2Path: Imitation Learning for Gesture-aware Navigation [54.570943577423094]
We present Gesture2Path, a novel social navigation approach that combines image-based imitation learning with model-predictive control. We deploy our method on real robots and showcase the effectiveness of our approach for the four gestures-navigation scenarios.
arXiv Detail & Related papers (2022-09-19T23:05:36Z)
GIMO: Gaze-Informed Human Motion Prediction in Context [75.52839760700833]
We propose a large-scale human motion dataset that delivers high-quality body pose sequences, scene scans, and ego-centric views with eye gaze. Our data collection is not tied to specific scenes, which further boosts the motion dynamics observed from our subjects. To realize the full potential of gaze, we propose a novel network architecture that enables bidirectional communication between the gaze and motion branches.
arXiv Detail & Related papers (2022-04-20T13:17:39Z)
Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation [9.456752543341464]
A key challenge on the path to developing agents that learn complex human-like behavior is the need to quickly and accurately quantify human-likeness. We address these limitations through a novel automated Navigation Turing Test (ANTT) that learns to predict human judgments of human-likeness.
arXiv Detail & Related papers (2021-05-20T10:14:23Z)
Active Visual Information Gathering for Vision-Language Navigation [115.40768457718325]
Vision-language navigation (VLN) is the task in which an agent must carry out navigational instructions inside photo-realistic environments. One of the key challenges in VLN is how to conduct robust navigation by mitigating the uncertainty caused by ambiguous instructions and insufficient observation of the environment. This work draws inspiration from human navigation behavior and endows an agent with an active information gathering ability for a more intelligent VLN policy.
arXiv Detail & Related papers (2020-07-15T23:54:20Z)
Visual Navigation Among Humans with Optimal Control as a Supervisor [72.5188978268463]
We propose an approach that combines learning-based perception with model-based optimal control to navigate among humans. Our approach is enabled by our novel data-generation tool, HumANav. We demonstrate that the learned navigation policies can anticipate and react to humans without explicitly predicting future human motion.
arXiv Detail & Related papers (2020-03-20T16:13:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.