EgoCogNav: Cognition-aware Human Egocentric Navigation
- URL: http://arxiv.org/abs/2511.17581v1
- Date: Sat, 15 Nov 2025 15:59:36 GMT
- Title: EgoCogNav: Cognition-aware Human Egocentric Navigation
- Authors: Zhiwen Qiu, Ziang Liu, Wenqian Niu, Tapomayukh Bhattacharjee, Saleh Kalantari,
- Abstract summary: EgoCogNav is a multimodal egocentric navigation framework that predicts perceived path uncertainty as a latent state. We show that EgoCogNav learns the perceived uncertainty that highly correlates with human-like behaviors such as scanning, hesitation, and backtracking while generalizing to unseen environments.
- Score: 6.817711914976566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modeling the cognitive and experiential factors of human navigation is central to deepening our understanding of human-environment interaction and to enabling safe social navigation and effective assistive wayfinding. Most existing methods focus on forecasting motions in fully observed scenes and often neglect human factors that capture how people feel and respond to space. To address this gap, we propose EgoCogNav, a multimodal egocentric navigation framework that predicts perceived path uncertainty as a latent state and jointly forecasts trajectories and head motion by fusing scene features with sensory cues. To facilitate research in the field, we introduce the Cognition-aware Egocentric Navigation (CEN) dataset, consisting of 6 hours of real-world egocentric recordings capturing diverse navigation behaviors in real-world scenarios. Experiments show that EgoCogNav learns the perceived uncertainty that highly correlates with human-like behaviors such as scanning, hesitation, and backtracking while generalizing to unseen environments.
Related papers
- HyPerNav: Hybrid Perception for Object-Oriented Navigation in Unknown Environment [9.835605248219586]
We propose Hybrid Perception Navigation (HyPerNav) to enhance the effectiveness and intelligence of navigation in unknown environments. Our method captures richer cues and finds objects more effectively by simultaneously leveraging information from egocentric observations and the top-down map.
arXiv Detail & Related papers (2025-10-27T01:43:56Z)
- Human-like Navigation in a World Built for Humans [23.303995665820846]
We present ReasonNav, a modular navigation system which integrates human-like navigation skills. We design compact input and output abstractions based on navigation landmarks. We show that ReasonNav successfully employs higher-order reasoning to navigate efficiently in large, complex buildings.
arXiv Detail & Related papers (2025-09-25T14:04:17Z)
- LookOut: Real-World Humanoid Egocentric Navigation [61.14016011125957]
We introduce the challenging problem of predicting a sequence of future 6D head poses from an egocentric video. To solve this task, we propose a framework that reasons over temporally aggregated 3D latent features. Motivated by the lack of training data in this space, we present a dataset collected through this approach.
arXiv Detail & Related papers (2025-08-20T06:43:36Z)
- ForesightNav: Learning Scene Imagination for Efficient Exploration [57.49417653636244]
We propose ForesightNav, a novel exploration strategy inspired by human imagination and reasoning. Our approach equips robotic agents with the capability to predict contextual information, such as occupancy and semantic details, for unexplored regions. We validate our imagination-based approach using the Structured3D dataset, demonstrating accurate occupancy prediction and superior performance in anticipating unseen scene geometry.
arXiv Detail & Related papers (2025-04-22T17:38:38Z)
- CoNav: A Benchmark for Human-Centered Collaborative Navigation [66.6268966718022]
We propose a collaborative navigation (CoNav) benchmark.
Our CoNav tackles the critical challenge of constructing a 3D navigation environment with realistic and diverse human activities.
We propose an intention-aware agent for reasoning both long-term and short-term human intention.
arXiv Detail & Related papers (2024-06-04T15:44:25Z)
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigational-specific visual representation learning method by contrasting the agent's egocentric views and semantic maps.
Ego$^2$-Map learning transfers the compact and rich information from a map, such as objects, structure and transition, to the agent's egocentric representations for navigation.
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
- SACSoN: Scalable Autonomous Control for Social Navigation [62.59274275261392]
We develop methods for training policies for socially unobtrusive navigation.
By minimizing this counterfactual perturbation, we can induce robots to behave in ways that do not alter the natural behavior of humans in the shared space.
We collect a large dataset where an indoor mobile robot interacts with human bystanders.
arXiv Detail & Related papers (2023-06-02T19:07:52Z)
- Egocentric Human Trajectory Forecasting with a Wearable Camera and Multi-Modal Fusion [24.149925005674145]
We address the problem of forecasting the trajectory of an egocentric camera wearer (ego-person) in crowded spaces.
The trajectory forecasting ability learned from the data of different camera wearers can be transferred to assist visually impaired people in navigation.
A Transformer-based encoder-decoder neural network model, integrated with a novel cascaded cross-attention mechanism, has been designed to predict the future trajectory of the camera wearer.
arXiv Detail & Related papers (2021-11-01T14:58:05Z)
- Active Visual Information Gathering for Vision-Language Navigation [115.40768457718325]
Vision-language navigation (VLN) is the task of requiring an agent to carry out navigational instructions inside photo-realistic environments.
One of the key challenges in VLN is how to navigate robustly by mitigating the uncertainty caused by ambiguous instructions and insufficient observation of the environment.
This work draws inspiration from human navigation behavior and endows an agent with an active information gathering ability for a more intelligent VLN policy.
arXiv Detail & Related papers (2020-07-15T23:54:20Z)
- Visual Navigation Among Humans with Optimal Control as a Supervisor [72.5188978268463]
We propose an approach that combines learning-based perception with model-based optimal control to navigate among humans.
Our approach is enabled by our novel data-generation tool, HumANav.
We demonstrate that the learned navigation policies can anticipate and react to humans without explicitly predicting future human motion.
arXiv Detail & Related papers (2020-03-20T16:13:47Z)