TAS: A Transit-Aware Strategy for Embodied Navigation with Non-Stationary Targets
- URL: http://arxiv.org/abs/2403.09905v4
- Date: Thu, 16 Oct 2025 18:41:47 GMT
- Title: TAS: A Transit-Aware Strategy for Embodied Navigation with Non-Stationary Targets
- Authors: Vishnu Sashank Dorbala, Bhrij Patel, Amrit Singh Bedi, Dinesh Manocha
- Abstract summary: We present a novel algorithm for navigation in dynamic scenarios with non-stationary targets. Our novel Transit-Aware Strategy (TAS) enriches embodied navigation policies with object path information. TAS improves performance in non-stationary environments by rewarding agents for synchronizing their routes with target routes.
- Score: 55.09248760290918
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Embodied navigation methods commonly operate in static environments with stationary targets. In this work, we present a new algorithm for navigation in dynamic scenarios with non-stationary targets. Our novel Transit-Aware Strategy (TAS) enriches embodied navigation policies with object path information. TAS improves performance in non-stationary environments by rewarding agents for synchronizing their routes with target routes. To evaluate TAS, we further introduce Dynamic Object Maps (DOMs), a dynamic variant of node-attributed topological graphs with structured object transitions. DOMs are inspired by human habits to simulate realistic object routes on a graph. Our experiments show that on average, TAS improves agent Success Rate (SR) by 21.1 in non-stationary environments, while also generalizing better from static environments by 44.5% when measured by Relative Change in Success (RCS). We qualitatively investigate TAS-agent performance on DOMs and draw various inferences to help better model generalist navigation policies. To the best of our knowledge, ours is the first work that quantifies the adaptability of embodied navigation methods in non-stationary environments. Code and data for our benchmark will be made publicly available.
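The two core ideas, a graph whose objects follow structured routes (a DOM) and a reward that favors route synchronization, can be sketched in a few lines of Python. The graph, the target's habit-like route, and the exact reward values below are illustrative assumptions, not the paper's formulation.

```python
# Toy Dynamic Object Map: nodes are rooms, edges are transits, and the
# target object follows a structured (habit-like) route over the graph.
# Graph, route, and reward shaping are illustrative assumptions.
GRAPH = {
    "kitchen": ["hall"],
    "hall": ["kitchen", "office", "bedroom"],
    "office": ["hall"],
    "bedroom": ["hall"],
}

# Structured object transitions: the target cycles through a daily route.
TARGET_ROUTE = ["kitchen", "hall", "office", "office", "hall", "bedroom"]

def transit_aware_reward(agent_node, target_node, target_prev):
    """Success reward plus a small synchronization bonus when the agent
    sits on the target's most recent node (i.e. it is tracking the route)."""
    if agent_node == target_node:
        return 1.0    # caught the target
    if agent_node == target_prev:
        return 0.1    # one step behind on the target's route
    return -0.01      # small step penalty

def episode_return(agent_route):
    total = 0.0
    for t, agent_node in enumerate(agent_route):
        prev = TARGET_ROUTE[t - 1] if t > 0 else None
        total += transit_aware_reward(agent_node, TARGET_ROUTE[t], prev)
    return total
```

Any policy evaluated with `episode_return` is implicitly rewarded for moving in phase with the target's route, which is the behavior TAS is said to encourage.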
Related papers
- GIANT - Global Path Integration and Attentive Graph Networks for Multi-Agent Trajectory Planning [4.019914376054815]
This paper presents a novel approach to multi-robot collision avoidance that integrates global path planning with local navigation strategies. We introduce a local navigation model that leverages pre-planned global paths, allowing robots to adhere to optimal routes while dynamically adjusting to environmental changes. Our approach is evaluated against established baselines, including NH-ORCA, DRL-NAV, and GA3C-CADRL, across various structurally diverse simulated scenarios.
arXiv Detail & Related papers (2026-03-04T22:45:53Z) - Dynamic Topology Awareness: Breaking the Granularity Rigidity in Vision-Language Navigation [22.876516699004814]
Vision-Language Navigation in Continuous Environments (VLN-CE) presents a core challenge: grounding high-level linguistic instructions into precise, safe, and long-horizon spatial actions. Explicit topological maps have proven to be a vital solution for providing robust spatial memory in such tasks. Existing topological planning methods suffer from a "Granularity Rigidity" problem. We propose DGNav, a framework for Dynamic Topological Navigation, introducing a context-aware mechanism to modulate map density and connectivity on-the-fly.
arXiv Detail & Related papers (2026-01-29T14:06:23Z) - Nav-$R^2$ Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation [67.68165784193556]
Nav-$R^2$ is a framework that explicitly models two types of relationships: target-environment modeling and environment-action planning. Our SA-Mem preserves the most target-relevant and current observation-relevant features from both temporal and semantic perspectives. Nav-$R^2$ achieves state-of-the-art performance in localizing unseen objects through a streamlined and efficient pipeline.
arXiv Detail & Related papers (2025-12-02T04:21:02Z) - MetricNet: Recovering Metric Scale in Generative Navigation Policies [51.90872764552077]
MetricNet is an effective add-on for generative navigation that predicts the metric distance between waypoints. We show that executing MetricNet-scaled waypoints significantly improves both navigation and exploration performance. We also propose MetricNav, which integrates MetricNet into a navigation policy to guide the robot away from obstacles while still moving towards the goal.
arXiv Detail & Related papers (2025-09-17T13:37:13Z) - TANGO: Traversability-Aware Navigation with Local Metric Control for Topological Goals [10.69725316052444]
We present a novel RGB-only, object-level topometric navigation pipeline that enables zero-shot, long-horizon robot navigation. Our approach integrates global topological path planning with local metric trajectory control, allowing the robot to navigate towards object-level sub-goals while avoiding obstacles. We demonstrate the effectiveness of our method in both simulated environments and real-world tests, highlighting its robustness and deployability.
arXiv Detail & Related papers (2025-09-10T15:43:32Z) - DAgger Diffusion Navigation: DAgger Boosted Diffusion Policy for Vision-Language Navigation [73.80968452950854]
Vision-Language Navigation in Continuous Environments (VLN-CE) requires agents to follow natural language instructions through free-form 3D spaces. Existing VLN-CE approaches typically use a two-stage waypoint planning framework. We propose DAgger Diffusion Navigation (DifNav) as an end-to-end optimized VLN-CE policy.
arXiv Detail & Related papers (2025-08-13T02:51:43Z) - SemNav: A Model-Based Planner for Zero-Shot Object Goal Navigation Using Vision-Foundation Models [10.671262416557704]
Vision Foundation Models (VFMs) offer powerful capabilities for visual understanding and reasoning. We present a zero-shot object goal navigation framework that integrates the perceptual strength of VFMs with a model-based planner. We evaluate our approach on the HM3D dataset using the Habitat simulator and demonstrate that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-06-04T03:04:54Z) - MPVO: Motion-Prior based Visual Odometry for PointGoal Navigation [3.9974562667271507]
Visual odometry (VO) is essential for enabling accurate point-goal navigation of embodied agents in indoor environments.
Recent deep-learned VO methods show robust performance but suffer from sample inefficiency during training.
We propose a robust and sample-efficient VO pipeline based on motion priors available while an agent is navigating an environment.
arXiv Detail & Related papers (2024-11-07T15:36:49Z) - Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments [44.6372390798904]
We propose a new task, Personalized Instance-based Navigation (PIN), in which an embodied agent is tasked with locating and reaching a specific personal object.
In each episode, the target object is presented to the agent using two modalities: a set of visual reference images on a neutral background and manually annotated textual descriptions.
arXiv Detail & Related papers (2024-10-23T18:01:09Z) - OpenObject-NAV: Open-Vocabulary Object-Oriented Navigation Based on Dynamic Carrier-Relationship Scene Graph [10.475404599532157]
This paper captures the relationships between frequently used objects and their static carriers.
We propose an instance navigation strategy that models the navigation process as a Markov Decision Process.
The results demonstrate that by updating the CRSG, the robot can efficiently navigate to moved targets.
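Framing graph navigation as a Markov Decision Process can be illustrated with tabular value iteration; the states, step cost, and discount below are illustrative assumptions rather than the paper's CRSG-based formulation.

```python
# Minimal MDP framing of graph navigation: a step costs -1, reaching the
# goal is terminal, and value iteration yields a greedy routing policy.
# This toy formulation is an assumption, not the paper's exact model.
def value_iteration(states, neighbors, goal, gamma=0.9, iters=50):
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        for s in states:
            if s == goal:
                V[s] = 0.0  # terminal state: no further cost
                continue
            # Bellman backup over deterministic moves to neighbors
            V[s] = max(-1.0 + gamma * V[n] for n in neighbors[s])
    return V

def greedy_step(s, neighbors, V):
    """Follow the value function: move to the highest-value neighbor."""
    return max(neighbors[s], key=lambda n: V[n])
```

On a chain graph a-b-c with goal c, the greedy policy walks a → b → c, i.e. the value function encodes (discounted) negative distance to the goal.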
arXiv Detail & Related papers (2024-09-27T13:33:52Z) - Aligning Knowledge Graph with Visual Perception for Object-goal Navigation [16.32780793344835]
We propose the Aligning Knowledge Graph with Visual Perception (AKGVP) method for object-goal navigation.
Our approach introduces continuous modeling of the hierarchical scene architecture and leverages visual-language pre-training to align natural language description with visual perception.
The integration of a continuous knowledge graph architecture and multimodal feature alignment empowers the navigator with a remarkable zero-shot navigation capability.
arXiv Detail & Related papers (2024-02-29T06:31:18Z) - NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration [57.15811390835294]
This paper describes how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration.
We show that this unified policy results in better overall performance when navigating to visually indicated goals in novel environments.
Our experiments, conducted on a real-world mobile robot platform, show effective navigation in unseen environments in comparison with five alternative methods.
arXiv Detail & Related papers (2023-10-11T21:07:14Z) - Object Goal Navigation with Recursive Implicit Maps [92.6347010295396]
We propose an implicit spatial map for object goal navigation.
Our method significantly outperforms the state of the art on the challenging MP3D dataset.
We deploy our model on a real robot and achieve encouraging object goal navigation results in real scenes.
arXiv Detail & Related papers (2023-08-10T14:21:33Z) - Task-Driven Graph Attention for Hierarchical Relational Object
Navigation [25.571175038938527]
Embodied AI agents in large scenes often need to navigate to find objects.
We study a naturally emerging variant of the object navigation task, hierarchical relational object navigation (HRON).
We propose a solution that uses scene graphs as part of its input and integrates graph neural networks as its backbone.
arXiv Detail & Related papers (2023-06-23T19:50:48Z) - How To Not Train Your Dragon: Training-free Embodied Object Goal Navigation with Semantic Frontiers [94.46825166907831]
We present a training-free solution to tackle the object goal navigation problem in Embodied AI.
Our method builds a structured scene representation based on the classic visual simultaneous localization and mapping (V-SLAM) framework.
Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.
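Propagating semantic scores over a scene graph can be sketched as a simple max-decay diffusion; the graph, seed scores, and decay factor below are illustrative assumptions, not the paper's language-prior model.

```python
# Toy semantic propagation on a scene graph: each node inherits a decayed
# fraction of its best neighbor's relevance score, so frontiers near
# semantically promising regions get prioritized. Decay value is assumed.
def propagate(graph, scores, decay=0.5, iters=3):
    for _ in range(iters):
        new = {}
        for node, nbrs in graph.items():
            nbr_max = max((scores[n] for n in nbrs), default=0.0)
            # keep the node's own score, or inherit a decayed neighbor score
            new[node] = max(scores.get(node, 0.0), decay * nbr_max)
        scores = new
    return scores
```

Seeding one node with score 1.0 on a chain a-b-c spreads 0.5 to its neighbor and 0.25 two hops out, a geometric falloff with distance.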
arXiv Detail & Related papers (2023-05-26T13:38:33Z) - Can an Embodied Agent Find Your "Cat-shaped Mug"? LLM-Guided Exploration for Zero-Shot Object Navigation [58.3480730643517]
We present LGX, a novel algorithm for Language-Driven Zero-Shot Object Goal Navigation (L-ZSON).
Our approach makes use of Large Language Models (LLMs) for this task.
We achieve state-of-the-art zero-shot object navigation results on RoboTHOR with a success rate (SR) improvement of over 27% over the current baseline.
arXiv Detail & Related papers (2023-03-06T20:19:19Z) - A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations [20.15854546504947]
We propose a modular framework for object-nav that is able to efficiently search indoor environments for not just static objects but also movable objects.
Our contextual-bandit agent efficiently explores the environment by showing optimism in the face of uncertainty.
We evaluate our algorithms in two simulated environments and a real-world setting, to demonstrate high sample efficiency and reliability.
arXiv Detail & Related papers (2022-11-29T15:48:54Z) - Object Memory Transformer for Object Goal Navigation [10.359616364592075]
This paper presents a reinforcement learning method for object goal navigation (ObjectNav).
An agent navigates in 3D indoor environments to reach a target object based on long-term observations of objects and scenes.
To the best of our knowledge, this is the first work that uses a long-term memory of object semantics in a goal-oriented navigation task.
arXiv Detail & Related papers (2022-03-24T09:16:56Z) - Object Manipulation via Visual Target Localization [64.05939029132394]
Training agents to manipulate objects poses many challenges.
We propose an approach that explores the environment in search for target objects, computes their 3D coordinates once they are located, and then continues to estimate their 3D locations even when the objects are not visible.
Our evaluations show a massive 3x improvement in success rate over a model that has access to the same sensory suite.
arXiv Detail & Related papers (2022-03-15T17:59:01Z) - Navigating to Objects in Unseen Environments by Distance Prediction [16.023495311387478]
We propose an object goal navigation framework, which could directly perform path planning based on an estimated distance map.
Specifically, our model takes a birds-eye-view semantic map as input, and estimates the distance from the map cells to the target object.
With the estimated distance map, the agent could explore the environment and navigate to the target objects based on either human-designed or learned navigation policy.
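Navigation by distance map can be sketched by computing the map exactly with BFS (standing in for the learned estimator, which is the paper's actual contribution) and then descending it greedily.

```python
from collections import deque

# BFS distance map over a grid map (0 = free, 1 = obstacle), then a
# greedy step that descends the map toward the goal. The exact-BFS map
# is an assumption replacing the paper's learned distance estimator.
def distance_map(grid, goal):
    h, w = len(grid), len(grid[0])
    dist = [[float("inf")] * w for _ in range(h)]
    dist[goal[0]][goal[1]] = 0
    q = deque([goal])
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < h and 0 <= nc < w and grid[nr][nc] == 0
                    and dist[nr][nc] > dist[r][c] + 1):
                dist[nr][nc] = dist[r][c] + 1
                q.append((nr, nc))
    return dist

def descend_step(dist, pos):
    """Move to the in-bounds neighbor with the smallest distance value."""
    r, c = pos
    moves = [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))]
    return min((m for m in moves
                if 0 <= m[0] < len(dist) and 0 <= m[1] < len(dist[0])),
               key=lambda m: dist[m[0]][m[1]])
```

Obstacle cells keep an infinite distance, so greedy descent routes around them automatically.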
arXiv Detail & Related papers (2022-02-08T09:22:50Z) - SOON: Scenario Oriented Object Navigation with Graph-based Exploration [102.74649829684617]
The ability to navigate like a human towards a language-guided target from anywhere in a 3D embodied environment is one of the 'holy grail' goals of intelligent robots.
Most visual navigation benchmarks focus on navigating toward a target from a fixed starting point, guided by an elaborate set of step-by-step instructions. This approach deviates from real-world problems, in which a human only describes what the object and its surroundings look like and asks the robot to start navigating from anywhere.
arXiv Detail & Related papers (2021-03-31T15:01:04Z) - POMP: Pomcp-based Online Motion Planning for active visual search in indoor environments [89.43830036483901]
We focus on the problem of learning an optimal policy for Active Visual Search (AVS) of objects in known indoor environments with an online setup.
Our POMP method takes as input the current pose of an agent and an RGB-D frame.
We validate our method on the publicly available AVD benchmark, achieving an average success rate of 0.76 with an average path length of 17.1.
arXiv Detail & Related papers (2020-09-17T08:23:50Z) - Object Goal Navigation using Goal-Oriented Semantic Exploration [98.14078233526476]
This work studies the problem of object goal navigation which involves navigating to an instance of the given object category in unseen environments.
We propose a modular system called 'Goal-Oriented Semantic Exploration', which builds an episodic semantic map and uses it to explore the environment efficiently.
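The exploration side of such a system can be sketched with classic frontier selection on an occupancy map; the map encoding (-1 unknown, 0 free, 1 obstacle) and the nearest-frontier heuristic are illustrative assumptions, not the paper's learned goal-selection policy.

```python
# Toy frontier-based exploration: a frontier is a free cell adjacent to
# unknown space; the agent heads for the nearest one (Manhattan distance).
# Map encoding: -1 = unknown, 0 = free, 1 = obstacle (an assumption).
def frontiers(grid):
    h, w = len(grid), len(grid[0])
    out = []
    for r in range(h):
        for c in range(w):
            if grid[r][c] != 0:
                continue  # only free cells can be frontiers
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < h and 0 <= nc < w and grid[nr][nc] == -1:
                    out.append((r, c))
                    break
    return out

def nearest_frontier(grid, pos):
    cand = frontiers(grid)
    if not cand:
        return None  # map fully explored
    return min(cand, key=lambda f: abs(f[0] - pos[0]) + abs(f[1] - pos[1]))
```

A semantic variant would replace the distance heuristic with a score derived from the episodic semantic map, which is where the paper's contribution lies.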
arXiv Detail & Related papers (2020-07-01T17:52:32Z) - ObjectNav Revisited: On Evaluation of Embodied Agents Navigating to Objects [119.46959413000594]
This document summarizes the consensus recommendations of a working group on ObjectNav.
We make recommendations on subtle but important details of evaluation criteria.
We provide a detailed description of the instantiation of these recommendations in challenges organized at the Embodied AI workshop at CVPR 2020.
arXiv Detail & Related papers (2020-06-23T17:18:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.