Sparse Image based Navigation Architecture to Mitigate the need of
precise Localization in Mobile Robots
- URL: http://arxiv.org/abs/2203.15272v1
- Date: Tue, 29 Mar 2022 06:38:18 GMT
- Title: Sparse Image based Navigation Architecture to Mitigate the need of
precise Localization in Mobile Robots
- Authors: Pranay Mathur, Rajesh Kumar, Sarthak Upadhyay
- Abstract summary: This paper focuses on mitigating the need for exact localization of a mobile robot to pursue autonomous navigation using a sparse set of images.
The proposed method consists of a model architecture - RoomNet, for unsupervised learning resulting in a coarse identification of the environment - and a separate local navigation policy.
The latter uses sparse image matching to characterise the similarity of frames achieved vis-a-vis the frames viewed by the robot during the mapping and training stage.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional simultaneous localization and mapping (SLAM) methods focus on
improvement in the robot's localization under environment and sensor
uncertainty. This paper, however, focuses on mitigating the need for exact
localization of a mobile robot to pursue autonomous navigation using a sparse
set of images. The proposed method consists of a model architecture - RoomNet,
for unsupervised learning resulting in a coarse identification of the
environment and a separate local navigation policy for local identification and
navigation. The former learns and predicts the scene based on the short term
image sequences seen by the robot along with the transition image scenarios
using long term image sequences. The latter uses sparse image matching to
characterise the similarity of frames achieved vis-a-vis the frames viewed by
the robot during the mapping and training stage. A sparse graph of the image
sequence is created which is then used to carry out robust navigation purely on
the basis of visual goals. The proposed approach is evaluated on two robots in
a test environment and demonstrates the ability to navigate in dynamic
environments where landmarks are obscured and classical localization methods
fail.
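The sparse-graph navigation idea in the abstract can be sketched roughly as follows: keyframes become graph nodes, edges connect frames whose match score clears a threshold, and navigating to a visual goal reduces to a path search over that graph. This is a minimal illustration under assumed names; the `match_score` stand-in and the BFS planner are not the authors' implementation.

```python
# Toy sketch of sparse image-graph navigation. Descriptors are modeled as
# sets of feature tokens; real systems would use visual feature matching.
from collections import deque

def match_score(desc_a, desc_b):
    """Stand-in similarity: fraction of shared descriptors between frames."""
    shared = len(set(desc_a) & set(desc_b))
    return shared / max(len(desc_a), len(desc_b), 1)

def build_sparse_graph(keyframes, threshold=0.3):
    """keyframes: list of descriptor sets; returns an adjacency dict."""
    graph = {i: [] for i in range(len(keyframes))}
    for i in range(len(keyframes)):
        for j in range(i + 1, len(keyframes)):
            if match_score(keyframes[i], keyframes[j]) >= threshold:
                graph[i].append(j)
                graph[j].append(i)
    return graph

def plan_to_goal(graph, start, goal):
    """BFS over the sparse graph: the route is a sequence of visual goals."""
    queue, parent = deque([start]), {start: None}
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # goal not reachable in the sparse graph
```

Because the planner only ever outputs the next keyframe to match against, no metric pose estimate is needed, which is the point the abstract makes about navigating "purely on the basis of visual goals."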
Related papers
- Learning Where to Look: Self-supervised Viewpoint Selection for Active Localization using Geometrical Information [68.10033984296247]
This paper explores the domain of active localization, emphasizing the importance of viewpoint selection to enhance localization accuracy.
Our contributions involve using a data-driven approach with a simple architecture designed for real-time operation, a self-supervised data training method, and the capability to consistently integrate our map into a planning framework tailored for real-world robotics applications.
arXiv Detail & Related papers (2024-07-22T12:32:09Z)
- Mapping High-level Semantic Regions in Indoor Environments without Object Recognition [50.624970503498226]
The present work proposes a method for semantic region mapping via embodied navigation in indoor environments.
To enable region identification, the method uses a vision-to-language model to provide scene information for mapping.
By projecting egocentric scene understanding into the global frame, the proposed method generates a semantic map as a distribution over possible region labels at each location.
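The "distribution over possible region labels at each location" can be illustrated with a small sketch: each map cell holds a probability distribution over labels, updated as projected egocentric observations arrive. The labels and the simple evidence-accumulation update below are assumptions for illustration, not the paper's actual update rule.

```python
# Toy per-cell semantic map update: accumulate observation likelihoods per
# label, then renormalise so each cell remains a probability distribution.

def update_cell(dist, observation, weight=1.0):
    """dist: {label: prob}; observation: {label: likelihood} for this view."""
    updated = {
        label: dist.get(label, 0.0) + weight * observation.get(label, 0.0)
        for label in set(dist) | set(observation)
    }
    total = sum(updated.values()) or 1.0
    return {label: value / total for label, value in updated.items()}
```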
arXiv Detail & Related papers (2024-03-11T18:09:50Z)
- Interactive Semantic Map Representation for Skill-based Visual Object Navigation [43.71312386938849]
This paper introduces a new representation of a scene semantic map formed during the embodied agent interaction with the indoor environment.
We have implemented this representation into a full-fledged navigation approach called SkillTron.
The proposed approach makes it possible to form both intermediate goals for robot exploration and the final goal for object navigation.
arXiv Detail & Related papers (2023-11-07T16:30:12Z)
- NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration [57.15811390835294]
This paper describes how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration.
We show that this unified policy results in better overall performance when navigating to visually indicated goals in novel environments.
Our experiments, conducted on a real-world mobile robot platform, show effective navigation in unseen environments in comparison with five alternative methods.
arXiv Detail & Related papers (2023-10-11T21:07:14Z)
- Location-Aware Self-Supervised Transformers [74.76585889813207]
We propose to pretrain networks for semantic segmentation by predicting the relative location of image parts.
We control the difficulty of the task by masking a subset of the reference patch features visible to those of the query.
Our experiments show that this location-aware pretraining leads to representations that transfer competitively to several challenging semantic segmentation benchmarks.
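The relative-location pretext task described above can be made concrete with the label construction alone: given patch indices on a grid, the training target is the row/column offset of the query patch relative to the reference. The grid layout and function name are illustrative assumptions; the paper's masking scheme is not reproduced here.

```python
# Label construction for a relative-location pretext task: patches are
# indexed row-major on a grid; the target is the query's offset from the
# reference patch.

def relative_location_label(ref_index, query_index, grid_width):
    """Return (row_offset, col_offset) of the query patch relative to the
    reference patch, both given as row-major indices on the grid."""
    ref_r, ref_c = divmod(ref_index, grid_width)
    q_r, q_c = divmod(query_index, grid_width)
    return (q_r - ref_r, q_c - ref_c)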
arXiv Detail & Related papers (2022-12-05T16:24:29Z)
- UNav: An Infrastructure-Independent Vision-Based Navigation System for People with Blindness and Low Vision [4.128685217530067]
We propose a vision-based localization pipeline for navigation support for end-users with blindness and low vision.
Given a query image taken by an end-user on a mobile application, the pipeline leverages a visual place recognition (VPR) algorithm to find similar images in a reference image database.
A customized user interface projects a 3D reconstructed sparse map, built from a sequence of images, onto the corresponding a priori 2D floor plan.
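The VPR retrieval step described above amounts to nearest-neighbor search over a reference database of image descriptors. The cosine-similarity ranking below is a generic sketch under assumed descriptor vectors; UNav's actual VPR backend and descriptor type are not specified here.

```python
# Generic visual-place-recognition retrieval: rank reference images by
# cosine similarity of global descriptors and return the top-k matches.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_desc, database, top_k=3):
    """database: list of (image_id, descriptor); returns best-matching ids."""
    scored = sorted(database, key=lambda item: cosine(query_desc, item[1]),
                    reverse=True)
    return [image_id for image_id, _ in scored[:top_k]]
```

In practice the linear scan would be replaced by an approximate nearest-neighbor index for large reference databases.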
arXiv Detail & Related papers (2022-09-22T22:21:37Z)
- Semantic Image Alignment for Vehicle Localization [111.59616433224662]
We present a novel approach to vehicle localization in dense semantic maps using semantic segmentation from a monocular camera.
In contrast to existing visual localization approaches, the system does not require additional keypoint features, handcrafted localization landmark extractors or expensive LiDAR sensors.
arXiv Detail & Related papers (2021-10-08T14:40:15Z)
- SeanNet: Semantic Understanding Network for Localization Under Object Dynamics [14.936899865448892]
Under the object-level scene dynamics induced by human daily activities, a robot needs to robustly localize itself in the environment.
Previous works have addressed visual-based localization in static environments, yet the object-level scene dynamics challenge existing methods on long-term deployment of the robot.
This paper proposes SEmantic understANding Network (SeanNet) that enables robots to measure the similarity between two scenes on both visual and semantic aspects.
arXiv Detail & Related papers (2021-10-05T18:29:07Z)
- Navigation-Oriented Scene Understanding for Robotic Autonomy: Learning to Segment Driveability in Egocentric Images [25.350677396144075]
This work tackles scene understanding for outdoor robotic navigation, solely relying on images captured by an on-board camera.
We segment egocentric images directly in terms of how a robot can navigate in them, and tailor the learning problem to an autonomous navigation task.
We present a generic and scalable affordance-based definition consisting of 3 driveability levels which can be applied to arbitrary scenes.
arXiv Detail & Related papers (2021-09-15T12:25:56Z)
- Cross-Descriptor Visual Localization and Mapping [81.16435356103133]
Visual localization and mapping is the key technology underlying the majority of Mixed Reality and robotics systems.
We present three novel scenarios for localization and mapping which require the continuous update of feature representations.
Our data-driven approach is agnostic to the feature descriptor type, has low computational requirements, and scales linearly with the number of description algorithms.
arXiv Detail & Related papers (2020-12-02T18:19:51Z)
- Spatial Action Maps for Mobile Manipulation [30.018835572458844]
We show that it can be advantageous to learn with dense action representations defined in the same domain as the state.
We present "spatial action maps," in which the set of possible actions is represented by a pixel map.
We find that policies learned with spatial action maps achieve much better performance than traditional alternatives.
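The spatial-action-map idea can be sketched in a few lines: the policy outputs one value per pixel of the state image, and acting means selecting the highest-valued pixel as a target. The hand-made value map in the test is a toy stand-in for a learned network output, not the paper's trained policy.

```python
# Toy action selection over a spatial action map: the action space is the
# pixel grid itself, so the chosen action is the argmax pixel coordinate.

def select_action(action_map):
    """action_map: 2D list of per-pixel action values.
    Returns the (row, col) of the best-scoring pixel."""
    best, best_pos = float("-inf"), (0, 0)
    for r, row in enumerate(action_map):
        for c, value in enumerate(row):
            if value > best:
                best, best_pos = value, (r, c)
    return best_pos
```

Because actions live in the same spatial domain as the state image, the paper argues the mapping from observations to actions is easier to learn than with abstract action vectors.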
arXiv Detail & Related papers (2020-04-20T09:06:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.