Polyline Based Generative Navigable Space Segmentation for Autonomous
Visual Navigation
- URL: http://arxiv.org/abs/2111.00063v1
- Date: Fri, 29 Oct 2021 19:50:48 GMT
- Title: Polyline Based Generative Navigable Space Segmentation for Autonomous
Visual Navigation
- Authors: Zheng Chen, Zhengming Ding, David Crandall, Lantao Liu
- Abstract summary: We propose a representation-learning-based framework to enable robots to learn the navigable space segmentation in an unsupervised manner.
We show that the proposed PSV-Nets can learn the visual navigable space with high accuracy, even without a single label.
- Score: 57.3062528453841
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting navigable space is a fundamental capability for mobile robots
navigating in unknown or unmapped environments. In this work, we treat the
visual navigable space segmentation as a scene decomposition problem and
propose Polyline Segmentation Variational AutoEncoder Networks (PSV-Nets), a
representation-learning-based framework to enable robots to learn the navigable
space segmentation in an unsupervised manner. Current segmentation techniques
heavily rely on supervised learning strategies, which demand large amounts of
pixel-level annotated images. In contrast, the proposed framework leverages a
generative model - Variational AutoEncoder (VAE) and an AutoEncoder (AE) to
learn a polyline representation that compactly outlines the desired navigable
space boundary in an unsupervised way. We also propose a visual receding
horizon planning method that uses the learned navigable space and a Scaled
Euclidean Distance Field (SEDF) to achieve autonomous navigation without an
explicit map. Through extensive experiments, we have validated that the
proposed PSV-Nets can learn the visual navigable space with high accuracy, even
without a single label. We also show that the prediction of the PSV-Nets can be
further improved with a small number of labels (if available) and can
significantly outperform state-of-the-art fully supervised segmentation methods.
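To make the polyline idea concrete, below is a minimal sketch of a VAE that predicts a polyline (one boundary-row coordinate per image-column bin) instead of a dense per-pixel mask. The architecture, layer sizes, and training objective are illustrative assumptions, not the authors' PSV-Net design.

```python
# Minimal polyline-predicting VAE sketch (hypothetical sizes, not the PSV-Net architecture).
import torch
import torch.nn as nn

class PolylineVAE(nn.Module):
    def __init__(self, num_vertices=16, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(                            # image -> feature vector
            nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.to_mu = nn.Linear(32, latent_dim)
        self.to_logvar = nn.Linear(32, latent_dim)
        self.decoder = nn.Sequential(                            # latent -> one boundary row per column bin
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, num_vertices), nn.Sigmoid())           # rows normalized to [0, 1]

    def forward(self, img):
        h = self.encoder(img)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar                       # (B, num_vertices) vertex heights

# Polyline vertices: evenly spaced column centers paired with the predicted rows.
rows, mu, logvar = PolylineVAE()(torch.rand(1, 3, 120, 160))
cols = torch.linspace(0, 1, rows.shape[1])
polyline = torch.stack([cols, rows[0]], dim=1)                   # (num_vertices, 2), normalized image coords
# Unsupervised training would add a reconstruction term (via the companion AE) and a KL term.
```

The Scaled Euclidean Distance Field can likewise be sketched with a standard distance transform over the predicted navigable region; the min-max scaling used here is an assumption, since this summary does not specify the paper's exact scaling.

```python
# Hedged sketch of a Scaled Euclidean Distance Field (SEDF) over a navigable-space mask.
import numpy as np
from scipy.ndimage import distance_transform_edt

def scaled_edf(navigable_mask):
    """navigable_mask: HxW boolean array, True where the pixel is predicted navigable."""
    dist = distance_transform_edt(navigable_mask)   # distance to the nearest non-navigable pixel
    return dist / (dist.max() + 1e-8)               # scale to [0, 1] (assumed normalization)

mask = np.zeros((120, 160), dtype=bool)
mask[60:, :] = True                                 # e.g., the lower half of the image is navigable
sedf = scaled_edf(mask)
# A visual receding-horizon planner can then score candidate image-space waypoints by combining
# progress toward the goal with the SEDF value, preferring waypoints far from the boundary.
```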
Related papers
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigation-specific visual representation learning method by contrasting the agent's egocentric views and semantic maps.
Ego²-Map learning transfers the compact and rich information from a map, such as objects, structure, and transitions, to the agent's egocentric representations for navigation.
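A hedged sketch of the contrastive idea above, assuming an InfoNCE-style objective between egocentric-view embeddings and semantic-map embeddings (the embedding dimensions and temperature are placeholders, not the Ego²-Map implementation):

```python
# Illustrative view-to-map contrastive loss (assumed InfoNCE form, hypothetical hyperparameters).
import torch
import torch.nn.functional as F

def view_map_contrastive_loss(view_emb, map_emb, temperature=0.07):
    """view_emb, map_emb: (B, D) embeddings of matched egocentric views and semantic maps."""
    v = F.normalize(view_emb, dim=1)
    m = F.normalize(map_emb, dim=1)
    logits = v @ m.t() / temperature                       # (B, B); diagonal entries are positive pairs
    targets = torch.arange(v.size(0), device=v.device)
    return F.cross_entropy(logits, targets)

loss = view_map_contrastive_loss(torch.randn(8, 128), torch.randn(8, 128))
```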
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
- ViNT: A Foundation Model for Visual Navigation [52.2571739391896]
Visual Navigation Transformer (ViNT) is a foundation model for vision-based robotic navigation.
ViNT is trained with a general goal-reaching objective that can be used with any navigation dataset.
It exhibits positive transfer, outperforming specialist models trained on singular datasets.
arXiv Detail & Related papers (2023-06-26T16:57:03Z)
- Rethinking Range View Representation for LiDAR Segmentation [66.73116059734788]
"Many-to-one" mapping, semantic incoherence, and shape deformation are possible impediments against effective learning from range view projections.
We present RangeFormer, a full-cycle framework comprising novel designs across network architecture, data augmentation, and post-processing.
We show that, for the first time, a range view method is able to surpass point-, voxel-, and multi-view-fusion-based counterparts on competitive LiDAR semantic and panoptic segmentation benchmarks.
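The "many-to-one" issue stems from projecting an unordered 3D point cloud onto a 2D range image, where several points can fall into the same pixel. A minimal numpy sketch of such a spherical projection follows; the image size and vertical field of view are assumed values, not RangeFormer's configuration.

```python
# Illustrative spherical (range-view) projection of a LiDAR point cloud.
import numpy as np

def range_view_projection(points, H=64, W=1024, fov_up=15.0, fov_down=-25.0):
    """points: (N, 3) xyz. Returns an HxW range image; overlapping points keep the nearest one."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1) + 1e-8
    yaw = np.arctan2(y, x)                                   # azimuth in [-pi, pi]
    pitch = np.arcsin(z / r)                                 # elevation
    up, down = np.radians(fov_up), np.radians(fov_down)

    u = np.clip(np.floor(0.5 * (1.0 - yaw / np.pi) * W), 0, W - 1).astype(np.int64)
    v = np.clip(np.floor((1.0 - (pitch - down) / (up - down)) * H), 0, H - 1).astype(np.int64)

    # "Many-to-one": write points in order of decreasing range so the nearest point wins.
    order = np.argsort(-r)
    image = np.zeros((H, W), dtype=np.float32)
    image[v[order], u[order]] = r[order]
    return image

range_img = range_view_projection(np.random.randn(100000, 3) * 20.0)
```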
arXiv Detail & Related papers (2023-03-09T16:13:27Z)
- Deep Learning Computer Vision Algorithms for Real-time UAVs On-board Camera Image Processing [77.34726150561087]
This paper describes how advanced deep learning based computer vision algorithms are applied to enable real-time on-board sensor processing for small UAVs.
All algorithms have been developed using state-of-the-art image processing methods based on deep neural networks.
arXiv Detail & Related papers (2022-11-02T11:10:42Z)
- Transferring ConvNet Features from Passive to Active Robot Self-Localization: The Use of Ego-Centric and World-Centric Views [2.362412515574206]
A standard visual place recognition (VPR) subsystem is assumed to be available, and we propose to transfer its domain-invariant state recognition ability to train a domain-invariant next-best-view (NBV) planner.
We divide the visual cues available from the CNN model into two types: the output layer cue (OLC) and the intermediate layer cue (ILC).
In our framework, the ILC and OLC are mapped to a state vector and subsequently used to train a multiview NBV planner via deep reinforcement learning.
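As a hedged sketch of how such cues could be assembled, the snippet below pulls an intermediate-layer cue and the output-layer cue from a pretrained CNN with a forward hook and concatenates them into a state vector; the backbone, chosen layer, and pooling are assumptions, not the paper's exact setup.

```python
# Illustrative extraction of an intermediate layer cue (ILC) and output layer cue (OLC).
import torch
import torch.nn.functional as F
import torchvision.models as models

backbone = models.resnet18().eval()                # pretrained weights would be loaded in practice

features = {}
def save_ilc(_module, _inp, out):
    features["ilc"] = out                          # intermediate feature map from layer3

backbone.layer3.register_forward_hook(save_ilc)

with torch.no_grad():
    img = torch.rand(1, 3, 224, 224)
    olc = backbone(img)                                                # output-layer cue: logits, (1, 1000)
    ilc = torch.flatten(F.adaptive_avg_pool2d(features["ilc"], 1), 1)  # pooled ILC, (1, 256)

state_vector = torch.cat([olc, ilc], dim=1)        # state input for an NBV planner (e.g., a DQN)
```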
arXiv Detail & Related papers (2022-04-22T04:42:33Z)
- End-to-End Partially Observable Visual Navigation in a Diverse Environment [30.895264166384685]
This work targets three challenges: (i) complex visual observations, (ii) partial observability of local sensing, and (iii) multimodal navigation behaviors.
We propose a novel neural network (NN) architecture to represent a local controller and leverage the flexibility of the end-to-end approach to learn a powerful policy.
We implement the NN controller on the SPOT robot and evaluate it on three challenging tasks with partial observations.
arXiv Detail & Related papers (2021-09-16T06:53:57Z)
- Navigation-Oriented Scene Understanding for Robotic Autonomy: Learning to Segment Driveability in Egocentric Images [25.350677396144075]
This work tackles scene understanding for outdoor robotic navigation, solely relying on images captured by an on-board camera.
We segment egocentric images directly in terms of how a robot can navigate in them, and tailor the learning problem to an autonomous navigation task.
We present a generic and scalable affordance-based definition consisting of 3 driveability levels which can be applied to arbitrary scenes.
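One way to picture a 3-level affordance labeling is a simple remapping from semantic classes to driveability levels; the class names and the mapping below are hypothetical, not the paper's taxonomy.

```python
# Hypothetical mapping from semantic classes to 3 driveability levels
# (0 = preferable, 1 = possible but risky, 2 = undriveable).
import numpy as np

DRIVEABILITY = {
    "road": 0, "sidewalk": 0,
    "grass": 1, "gravel": 1,
    "building": 2, "person": 2, "water": 2,
}

def to_driveability(semantic_map, class_names):
    """semantic_map: HxW array of class indices; class_names: list mapping index -> class name."""
    lut = np.array([DRIVEABILITY.get(name, 2) for name in class_names], dtype=np.uint8)
    return lut[semantic_map]                        # per-pixel driveability level

names = ["road", "sidewalk", "grass", "gravel", "building", "person", "water"]
levels = to_driveability(np.random.randint(0, len(names), size=(4, 6)), names)
```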
arXiv Detail & Related papers (2021-09-15T12:25:56Z)
- Learning Synthetic to Real Transfer for Localization and Navigational Tasks [7.019683407682642]
Navigation lies at the crossroads of multiple disciplines, combining notions from computer vision, robotics, and control.
This work aims to create, in simulation, a navigation pipeline whose transfer to the real world requires as little effort as possible.
Designing the navigation pipeline raises four main challenges: environment, localization, navigation, and planning.
arXiv Detail & Related papers (2020-11-20T08:37:03Z)
- Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well for well-represented classes.
We propose a novel Detection Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
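A minimal multi-task sketch of the idea of sharing features between segmentation and an auxiliary detection objective; the heads, losses, and weighting below are assumptions, not the DASS design.

```python
# Shared per-point backbone with a segmentation head and an auxiliary box-regression head.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedBackboneMultiTask(nn.Module):
    def __init__(self, in_dim=4, feat_dim=64, num_classes=20):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU(),
                                      nn.Linear(feat_dim, feat_dim), nn.ReLU())
        self.seg_head = nn.Linear(feat_dim, num_classes)   # per-point semantic logits
        self.det_head = nn.Linear(feat_dim, 7)             # auxiliary box parameters (x, y, z, l, w, h, yaw)

    def forward(self, pts):
        f = self.backbone(pts)
        return self.seg_head(f), self.det_head(f)

model = SharedBackboneMultiTask()
pts = torch.rand(1024, 4)                                  # (N, xyz + intensity)
seg_logits, boxes = model(pts)
seg_loss = F.cross_entropy(seg_logits, torch.randint(0, 20, (1024,)))
det_loss = F.smooth_l1_loss(boxes, torch.rand(1024, 7))    # placeholder detection targets
loss = seg_loss + 0.5 * det_loss                           # auxiliary task shapes the shared features
```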
arXiv Detail & Related papers (2020-09-22T14:17:40Z)