Learning a Compact State Representation for Navigation Tasks by
Autoencoding 2D-Lidar Scans
- URL: http://arxiv.org/abs/2102.02127v1
- Date: Wed, 3 Feb 2021 16:10:26 GMT
- Title: Learning a Compact State Representation for Navigation Tasks by
Autoencoding 2D-Lidar Scans
- Authors: Christopher Gebauer and Maren Bennewitz
- Abstract summary: We generate a compact representation of 2D-lidar scans for reinforcement learning in navigation tasks.
In particular, we incorporate the relation of consecutive scans, especially ego-motion, by applying a memory model.
Experiments show the capability of our approach to highly compress lidar data, maintain a meaningful distribution of the latent space, and even incorporate time-dependent information.
- Score: 7.99536002595393
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the problem of generating a compact representation
of 2D-lidar scans for reinforcement learning in navigation tasks. So far, only
little work has focused on the compactness of the provided state, which is a
necessary condition to successfully and efficiently train a navigation agent.
Our approach works in three stages. First, we propose a novel preprocessing of
the distance measurements and compute a local, egocentric, binary grid map
based on the current range measurements. We then autoencode the local map using
a variational autoencoder, where the latent space serves as state
representation. A key property of a compact and, at the same time,
meaningful representation is its degree of disentanglement, which describes the
correlation between the latent dimensions. Therefore, we finally apply
state-of-the-art disentangling methods to improve the representational power.
Furthermore, we investigate the possibilities of incorporating time-dependent
information into the latent space. In particular, we incorporate the relation
of consecutive scans, especially ego-motion, by applying a memory model. We
implemented our approach in Python using TensorFlow. Our datasets are simulated
with PyBullet as well as recorded using a Slamtec RPLIDAR A3. The experiments
show the capability of our approach to highly compress lidar data, maintain a
meaningful distribution of the latent space, and even incorporate time-dependent
information.
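The first stage described above, converting raw range measurements into a local, egocentric, binary grid map, can be sketched as follows. This is a minimal illustration, not the authors' implementation; the grid size, maximum range, and function name are assumptions chosen for the example:

```python
import numpy as np

def scan_to_local_grid(ranges, angles, grid_size=64, max_range=5.0):
    """Convert a 2D-lidar scan (parallel arrays of ranges and beam angles)
    into an egocentric binary occupancy grid centred on the robot."""
    grid = np.zeros((grid_size, grid_size), dtype=np.uint8)
    # Cartesian endpoint of each beam, with the robot at the grid centre
    xs = ranges * np.cos(angles)
    ys = ranges * np.sin(angles)
    # Map metric coordinates [-max_range, max_range] to grid cells
    scale = grid_size / (2.0 * max_range)
    cols = np.floor((xs + max_range) * scale).astype(int)
    rows = np.floor((ys + max_range) * scale).astype(int)
    # Keep only returns inside the map and below the maximum range
    valid = (ranges < max_range) & (cols >= 0) & (cols < grid_size) \
            & (rows >= 0) & (rows < grid_size)
    grid[rows[valid], cols[valid]] = 1
    return grid
```

Because the map is egocentric, each scan is rasterised in the robot's own frame; no global pose estimate is needed before encoding.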
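The abstract names state-of-the-art disentangling methods without specifying one; a β-VAE objective is one common choice and is sketched below in NumPy. The β weight and the use of binary cross-entropy (matching the binary grid map) are assumptions for illustration:

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Beta-weighted ELBO for a VAE with a diagonal Gaussian posterior."""
    eps = 1e-7
    # Reconstruction term: binary cross-entropy suits a binary grid map
    recon = -np.sum(x * np.log(x_recon + eps)
                    + (1 - x) * np.log(1 - x_recon + eps))
    # KL divergence of the posterior N(mu, diag(exp(log_var))) from N(0, I)
    kl = -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var))
    # beta > 1 pressures the latent dimensions toward independence,
    # trading reconstruction fidelity for disentanglement
    return recon + beta * kl
```

With β = 1 this reduces to the standard VAE objective; raising β penalises correlated latent dimensions more strongly.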
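The abstract applies a memory model over consecutive scans but does not specify its architecture; a GRU is one plausible recurrent choice, sketched here as a single step over the latent codes of successive scans. The dimensions, parameter layout, and function name are illustrative assumptions:

```python
import numpy as np

def gru_step(h, z, params):
    """One GRU step: fold the latent code z_t of the current scan into the
    hidden state h carried over from previous scans."""
    Wz, Wr, Wh = params  # each maps concat([h, z]) to the hidden dimension
    hz = np.concatenate([h, z])
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    update = sigmoid(Wz @ hz)  # how much of the state to renew
    reset = sigmoid(Wr @ hz)   # how much history to forget
    cand = np.tanh(Wh @ np.concatenate([reset * h, z]))
    return (1 - update) * h + update * cand
```

Feeding the hidden state (rather than a single latent code) to the policy lets the agent exploit the relation between consecutive scans, e.g. ego-motion.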
Related papers
- FASTC: A Fast Attentional Framework for Semantic Traversability Classification Using Point Cloud [7.711666704468952]
We address the problem of traversability assessment using point clouds.
We propose a pillar feature extraction module that utilizes PointNet to capture features from point clouds organized in vertical volume.
We then propose a new temporal attention module to fuse multi-frame information, which properly handles the varying density of LiDAR point clouds.
arXiv Detail & Related papers (2024-06-24T12:01:55Z)
- Coupled Laplacian Eigenmaps for Locally-Aware 3D Rigid Point Cloud Matching [0.0]
We propose a new technique, based on graph Laplacian eigenmaps, to match point clouds by taking into account fine local structures.
To deal with the order and sign ambiguity of Laplacian eigenmaps, we introduce a new operator, called Coupled Laplacian.
We show that the similarity between those aligned high-dimensional spaces provides a locally meaningful score to match shapes.
arXiv Detail & Related papers (2024-02-27T10:10:12Z)
- VoxelKP: A Voxel-based Network Architecture for Human Keypoint Estimation in LiDAR Data [53.638818890966036]
VoxelKP is a novel fully sparse network architecture tailored for human keypoint estimation in LiDAR data.
We introduce sparse box-attention to focus on learning spatial correlations between keypoints within each human instance.
We incorporate a spatial encoding to leverage absolute 3D coordinates when projecting 3D voxels to a 2D grid encoding a bird's eye view.
arXiv Detail & Related papers (2023-12-11T23:50:14Z)
- UnLoc: A Universal Localization Method for Autonomous Vehicles using LiDAR, Radar and/or Camera Input [51.150605800173366]
UnLoc is a novel unified neural modeling approach for localization with multi-sensor input in all weather conditions.
Our method is extensively evaluated on Oxford Radar RobotCar, ApolloSouthBay and Perth-WA datasets.
arXiv Detail & Related papers (2023-07-03T04:10:55Z)
- A Unified BEV Model for Joint Learning of 3D Local Features and Overlap Estimation [12.499361832561634]
We present a unified bird's-eye view (BEV) model for joint learning of 3D local features and overlap estimation.
Our method significantly outperforms existing methods on overlap prediction, especially in scenes with small overlaps.
arXiv Detail & Related papers (2023-02-28T12:01:16Z)
- Learning Implicit Feature Alignment Function for Semantic Segmentation [51.36809814890326]
Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations.
We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps in arbitrary resolutions.
Our method can be combined with various architectures for further improvement, and it achieves a state-of-the-art accuracy trade-off on common benchmarks.
arXiv Detail & Related papers (2022-06-17T09:40:14Z)
- Focal Sparse Convolutional Networks for 3D Object Detection [121.45950754511021]
We introduce two new modules to enhance the capability of Sparse CNNs.
They are focal sparse convolution (Focals Conv) and its multi-modal variant of focal sparse convolution with fusion.
For the first time, we show that spatially learnable sparsity in sparse convolution is essential for sophisticated 3D object detection.
arXiv Detail & Related papers (2022-04-26T17:34:10Z)
- Energy networks for state estimation with random sensors using sparse labels [0.0]
We propose a technique with an implicit optimization layer and a physics-based loss function that can learn from sparse labels.
Based on this technique we present two models for discrete and continuous prediction in space.
arXiv Detail & Related papers (2022-03-12T15:15:38Z)
- Learning Optical Flow from a Few Matches [67.83633948984954]
We show that the dense correlation volume representation is redundant and accurate flow estimation can be achieved with only a fraction of elements in it.
Experiments show that our method can reduce computational cost and memory use significantly, while maintaining high accuracy.
arXiv Detail & Related papers (2021-04-05T21:44:00Z)
- Efficient Spatialtemporal Context Modeling for Action Recognition [42.30158166919919]
We propose a recurrent 3D criss-cross attention (RCCA-3D) module to model dense long-range contextual information in videos for action recognition.
We model the relationship between points in the same line along the horizontal, vertical, and depth directions at each time step, which forms a 3D criss-cross structure.
Compared with the non-local method, the proposed RCCA-3D module reduces the number of parameters and FLOPs by 25% and 11%, respectively, for video context modeling.
arXiv Detail & Related papers (2021-03-20T14:48:12Z)
- D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features [51.04841465193678]
We leverage a 3D fully convolutional network for 3D point clouds.
We propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point.
Our method achieves state-of-the-art results in both indoor and outdoor scenarios.
arXiv Detail & Related papers (2020-03-06T12:51:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.