LCD -- Line Clustering and Description for Place Recognition
- URL: http://arxiv.org/abs/2010.10867v1
- Date: Wed, 21 Oct 2020 09:52:47 GMT
- Title: LCD -- Line Clustering and Description for Place Recognition
- Authors: Felix Taubner, Florian Tschopp, Tonci Novkovic, Roland Siegwart, Fadri Furrer
- Abstract summary: We introduce a novel learning-based approach to place recognition, using RGB-D cameras and line clusters as visual and geometric features.
We present a neural network architecture based on the attention mechanism for frame-wise line clustering.
A similar neural network is used for the description of these clusters with a compact embedding of 128 floating point numbers.
- Score: 29.053923938306323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current research on visual place recognition mostly focuses on aggregating
local visual features of an image into a single vector representation.
Therefore, high-level information such as the geometric arrangement of the
features is typically lost. In this paper, we introduce a novel learning-based
approach to place recognition, using RGB-D cameras and line clusters as visual
and geometric features. We state the place recognition problem as a problem of
recognizing clusters of lines instead of individual patches, thus maintaining
structural information. In our work, line clusters are defined as lines that
make up individual objects, hence our place recognition approach can be
understood as object recognition. 3D line segments are detected in RGB-D images
using state-of-the-art techniques. We present a neural network architecture
based on the attention mechanism for frame-wise line clustering. A similar
neural network is used for the description of these clusters with a compact
embedding of 128 floating point numbers, trained with triplet loss on training
data obtained from the InteriorNet dataset. We show experiments on a large
number of indoor scenes and compare our method with the bag-of-words
image-retrieval approach using SIFT and SuperPoint features and the global
descriptor NetVLAD. Trained only on synthetic data, our approach generalizes
well to real-world data captured with Kinect sensors, while also providing
information about the geometric arrangement of instances.
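For illustration only, below is a minimal sketch of the kind of attention-based cluster description network and triplet-loss training step that the abstract describes. It is not the authors' implementation: the input representation (each line given by its two 3D endpoints), the layer sizes, and the toy random data are assumptions; only the 128-D embedding size and the triplet loss follow the abstract.

```python
# Illustrative sketch only: the paper's exact architecture and training
# details are not reproduced here. Dimensions and the toy data are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LineClusterDescriptor(nn.Module):
    """Encodes a variable-size set of 3D line segments (one cluster) into a
    compact 128-D embedding using self-attention over the lines."""

    def __init__(self, line_dim: int = 6, embed_dim: int = 128, num_heads: int = 4):
        super().__init__()
        # Per-line input: two 3D endpoints (6 values). A real system would also
        # append visual features extracted around each line segment.
        self.input_proj = nn.Linear(line_dim, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.out_proj = nn.Linear(embed_dim, embed_dim)

    def forward(self, lines: torch.Tensor) -> torch.Tensor:
        # lines: (batch, num_lines, line_dim)
        x = self.input_proj(lines)
        x, _ = self.attn(x, x, x)          # lines in a cluster attend to each other
        x = x.mean(dim=1)                  # aggregate the set into one vector
        return F.normalize(self.out_proj(x), dim=-1)  # unit-norm 128-D descriptor


if __name__ == "__main__":
    model = LineClusterDescriptor()
    loss_fn = nn.TripletMarginLoss(margin=0.2)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    # Toy triplet: anchor/positive stand in for clusters of the same object seen
    # in two frames, negative for a different object (random tensors, not real data).
    anchor = torch.randn(8, 20, 6)
    positive = torch.randn(8, 20, 6)
    negative = torch.randn(8, 20, 6)

    optimizer.zero_grad()
    loss = loss_fn(model(anchor), model(positive), model(negative))
    loss.backward()
    optimizer.step()
    print("triplet loss:", loss.item())
```

A real pipeline would first detect and cluster 3D line segments per RGB-D frame and feed visual features alongside the geometry; the sketch only shows how a variable-size line cluster can be reduced to a single compact, unit-norm descriptor suitable for triplet training.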
Related papers
- Monocular Visual Place Recognition in LiDAR Maps via Cross-Modal State Space Model and Multi-View Matching [2.400446821380503]
We introduce an efficient framework to learn descriptors for both RGB images and point clouds.
It takes visual state space model (VMamba) as the backbone and employs a pixel-view-scene joint training strategy.
A visible-3D-points overlap strategy is then designed to quantify the similarity between point cloud views and RGB images for multi-view supervision.
arXiv Detail & Related papers (2024-10-08T18:31:41Z)
- Anyview: Generalizable Indoor 3D Object Detection with Variable Frames [63.51422844333147]
We present a novel 3D detection framework named AnyView for our practical applications.
Our method achieves both great generalizability and high detection accuracy with a simple and clean architecture.
arXiv Detail & Related papers (2023-10-09T02:15:45Z)
- SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation [53.83313235792596]
We present a new methodology for real-time semantic mapping from RGB-D sequences.
It combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping.
Our system achieves state-of-the-art semantic mapping quality among 2D-3D network-based systems.
arXiv Detail & Related papers (2023-06-28T22:36:44Z)
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry, and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- PointResNet: Residual Network for 3D Point Cloud Segmentation and Classification [18.466814193413487]
Point cloud segmentation and classification are some of the primary tasks in 3D computer vision.
In this paper, we propose PointResNet, a residual block-based approach.
Our model directly processes the 3D points, using a deep neural network for the segmentation and classification tasks.
arXiv Detail & Related papers (2022-11-20T17:39:48Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- DenseGAP: Graph-Structured Dense Correspondence Learning with Anchor Points [15.953570826460869]
Establishing dense correspondence between two images is a fundamental computer vision problem.
We introduce DenseGAP, a new solution for efficient Dense correspondence learning with a Graph-structured neural network conditioned on Anchor Points.
Our method advances the state-of-the-art of correspondence learning on most benchmarks.
arXiv Detail & Related papers (2021-12-13T18:59:30Z)
- Towards Dense People Detection with Deep Learning and Depth images [9.376814409561726]
This paper proposes a DNN-based system that detects multiple people from a single depth image.
Our neural network processes a depth image and outputs a likelihood map in image coordinates.
We show this strategy to be effective, producing networks that generalize to work with scenes different from those used during training.
arXiv Detail & Related papers (2020-07-14T16:43:02Z)
- High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state-of-the-art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)