PoCo: Point Context Cluster for RGBD Indoor Place Recognition
- URL: http://arxiv.org/abs/2404.02885v2
- Date: Fri, 30 Aug 2024 20:11:38 GMT
- Title: PoCo: Point Context Cluster for RGBD Indoor Place Recognition
- Authors: Jing Liang, Zhuo Deng, Zheming Zhou, Omid Ghasemalizadeh, Dinesh Manocha, Min Sun, Cheng-Hao Kuo, Arnie Sen
- Abstract summary: We present a novel end-to-end algorithm (PoCo) for the indoor RGB-D place recognition task, aimed at identifying the most likely match for a given query frame within a reference database.
We propose a new network architecture, which generalizes the recent Context of Clusters (CoCs) to extract global descriptors directly from the noisy point clouds through end-to-end learning.
- Score: 47.12179061883084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel end-to-end algorithm (PoCo) for the indoor RGB-D place recognition task, aimed at identifying the most likely match for a given query frame within a reference database. The task presents inherent challenges attributed to the constrained field of view and limited range of perception sensors. We propose a new network architecture, which generalizes the recent Context of Clusters (CoCs) to extract global descriptors directly from the noisy point clouds through end-to-end learning. Moreover, we develop the architecture by integrating both color and geometric modalities into the point features to enhance the global descriptor representation. We conducted evaluations on the public datasets ScanNet-PR and ARKit with 807 and 5047 scenarios, respectively. PoCo achieves SOTA performance: on ScanNet-PR, we achieve R@1 of 64.63%, a 5.7% relative improvement over the best published result, CGis (61.12%); on ARKit, we achieve R@1 of 45.12%, a 13.3% relative improvement over the best published result, CGis (39.82%). In addition, PoCo is more efficient than CGis in inference time (1.75x faster), and we demonstrate the effectiveness of PoCo in recognizing places within a real-world laboratory environment.
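The improvement figures in the abstract are relative gains over the CGis baseline, not differences in percentage points. The short sketch below reproduces them from the quoted R@1 values; the helper name `relative_improvement` and the `reported` dictionary are illustrative choices, not taken from the paper.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Return the relative gain of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# R@1 values (in percent) quoted in the abstract, as (PoCo, CGis) pairs.
reported = {
    "ScanNet-PR": (64.63, 61.12),
    "ARKit": (45.12, 39.82),
}

# Relative gain per dataset, rounded to one decimal as in the abstract.
gains = {
    name: round(relative_improvement(poco, cgis), 1)
    for name, (poco, cgis) in reported.items()
}

for name, (poco, cgis) in reported.items():
    print(f"{name}: +{poco - cgis:.2f} points absolute, +{gains[name]:.1f}% relative")
```

In absolute terms the gaps are 3.51 points on ScanNet-PR and 5.30 points on ARKit; dividing each by the corresponding CGis score gives the 5.7% and 13.3% stated in the abstract.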
Related papers
- Classification of Geographical Land Structure Using Convolution Neural Network and Transfer Learning [1.024113475677323]
This study can produce a set of applications such as urban planning and development, environmental monitoring, disaster management, etc.
This article developed a deep learning-based approach to automate the process of classifying geographical land structures.
arXiv Detail & Related papers (2024-11-19T11:01:30Z)
- Handling Geometric Domain Shifts in Semantic Segmentation of Surgical RGB and Hyperspectral Images [67.66644395272075]
We present the first analysis of state-of-the-art semantic segmentation models when faced with geometric out-of-distribution data.
We propose an augmentation technique called "Organ Transplantation" to enhance generalizability.
Our augmentation technique improves state-of-the-art (SOA) model performance by up to 67% for RGB data and 90% for HSI data, reaching in-distribution performance levels on real OOD test data.
arXiv Detail & Related papers (2024-08-27T19:13:15Z)
- CSCPR: Cross-Source-Context Indoor RGB-D Place Recognition [47.12179061883084]
We present a new algorithm, Cross-Source-Context Place Recognition (CSCPR), for RGB-D indoor place recognition.
Unlike prior approaches that primarily focus on the RGB domain, CSCPR is designed to handle RGB-D data.
We extend the Context-of-Clusters (CoCs) for handling noisy colorized point clouds and introduce two novel modules for reranking.
arXiv Detail & Related papers (2024-07-24T17:50:00Z)
- CGS-Net: Aggregating Colour, Geometry and Semantic Features for Large-Scale Indoor Place Recognition [6.156387608994791]
We describe an approach to large-scale indoor place recognition that aggregates low-level colour and geometric features with high-level semantic features.
We use a deep learning network that takes in RGB point clouds and extracts local features with five 3-D kernel point convolutional layers.
We specifically train the KPConv layers on the semantic segmentation task to ensure that the extracted local features are semantically meaningful.
arXiv Detail & Related papers (2022-02-04T10:51:25Z)
- ZARTS: On Zero-order Optimization for Neural Architecture Search [94.41017048659664]
Differentiable architecture search (DARTS) has been a popular one-shot paradigm for NAS due to its high efficiency.
This work turns to zero-order optimization and proposes a novel NAS scheme, called ZARTS, to search without enforcing the above approximation.
In particular, results on 12 benchmarks verify the outstanding robustness of ZARTS, where the performance of DARTS collapses due to its known instability issue.
arXiv Detail & Related papers (2021-10-10T09:35:15Z)
- LoGG3D-Net: Locally Guided Global Descriptor Learning for 3D Place Recognition [31.105598103211825]
We show that an additional training signal (local consistency loss) can guide the network to learn local features that are consistent across revisits.
We formulate our approach in an end-to-end trainable architecture called LoGG3D-Net.
arXiv Detail & Related papers (2021-09-17T03:32:43Z)
- Cyclic Differentiable Architecture Search [99.12381460261841]
Differentiable ARchiTecture Search, i.e., DARTS, has drawn great attention in neural architecture search.
We propose new joint objectives and a novel Cyclic Differentiable ARchiTecture Search framework, dubbed CDARTS.
In the DARTS search space, we achieve 97.52% top-1 accuracy on CIFAR10 and 76.3% top-1 accuracy on ImageNet.
arXiv Detail & Related papers (2020-06-18T17:55:19Z)
- Learning Delicate Local Representations for Multi-Person Pose Estimation [77.53144055780423]
We propose a novel method called Residual Steps Network (RSN).
RSN aggregates features with the same spatial size (Intra-level features) efficiently to obtain delicate local representations.
Our approach won 1st place in the COCO Keypoint Challenge 2019.
arXiv Detail & Related papers (2020-03-09T10:40:49Z)