CSCPR: Cross-Source-Context Indoor RGB-D Place Recognition
- URL: http://arxiv.org/abs/2407.17457v1
- Date: Wed, 24 Jul 2024 17:50:00 GMT
- Title: CSCPR: Cross-Source-Context Indoor RGB-D Place Recognition
- Authors: Jing Liang, Zhuo Deng, Zheming Zhou, Min Sun, Omid Ghasemalizadeh, Cheng-Hao Kuo, Arnie Sen, Dinesh Manocha
- Abstract summary: We present a new algorithm, Cross-Source-Context Place Recognition (CSCPR), for RGB-D indoor place recognition.
Unlike prior approaches that primarily focus on the RGB domain, CSCPR is designed to handle RGB-D data.
We extend the Context-of-Clusters (CoCs) for handling noisy colorized point clouds and introduce two novel modules for reranking.
- Score: 47.12179061883084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a new algorithm, Cross-Source-Context Place Recognition (CSCPR), for RGB-D indoor place recognition that integrates global retrieval and reranking into a single end-to-end model. Unlike prior approaches that primarily focus on the RGB domain, CSCPR is designed to handle RGB-D data. We extend the Context-of-Clusters (CoCs) for handling noisy colorized point clouds and introduce two novel modules for reranking: the Self-Context Cluster (SCC) and the Cross Source Context Cluster (CSCC), which enhance feature representation and match query-database pairs based on local features, respectively. We also present two new datasets, ScanNetIPR and ARKitIPR. Our experiments demonstrate that CSCPR significantly outperforms state-of-the-art models on these datasets, by at least 36.5% in Recall@1 on the ScanNet-PR dataset and by 44% on the new datasets. Code and datasets will be released.
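To make the two-stage design in the abstract concrete (global retrieval followed by local-feature reranking), here is a minimal Python sketch of a generic retrieve-then-rerank pipeline plus the Recall@1 metric the paper reports. The function names, cosine-similarity retrieval, and max-mean local matching score are illustrative stand-ins, not the paper's actual SCC/CSCC modules; it assumes descriptors and local features arrive as L2-normalizable NumPy arrays.

```python
# A hedged sketch of two-stage place recognition: retrieve candidates with
# global descriptors, then rerank the top-k with local-feature matching.
# This is a generic proxy for CSCPR's pipeline, not the paper's method.
import numpy as np

def retrieve_and_rerank(query_global, db_globals, query_locals, db_locals, k=5):
    """Return database indices ranked by retrieval + local reranking."""
    # Stage 1: global retrieval by cosine similarity over global descriptors.
    q = query_global / np.linalg.norm(query_global)
    db = db_globals / np.linalg.norm(db_globals, axis=1, keepdims=True)
    topk = np.argsort(-(db @ q))[:k]

    # Stage 2: rerank the top-k with local features. The score (mean of each
    # query feature's best match, assuming L2-normalized features) is a
    # generic stand-in for the paper's cross-source matching.
    def local_score(q_feats, c_feats):
        sims = q_feats @ c_feats.T       # pairwise similarity matrix
        return sims.max(axis=1).mean()   # average best-match similarity

    scores = np.array([local_score(query_locals, db_locals[i]) for i in topk])
    return topk[np.argsort(-scores)]

def recall_at_1(ranked_lists, ground_truth):
    """Recall@1: fraction of queries whose top-ranked result is a true match."""
    return float(np.mean([r[0] in gt for r, gt in zip(ranked_lists, ground_truth)]))
```

Here `db_locals` would be a list of per-frame local-feature arrays. Reranking only the top-k keeps the expensive local matching off the full database, which is the usual motivation for folding retrieval and reranking into one pipeline.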
Related papers
- RGB-Sonar Tracking Benchmark and Spatial Cross-Attention Transformer Tracker [4.235252053339947]
This paper introduces a new challenging RGB-Sonar (RGB-S) tracking task.
It investigates how to achieve efficient tracking of an underwater target through the interaction of RGB and sonar modalities.
arXiv Detail & Related papers (2024-06-11T12:01:11Z)
- PoCo: Point Context Cluster for RGBD Indoor Place Recognition [47.12179061883084]
We present a novel end-to-end algorithm (PoCo) for the indoor RGB-D place recognition task, aimed at identifying the most likely match for a given query frame within a reference database.
We propose a new network architecture, which generalizes the recent Context of Clusters (CoCs) to extract global descriptors directly from the noisy point clouds through end-to-end learning.
arXiv Detail & Related papers (2024-04-03T17:38:15Z)
- Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM [6.242958695705305]
Implicit neural representation (INR) in combination with geometric rendering has been employed in real-time dense RGB-D SLAM.
We establish the first open-source benchmark framework to evaluate the performance of a wide spectrum of commonly used INRs and rendering functions.
We propose explicit hybrid encoding for high-fidelity dense grid mapping to comply with the RGB-D SLAM system.
arXiv Detail & Related papers (2024-03-28T14:59:56Z)
- CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation [88.50067783122559]
CalibNet consists of three simple modules, including a dynamic interactive kernel (DIK) and a weight-sharing fusion (WSF).
Experiments show that CalibNet yields a promising result, i.e., 58.0% AP with 320×480 input size on the COME15K-N test set.
arXiv Detail & Related papers (2023-07-16T16:49:59Z)
- Revisiting Color-Event based Tracking: A Unified Network, Dataset, and Metric [53.88188265943762]
We propose a single-stage backbone network for Color-Event Unified Tracking (CEUTrack) that handles the color and event modalities simultaneously.
Our proposed CEUTrack is simple, effective, and efficient, which achieves over 75 FPS and new SOTA performance.
arXiv Detail & Related papers (2022-11-20T16:01:31Z)
- CRCNet: Few-shot Segmentation with Cross-Reference and Region-Global Conditional Networks [59.85183776573642]
Few-shot segmentation aims to learn a segmentation model that can be generalized to novel classes with only a few training images.
We propose Cross-Reference and Local-Global Networks (CRCNet) for few-shot segmentation.
Our network can better find the co-occurrent objects in the two images with a cross-reference mechanism.
arXiv Detail & Related papers (2022-08-23T06:46:18Z)
- Self-Supervised Representation Learning for RGB-D Salient Object Detection [93.17479956795862]
We use Self-Supervised Representation Learning to design two pretext tasks: the cross-modal auto-encoder and the depth-contour estimation.
Our pretext tasks require only a few unlabeled RGB-D datasets for pre-training, which makes the network capture rich semantic contexts.
For the inherent problem of cross-modal fusion in RGB-D SOD, we propose a multi-path fusion module.
arXiv Detail & Related papers (2021-01-29T09:16:06Z)
- Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGB-D images, providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels, and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient Cross-modality Guided Encoder to not only effectively recalibrate RGB feature responses, but also to distill accurate depth information via multiple stages and aggregate the two recalibrated representations alternately.
arXiv Detail & Related papers (2020-07-17T18:35:24Z)