AEGIS-Net: Attention-guided Multi-Level Feature Aggregation for Indoor
Place Recognition
- URL: http://arxiv.org/abs/2312.09538v1
- Date: Fri, 15 Dec 2023 05:09:08 GMT
- Title: AEGIS-Net: Attention-guided Multi-Level Feature Aggregation for Indoor
Place Recognition
- Authors: Yuhang Ming, Jian Ma, Xingrui Yang, Weichen Dai, Yong Peng, Wanzeng
Kong
- Abstract summary: AEGIS-Net is a novel indoor place recognition model that takes in RGB point clouds and generates global place descriptors.
Our AEGIS-Net is made of a semantic encoder, a semantic decoder and an attention-guided feature embedding.
We evaluate our AEGIS-Net on the ScanNetPR dataset and compare its performance with a pre-deep-learning feature-based method and five state-of-the-art deep-learning-based methods.
- Score: 12.728087388529028
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present AEGIS-Net, a novel indoor place recognition model that takes in
RGB point clouds and generates global place descriptors by aggregating
lower-level color and geometry features with higher-level implicit semantic
features. Rather than simple feature concatenation, self-attention modules
are employed to select the most important local features that best
describe an indoor place. Our AEGIS-Net is made of a semantic encoder, a
semantic decoder and an attention-guided feature embedding. The model is
trained in a 2-stage process with the first stage focusing on an auxiliary
semantic segmentation task and the second one on the place recognition task. We
evaluate our AEGIS-Net on the ScanNetPR dataset and compare its performance
with a pre-deep-learning feature-based method and five state-of-the-art
deep-learning-based methods. Our AEGIS-Net achieves exceptional performance and
outperforms all six methods.
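As an illustration of the attention-guided aggregation described above, the selection-then-pooling step can be sketched in a few lines of numpy. This is a minimal single-head sketch, not the paper's implementation: the projection matrices, the mean pooling, and all function names are assumptions made purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_guided_aggregation(local_feats, w_q, w_k):
    """Weight local features by self-attention, then pool them into a
    single global place descriptor (hypothetical sketch)."""
    q = local_feats @ w_q                              # queries (N, d)
    k = local_feats @ w_k                              # keys    (N, d)
    scores = softmax(q @ k.T / np.sqrt(k.shape[-1]))   # attention (N, N)
    attended = scores @ local_feats                    # re-weighted (N, d)
    descriptor = attended.mean(axis=0)                 # pool to global (d,)
    return descriptor / np.linalg.norm(descriptor)     # L2-normalised

# Example: 128 local features of dimension 32
rng = np.random.default_rng(0)
feats = rng.standard_normal((128, 32))
w_q = rng.standard_normal((32, 32))
w_k = rng.standard_normal((32, 32))
g = attention_guided_aggregation(feats, w_q, w_k)
print(g.shape)  # (32,)
```

In the actual model the attention weights are learned end-to-end during the second training stage, and the pooling produces the global descriptor used for place retrieval.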
Related papers
- Global Attention-Guided Dual-Domain Point Cloud Feature Learning for Classification and Segmentation [21.421806351869552]
We propose a Global Attention-guided Dual-domain Feature Learning network (GAD) to address the above-mentioned issues.
We first devise the Contextual Position-enhanced Transformer (CPT) module, which is armed with an improved global attention mechanism.
Then, the Dual-domain K-nearest neighbor Feature Fusion (DKFF) is cascaded to conduct effective feature aggregation.
arXiv Detail & Related papers (2024-07-12T05:19:19Z)
- Exploiting Object-based and Segmentation-based Semantic Features for Deep Learning-based Indoor Scene Classification [0.5572976467442564]
The work described in this paper uses both semantic information, obtained from object detection, and semantic segmentation techniques.
A novel approach is proposed that uses a semantic segmentation mask to provide a Hu-moments-based shape characterization of segmentation categories, designated Hu-Moments Features (SHMFs).
A three-main-branch network, designated by GOS$2$F$2$App, that exploits deep-learning-based global features, object-based features, and semantic segmentation-based features is also proposed.
arXiv Detail & Related papers (2024-04-11T13:37:51Z)
- Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation [79.05949524349005]
We propose AuxSegNet+, a weakly supervised auxiliary learning framework to explore the rich information from saliency maps.
We also propose a cross-task affinity learning mechanism to learn pixel-level affinities from the saliency and segmentation feature maps.
arXiv Detail & Related papers (2024-03-02T10:03:21Z)
- Segment Anything Model is a Good Teacher for Local Feature Learning [19.66262816561457]
Local feature detection and description play an important role in many computer vision tasks.
Data-driven local feature learning methods need to rely on pixel-level correspondence for training.
We propose SAMFeat to introduce SAM as a teacher to guide local feature learning.
arXiv Detail & Related papers (2023-09-29T05:29:20Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- Saliency Guided Inter- and Intra-Class Relation Constraints for Weakly Supervised Semantic Segmentation [66.87777732230884]
We propose a saliency guided Inter- and Intra-Class Relation Constrained (I$2$CRC) framework to assist the expansion of the activated object regions.
We also introduce an object-guided label refinement module to make full use of both the segmentation prediction and the initial labels for obtaining superior pseudo-labels.
arXiv Detail & Related papers (2022-06-20T03:40:56Z)
- CGS-Net: Aggregating Colour, Geometry and Semantic Features for Large-Scale Indoor Place Recognition [6.156387608994791]
We describe an approach to large-scale indoor place recognition that aggregates low-level colour and geometric features with high-level semantic features.
We use a deep learning network that takes in RGB point clouds and extracts local features with five 3-D kernel point convolutional layers.
We specifically train the KPConv layers on the semantic segmentation task to ensure that the extracted local features are semantically meaningful.
arXiv Detail & Related papers (2022-02-04T10:51:25Z)
- Learning Semantics for Visual Place Recognition through Multi-Scale Attention [14.738954189759156]
We present the first VPR algorithm that learns robust global embeddings from both visual appearance and semantic content of the data.
Experiments on various scenarios validate this new approach and demonstrate its performance against state-of-the-art methods.
arXiv Detail & Related papers (2022-01-24T14:13:12Z)
- SOE-Net: A Self-Attention and Orientation Encoding Network for Point Cloud based Place Recognition [50.9889997200743]
We tackle the problem of place recognition from point cloud data with a self-attention and orientation encoding network (SOE-Net).
SOE-Net fully explores the relationship between points and incorporates long-range context into point-wise local descriptors.
Experiments on various benchmark datasets demonstrate superior performance of the proposed network over the current state-of-the-art approaches.
arXiv Detail & Related papers (2020-11-24T22:28:25Z)
- Global Context-Aware Progressive Aggregation Network for Salient Object Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)
- SceneEncoder: Scene-Aware Semantic Segmentation of Point Clouds with A Learnable Scene Descriptor [51.298760338410624]
We propose a SceneEncoder module to impose a scene-aware guidance to enhance the effect of global information.
The module predicts a scene descriptor, which learns to represent the categories of objects existing in the scene.
We also design a region similarity loss to propagate distinguishing features to their own neighboring points with the same label.
arXiv Detail & Related papers (2020-01-24T16:53:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.