Semantic SuperPoint: A Deep Semantic Descriptor
- URL: http://arxiv.org/abs/2211.01098v1
- Date: Wed, 2 Nov 2022 13:17:04 GMT
- Title: Semantic SuperPoint: A Deep Semantic Descriptor
- Authors: Gabriel S. Gama, Nícolas S. Rosa and Valdir Grassi Jr
- Abstract summary: We propose that adding a semantic segmentation decoder in a shared encoder architecture would help the descriptor decoder learn semantic information.
The proposed models are evaluated according to detection and matching metrics on the HPatches dataset.
- Score: 2.1362576987263955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Several SLAM methods benefit from the use of semantic information. Most
integrate photometric methods with high-level semantics such as object
detection and semantic segmentation. We propose that adding a semantic
segmentation decoder in a shared encoder architecture would help the descriptor
decoder learn semantic information, improving the feature extractor. This would
be a more robust approach than only using high-level semantic information since
it would be intrinsically learned in the descriptor and would not depend on the
final quality of the semantic prediction. To add this information, we take
advantage of multi-task learning methods to improve accuracy and balance the
performance of each task. The proposed models are evaluated according to
detection and matching metrics on the HPatches dataset. The results show that
the Semantic SuperPoint model performs better than the baseline one.
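The task-balancing idea mentioned in the abstract can be sketched with homoscedastic uncertainty weighting (Kendall et al., 2018), a common multi-task learning scheme for shared-encoder models with several decoders. This is an illustrative assumption, not necessarily the exact balancing method used in the paper; `uncertainty_weighted_loss` and the example loss values are hypothetical.

```python
import math

def uncertainty_weighted_loss(losses, log_vars):
    """Combine per-task losses via homoscedastic uncertainty weighting:
        L_total = sum_i exp(-s_i) * L_i + s_i,
    where s_i = log(sigma_i^2) is a learnable per-task scalar.
    Tasks with high predicted uncertainty are automatically down-weighted,
    balancing e.g. detector, descriptor, and segmentation heads."""
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))

# Illustrative per-task losses (detector, descriptor, segmentation)
# with all log-variances initialized to zero (weights of 1.0 each).
total = uncertainty_weighted_loss([1.2, 0.8, 2.5], [0.0, 0.0, 0.0])
```

In practice the `log_vars` would be learnable parameters optimized jointly with the network, so the balance between tasks adapts during training rather than being hand-tuned.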
Related papers
- Semantic-aware Representation Learning for Homography Estimation [28.70450397793246]
We propose SRMatcher, a detector-free feature matching method, which encourages the network to learn integrated semantic feature representation.
By reducing errors stemming from semantic inconsistencies in matching pairs, our proposed SRMatcher is able to deliver more accurate and realistic outcomes.
arXiv Detail & Related papers (2024-07-18T08:36:28Z)
- FM-Fusion: Instance-aware Semantic Mapping Boosted by Vision-Language Foundation Models [24.77953131753715]
Development of vision-language foundation models demonstrates a strong zero-shot transferability across data distribution.
We propose a probabilistic label fusion method to predict close-set semantic classes from open-set label measurements.
We integrate all the modules into a unified semantic mapping system. Reading a sequence of RGB-D input, our work incrementally reconstructs an instance-aware semantic map.
arXiv Detail & Related papers (2024-02-07T03:19:02Z)
- Learning Semantic Segmentation with Query Points Supervision on Aerial Images [57.09251327650334]
We present a weakly supervised learning algorithm to train semantic segmentation algorithms.
Our proposed approach performs accurate semantic segmentation and improves efficiency by significantly reducing the cost and time required for manual annotation.
arXiv Detail & Related papers (2023-09-11T14:32:04Z)
- Learning Context-aware Classifier for Semantic Segmentation [88.88198210948426]
In this paper, contextual hints are exploited via learning a context-aware classifier.
Our method is model-agnostic and can be easily applied to generic segmentation models.
With only negligible additional parameters and +2% inference time, a decent performance gain is achieved on both small and large models.
arXiv Detail & Related papers (2023-03-21T07:00:35Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- SARNet: Semantic Augmented Registration of Large-Scale Urban Point Clouds [19.41446935340719]
We propose SARNet, a novel semantic augmented registration network for urban point clouds.
Our approach fully exploits semantic features as assistance to improve registration accuracy.
We evaluate the proposed SARNet extensively by using real-world data from large regions of urban scenes.
arXiv Detail & Related papers (2022-06-27T08:49:11Z)
- TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation [44.75300205362518]
Unsupervised semantic segmentation aims to obtain high-level semantic representation on low-level visual features without manual annotations.
We propose the first top-down unsupervised semantic segmentation framework for fine-grained segmentation in extremely complicated scenarios.
Our results show that our top-down unsupervised segmentation is robust to both object-centric and scene-centric datasets.
arXiv Detail & Related papers (2021-12-02T18:59:03Z)
- Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation [90.87105131054419]
We present a framework for semi-supervised semantic segmentation, which is enhanced by self-supervised monocular depth estimation from unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset, where all three modules demonstrate significant performance gains.
arXiv Detail & Related papers (2020-12-19T21:18:03Z)
- A Holistically-Guided Decoder for Deep Representation Learning with Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps.
We propose a novel holistically-guided decoder to obtain high-resolution, semantic-rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.