Supervised Fine-tuning Evaluation for Long-term Visual Place Recognition
- URL: http://arxiv.org/abs/2211.07696v1
- Date: Mon, 14 Nov 2022 19:16:21 GMT
- Title: Supervised Fine-tuning Evaluation for Long-term Visual Place Recognition
- Authors: Farid Alijani and Esa Rahtu
- Abstract summary: We present a comprehensive study on the utility of deep convolutional neural networks with two state-of-the-art pooling layers.
We compare deep-learned global features trained with three different loss functions, i.e., triplet, contrastive, and ArcFace, for learning the parameters of the architectures.
Our investigation demonstrates that fine-tuning the architectures with the ArcFace loss in an end-to-end manner outperforms the other two losses by approximately 1-4% on the outdoor dataset and 1-2% on the indoor dataset.
- Score: 14.632777952261716
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we present a comprehensive study on the utility of deep
convolutional neural networks with two state-of-the-art pooling layers, placed
after the convolutional layers and fine-tuned in an end-to-end manner, for the
visual place recognition task in challenging conditions, including seasonal
and illumination variations. We extensively compared the performance of
deep-learned global features trained with three different loss functions, i.e.,
triplet, contrastive, and ArcFace, for learning the parameters of the
architectures, in terms of the fraction of correct matches during deployment.
To verify the effectiveness of our results, we utilized two real-world place
recognition datasets, one indoor and one outdoor. Our investigation
demonstrates that fine-tuning the architectures with the ArcFace loss in an
end-to-end manner outperforms the other two losses by approximately 1-4% on the
outdoor dataset and 1-2% on the indoor dataset, given certain thresholds, for
the visual place recognition task.
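The abstract does not spell out the pooling heads or the loss formulations, so the sketch below is a minimal, illustrative PyTorch version of the three losses being compared (triplet, contrastive, ArcFace), paired with a GeM-style global pooling head as an assumed descriptor layer. The margins, scale factor, and the choice of GeM are illustrative assumptions, not the authors' settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GeMPooling(nn.Module):
    """Generalized-mean (GeM) pooling over a CNN feature map, producing an
    L2-normalized global descriptor; a stand-in for the paper's unspecified
    pooling layers."""
    def __init__(self, p: float = 3.0, eps: float = 1e-6):
        super().__init__()
        self.p = nn.Parameter(torch.tensor(p))
        self.eps = eps

    def forward(self, x):                          # x: (B, C, H, W)
        x = x.clamp(min=self.eps).pow(self.p)
        x = F.adaptive_avg_pool2d(x, 1).pow(1.0 / self.p)
        return F.normalize(x.flatten(1), dim=1)    # (B, C) global descriptor


def triplet_loss(anchor, positive, negative, margin=0.1):
    # Pull descriptors of the same place together, push other places apart.
    d_pos = (anchor - positive).pow(2).sum(dim=1)
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()


def contrastive_loss(x1, x2, same_place, margin=0.5):
    # same_place: float tensor, 1.0 for pairs of the same place, 0.0 otherwise.
    d = torch.norm(x1 - x2, dim=1)
    return (same_place * d.pow(2)
            + (1.0 - same_place) * F.relu(margin - d).pow(2)).mean()


class ArcFaceLoss(nn.Module):
    """Additive angular margin on class logits followed by cross-entropy,
    treating each place as a class."""
    def __init__(self, dim, n_places, s=30.0, m=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_places, dim))
        self.s, self.m = s, m

    def forward(self, embeddings, labels):
        cos = F.linear(F.normalize(embeddings, dim=1),
                       F.normalize(self.weight, dim=1))
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        target = F.one_hot(labels, cos.size(1)).bool()
        logits = self.s * torch.where(target, torch.cos(theta + self.m), cos)
        return F.cross_entropy(logits, labels)
```

In this setup, the triplet and contrastive losses operate directly on descriptor distances, while ArcFace treats each place as a class and adds an angular margin to the logits, which is one common explanation for its stronger separation of visually similar places.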
Related papers
- A Deep Learning Architecture for Land Cover Mapping Using Spatio-Temporal Sentinel-1 Features [1.907072234794597]
The study focuses on three distinct regions - Amazonia, Africa, and Siberia - and evaluates the model performance across diverse ecoregions within these areas.
The results demonstrate the effectiveness and the capabilities of the proposed methodology in achieving high overall accuracy (O.A.), even in regions with limited training data.
arXiv Detail & Related papers (2025-03-10T12:15:35Z) - DepthLab: From Partial to Complete [80.58276388743306]
Missing values remain a common challenge for depth data across its wide range of applications.
This work bridges this gap with DepthLab, a foundation depth inpainting model powered by image diffusion priors.
Our approach proves its worth in various downstream tasks, including 3D scene inpainting, text-to-3D scene generation, sparse-view reconstruction with DUSt3R, and LiDAR depth completion.
arXiv Detail & Related papers (2024-12-24T04:16:38Z) - Towards Robust Out-of-Distribution Generalization: Data Augmentation and Neural Architecture Search Approaches [4.577842191730992]
We study ways toward robust OoD generalization for deep learning.
We first propose a novel and effective approach to disentangle the spurious correlation between features that are not essential for recognition.
We then study the problem of strengthening neural architecture search in OoD scenarios.
arXiv Detail & Related papers (2024-10-25T20:50:32Z) - Hierarchical localization with panoramic views and triplet loss functions [2.663377882489275]
The main objective of this paper is to tackle visual localization, which is essential for the safe navigation of mobile robots.
The solution we propose employs panoramic images and triplet convolutional neural networks.
To explore the limits of our approach, triplet networks have been tested in different indoor environments simultaneously.
arXiv Detail & Related papers (2024-04-22T12:07:10Z) - RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering
Assisted Distillation [50.35403070279804]
3D occupancy prediction is an emerging task that aims to estimate the occupancy states and semantics of 3D scenes using multi-view images.
We propose RadOcc, a Rendering assisted distillation paradigm for 3D Occupancy prediction.
arXiv Detail & Related papers (2023-12-19T03:39:56Z) - Leveraging Neural Radiance Fields for Uncertainty-Aware Visual
Localization [56.95046107046027]
We propose to leverage Neural Radiance Fields (NeRF) to generate training samples for scene coordinate regression.
Despite NeRF's efficiency in rendering, much of the rendered data is polluted by artifacts or contains only minimal information gain.
arXiv Detail & Related papers (2023-10-10T20:11:13Z) - Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z) - Unsupervised Scale-consistent Depth Learning from Video [131.3074342883371]
We propose a monocular depth estimator SC-Depth, which requires only unlabelled videos for training.
Thanks to the capability of scale-consistent prediction, we show that our monocular-trained deep networks are readily integrated into the ORB-SLAM2 system.
The proposed hybrid Pseudo-RGBD SLAM shows compelling results in KITTI, and it generalizes well to the KAIST dataset without additional training.
arXiv Detail & Related papers (2021-05-25T02:17:56Z) - InverseForm: A Loss Function for Structured Boundary-Aware Segmentation [80.39674800972182]
We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network.
This plug-in loss term complements the cross-entropy loss in capturing boundary transformations.
We analyze the quantitative and qualitative effects of our loss function on three indoor and outdoor segmentation benchmarks.
arXiv Detail & Related papers (2021-04-06T18:52:45Z) - Early Bird: Loop Closures from Opposing Viewpoints for
Perceptually-Aliased Indoor Environments [35.663671249819124]
We present novel research that simultaneously addresses viewpoint change and perceptual aliasing.
We show that our integration of VPR with SLAM significantly boosts the performance of VPR, feature correspondence, and pose graph submodules.
For the first time, we demonstrate a localization system capable of state-of-the-art performance despite perceptual aliasing and extreme 180-degree-rotated viewpoint change.
arXiv Detail & Related papers (2020-10-03T20:18:55Z) - On estimating gaze by self-attention augmented convolutions [6.015556590955813]
We propose a novel network architecture grounded on self-attention augmented convolutions to improve the quality of the learned features.
We dubbed our framework ARes-gaze, which explores our Attention-augmented ResNet (ARes-14) as twin convolutional backbones.
Results showed a decrease of the average angular error by 2.38% compared to state-of-the-art methods on the MPIIFaceGaze data set, and second-place results on the EyeDiap data set.
arXiv Detail & Related papers (2020-08-25T14:29:05Z) - Campus3D: A Photogrammetry Point Cloud Benchmark for Hierarchical
Understanding of Outdoor Scene [76.4183572058063]
We present a richly-annotated 3D point cloud dataset for multiple outdoor scene understanding tasks.
The dataset has been point-wisely annotated with both hierarchical and instance-based labels.
We formulate a hierarchical learning problem for 3D point cloud segmentation and propose a measurement evaluating consistency across various hierarchies.
arXiv Detail & Related papers (2020-08-11T19:10:32Z) - Learning Robust Feature Representations for Scene Text Detection [0.0]
We present a network architecture derived from the loss for maximizing the conditional log-likelihood.
By extending the layer of latent variables to multiple layers, the network is able to learn robust features on scale.
In experiments, the proposed algorithm significantly outperforms state-of-the-art methods in terms of both recall and precision.
arXiv Detail & Related papers (2020-05-26T01:06:47Z)