Why-So-Deep: Towards Boosting Previously Trained Models for Visual Place Recognition
- URL: http://arxiv.org/abs/2201.03212v1
- Date: Mon, 10 Jan 2022 08:39:06 GMT
- Title: Why-So-Deep: Towards Boosting Previously Trained Models for Visual Place Recognition
- Authors: M. Usman Maqbool Bhutta, Yuxiang Sun, Darwin Lau, Ming Liu
- Abstract summary: We present an intelligent method, MAQBOOL, to amplify the power of pre-trained models for better image recall.
We achieve comparable image retrieval results at a low descriptor dimension (512-D), compared to the high descriptor dimension (4096-D) of state-of-the-art methods.
- Score: 12.807343105549409
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep learning-based image retrieval techniques for the loop closure detection
demonstrate satisfactory performance. However, it is still challenging to
achieve high-level performance based on previously trained models in different
geographical regions. This paper addresses the problem of their deployment with
simultaneous localization and mapping (SLAM) systems in the new environment.
The general baseline approach uses additional information, such as GPS,
sequential keyframes tracking, and re-training the whole environment to enhance
the recall rate. We propose a novel approach for improving image retrieval
based on previously trained models. We present an intelligent method, MAQBOOL,
to amplify the power of pre-trained models for better image recall and its
application to real-time multiagent SLAM systems. We achieve comparable image
retrieval results at a low descriptor dimension (512-D), compared to the high
descriptor dimension (4096-D) of state-of-the-art methods. We use spatial
information to improve the recall rate in image retrieval on pre-trained
models.
Related papers
- VHS: High-Resolution Iterative Stereo Matching with Visual Hull Priors [3.523208537466128]
We present a stereo-matching method for depth estimation from high-resolution images using visual hulls as priors.
Our method uses object masks extracted from supplementary views of the scene to guide the disparity estimation, effectively reducing the search space for matches.
This approach is specifically tailored to stereo rigs in volumetric capture systems, where an accurate depth plays a key role in the downstream reconstruction task.
arXiv Detail & Related papers (2024-06-04T17:59:57Z)
- PixMIM: Rethinking Pixel Reconstruction in Masked Image Modeling [83.67628239775878]
Masked Image Modeling (MIM) has achieved promising progress with the advent of Masked Autoencoders (MAE) and BEiT.
This paper undertakes a fundamental analysis of MIM from the perspective of pixel reconstruction.
We propose a remarkably simple and effective method, PixMIM, that entails two strategies.
arXiv Detail & Related papers (2023-03-04T13:38:51Z)
- An Empirical Analysis of Recurrent Learning Algorithms In Neural Lossy Image Compression Systems [73.48927855855219]
Recent advances in deep learning have resulted in image compression algorithms that outperform JPEG and JPEG 2000 on the standard Kodak benchmark.
In this paper, we perform the first large-scale comparison of recent state-of-the-art hybrid neural compression algorithms.
arXiv Detail & Related papers (2022-01-27T19:47:51Z)
- Cross-Modal Retrieval Augmentation for Multi-Modal Classification [61.5253261560224]
We explore the use of unstructured external knowledge sources of images and their corresponding captions for improving visual question answering.
First, we train a novel alignment model for embedding images and captions in the same space, which achieves substantial improvement on image-caption retrieval.
Second, we show that retrieval-augmented multi-modal transformers using the trained alignment model improve results on VQA over strong baselines.
arXiv Detail & Related papers (2021-04-16T13:27:45Z)
- Unifying Remote Sensing Image Retrieval and Classification with Robust Fine-tuning [3.6526118822907594]
We aim at unifying remote sensing image retrieval and classification with a new large-scale training and testing dataset, SF300.
We show that our framework systematically achieves a boost of retrieval and classification performance on nine different datasets compared to an ImageNet pretrained baseline.
arXiv Detail & Related papers (2021-02-26T11:01:30Z)
- An application of a pseudo-parabolic modeling to texture image recognition [0.0]
We present a novel methodology for texture image recognition using a partial differential equation modeling.
We employ the pseudo-parabolic Buckley-Leverett equation to provide a dynamics to the digital image representation and collect local descriptors from those images evolving in time.
arXiv Detail & Related papers (2021-02-09T18:08:42Z)
- Memory-Augmented Reinforcement Learning for Image-Goal Navigation [67.3963444878746]
We present a novel method that leverages a cross-episode memory to learn to navigate.
In order to avoid overfitting, we propose to use data augmentation on the RGB input during training.
We obtain this competitive performance from RGB input only, without access to additional sensors such as position or depth.
arXiv Detail & Related papers (2021-01-13T16:30:20Z)
- Sparse Signal Models for Data Augmentation in Deep Learning ATR [0.8999056386710496]
We propose a data augmentation approach to incorporate domain knowledge and improve the generalization power of a data-intensive learning algorithm.
We exploit the sparsity of the scattering centers in the spatial domain and the smoothly-varying structure of the scattering coefficients in the azimuthal domain to solve the ill-posed problem of over-parametrized model fitting.
arXiv Detail & Related papers (2020-12-16T21:46:33Z)
- A Plug-and-play Scheme to Adapt Image Saliency Deep Model for Video Data [54.198279280967185]
This paper proposes a novel plug-and-play scheme to weakly retrain a pretrained image saliency deep model for video data.
Our method is simple yet effective for adapting any off-the-shelf pre-trained image saliency deep model to obtain high-quality video saliency detection.
arXiv Detail & Related papers (2020-08-02T13:23:14Z)
- Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation.
We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.