Fusing Convolutional Neural Network and Geometric Constraint for
Image-based Indoor Localization
- URL: http://arxiv.org/abs/2201.01408v1
- Date: Wed, 5 Jan 2022 02:04:41 GMT
- Title: Fusing Convolutional Neural Network and Geometric Constraint for
Image-based Indoor Localization
- Authors: Jingwei Song, Mitesh Patel, and Maani Ghaffari
- Abstract summary: This paper proposes a new image-based localization framework that explicitly localizes the camera/robot.
The camera is localized using a single or few observed images and training images with 6-degree-of-freedom pose labels.
Experiments on simulation and real data sets demonstrate the efficiency of our proposed method.
- Score: 4.071875179293035
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes a new image-based localization framework that explicitly
localizes the camera/robot by fusing Convolutional Neural Network (CNN) and
sequential images' geometric constraints. The camera is localized using a
single or few observed images and training images with 6-degree-of-freedom pose
labels. A Siamese network structure is adopted to train an image descriptor
network, and the visually similar candidate image in the training set is
retrieved to localize the testing image geometrically. Meanwhile, a
probabilistic motion model predicts the pose based on a constant velocity
assumption. The two estimated poses are finally fused using their uncertainties
to yield an accurate pose prediction. This method leverages the geometric
uncertainty and is applicable in indoor scenarios predominated by diffuse
illumination. Experiments on simulation and real data sets demonstrate the
efficiency of our proposed method. The results further show that combining the
CNN-based framework with geometric constraint achieves better accuracy when
compared with CNN-only methods, especially when the training data size is
small.
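The abstract describes fusing the CNN-retrieval pose and the motion-model pose by weighting each with its uncertainty. A minimal sketch of such inverse-covariance (information-form) fusion is below; it is illustrative only, not the paper's implementation, and for brevity fuses only the 3-D translation (full 6-DoF fusion would also handle rotation, e.g. on the SO(3) manifold). The function name and interface are hypothetical.

```python
import numpy as np

def fuse_poses(p_cnn, cov_cnn, p_motion, cov_motion):
    """Fuse two position estimates by inverse-covariance weighting.

    The estimate with the smaller covariance (higher confidence)
    dominates the fused result, mirroring the uncertainty-based
    fusion step described in the abstract.
    """
    info_cnn = np.linalg.inv(cov_cnn)        # information matrix of CNN estimate
    info_motion = np.linalg.inv(cov_motion)  # information matrix of motion model
    cov_fused = np.linalg.inv(info_cnn + info_motion)
    p_fused = cov_fused @ (info_cnn @ p_cnn + info_motion @ p_motion)
    return p_fused, cov_fused
```

With equal covariances this reduces to a simple average; as one covariance grows, its estimate's influence shrinks toward zero.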
Related papers
- Learning Robust Multi-Scale Representation for Neural Radiance Fields
from Unposed Images [65.41966114373373]
We present an improved solution to the neural image-based rendering problem in computer vision.
The proposed approach could synthesize a realistic image of the scene from a novel viewpoint at test time.
arXiv Detail & Related papers (2023-11-08T08:18:23Z)
- Decoupled Mixup for Generalized Visual Recognition [71.13734761715472]
We propose a novel "Decoupled-Mixup" method to train CNN models for visual recognition.
Our method decouples each image into discriminative and noise-prone regions, and then heterogeneously combines these regions to train CNN models.
Experiment results show the high generalization performance of our method on testing data that are composed of unseen contexts.
arXiv Detail & Related papers (2022-10-26T15:21:39Z)
- ImPosIng: Implicit Pose Encoding for Efficient Camera Pose Estimation [2.6808541153140077]
Implicit Pose Encoding (ImPosing) embeds images and camera poses into a common latent representation with two separate neural networks.
By evaluating candidates through the latent space in a hierarchical manner, the camera position and orientation are not directly regressed but refined.
arXiv Detail & Related papers (2022-05-05T13:33:25Z)
- OSLO: On-the-Sphere Learning for Omnidirectional images and its application to 360-degree image compression [59.58879331876508]
We study the learning of representation models for omnidirectional images and propose to use the properties of HEALPix uniform sampling of the sphere to redefine the mathematical tools used in deep learning models for omnidirectional images.
Our proposed on-the-sphere solution yields a compression gain of 13.7% in bit rate compared to similar learned models applied to equirectangular images.
arXiv Detail & Related papers (2021-07-19T22:14:30Z)
- Visual Camera Re-Localization Using Graph Neural Networks and Relative Pose Supervision [31.947525258453584]
Visual re-localization means using a single image as input to estimate the camera's location and orientation relative to a pre-recorded environment.
Our proposed method makes few special assumptions, and is fairly lightweight in training and testing.
We validate the effectiveness of our approach on both standard indoor (7-Scenes) and outdoor (Cambridge Landmarks) camera re-localization benchmarks.
arXiv Detail & Related papers (2021-04-06T14:29:03Z)
- Transformer Guided Geometry Model for Flow-Based Unsupervised Visual Odometry [38.20137500372927]
We propose a method consisting of two camera pose estimators that deal with the information from pairwise images.
For image sequences, a Transformer-like structure is adopted to build a geometry model over a local temporal window.
A Flow-to-Flow Pose Estimator (F2FPE) is proposed to exploit the relationship between pairwise images.
arXiv Detail & Related papers (2020-12-08T19:39:26Z)
- Unsupervised Metric Relocalization Using Transform Consistency Loss [66.19479868638925]
Training networks to perform metric relocalization traditionally requires accurate image correspondences.
We propose a self-supervised solution, which exploits a key insight: localizing a query image within a map should yield the same absolute pose, regardless of the reference image used for registration.
We evaluate our framework on synthetic and real-world data, showing our approach outperforms other supervised methods when a limited amount of ground-truth information is available.
arXiv Detail & Related papers (2020-11-01T19:24:27Z)
- Category Level Object Pose Estimation via Neural Analysis-by-Synthesis [64.14028598360741]
In this paper we combine a gradient-based fitting procedure with a parametric neural image synthesis module.
The image synthesis network is designed to efficiently span the pose configuration space.
We experimentally show that the method can recover orientation of objects with high accuracy from 2D images alone.
arXiv Detail & Related papers (2020-08-18T20:30:47Z)
- Neural Geometric Parser for Single Image Camera Calibration [17.393543270903653]
We propose a neural geometric learning approach to single-image camera calibration for man-made scenes.
Our approach considers both semantic and geometric cues, resulting in significant accuracy improvement.
The experimental results reveal that the performance of our neural approach is significantly higher than that of existing state-of-the-art camera calibration techniques.
arXiv Detail & Related papers (2020-07-23T08:29:00Z)
- Verification of Deep Convolutional Neural Networks Using ImageStars [10.44732293654293]
Convolutional Neural Networks (CNN) have redefined the state-of-the-art in many real-world applications.
CNNs are vulnerable to adversarial attacks, where slight changes to their inputs may lead to sharp changes in their output.
We describe a set-based framework that successfully deals with real-world CNNs, such as VGG16 and VGG19, that have high accuracy on ImageNet.
arXiv Detail & Related papers (2020-04-12T00:37:21Z)
- 6D Camera Relocalization in Ambiguous Scenes via Continuous Multimodal Inference [67.70859730448473]
We present a multimodal camera relocalization framework that captures ambiguities and uncertainties.
We predict multiple camera pose hypotheses as well as the respective uncertainty for each prediction.
We introduce a new dataset specifically designed to foster camera localization research in ambiguous environments.
arXiv Detail & Related papers (2020-04-09T20:55:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.