PoserNet: Refining Relative Camera Poses Exploiting Object Detections
- URL: http://arxiv.org/abs/2207.09445v2
- Date: Thu, 21 Jul 2022 08:18:59 GMT
- Title: PoserNet: Refining Relative Camera Poses Exploiting Object Detections
- Authors: Matteo Taiana, Matteo Toso, Stuart James, Alessio Del Bue
- Abstract summary: We use objectness regions to guide the pose estimation problem rather than explicit semantic object detections.
We propose the Pose Refiner Network (PoserNet), a light-weight Graph Neural Network, to refine the approximate pair-wise relative camera poses.
We evaluate on the 7-Scenes dataset across varied sizes of graphs and show how this process can be beneficial to optimisation-based Motion Averaging algorithms.
- Score: 14.611595909419297
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The estimation of the camera poses associated with a set of images commonly
relies on feature matches between the images. In contrast, we are the first to
address this challenge by using objectness regions to guide the pose estimation
problem rather than explicit semantic object detections. We propose the Pose
Refiner Network (PoserNet), a light-weight Graph Neural Network, to refine the
approximate pair-wise relative camera poses. PoserNet exploits associations
between the objectness regions - concisely expressed as bounding boxes - across
multiple views to globally refine sparsely connected view graphs. We evaluate
on the 7-Scenes dataset across varied sizes of graphs and show how this process
can be beneficial to optimisation-based Motion Averaging algorithms, improving
the median error on the rotation by 62 degrees with respect to the initial
estimates obtained from bounding boxes. Code and data are available at
https://github.com/IIT-PAVIS/PoserNet.
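The description above lends itself to a small illustration. Below is a minimal, hypothetical PyTorch sketch of the kind of message-passing network the abstract describes: cameras are nodes of the view graph, each edge carries an approximate relative pose together with features derived from the associated bounding boxes, and after a few rounds of message passing the edge features are decoded into a refined relative rotation and translation direction. All layer sizes, feature choices and update rules here are illustrative assumptions, not the authors' architecture; the actual model is in the linked repository.

```python
# Hypothetical sketch of a PoserNet-style edge refiner (NOT the authors' code).
import torch
import torch.nn as nn

class EdgeRefiner(nn.Module):
    def __init__(self, node_dim=64, edge_dim=32, hidden=128):
        super().__init__()
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * node_dim + edge_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, edge_dim))
        self.node_mlp = nn.Sequential(
            nn.Linear(node_dim + edge_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, node_dim))
        self.rot_head = nn.Linear(edge_dim, 4)    # refined relative rotation (quaternion)
        self.trans_head = nn.Linear(edge_dim, 3)  # refined translation direction

    def forward(self, x, edge_index, edge_attr, n_rounds=3):
        """x: (N, node_dim) camera embeddings; edge_index: (2, E) long tensor;
        edge_attr: (E, edge_dim) initial relative pose + bounding-box match features."""
        src, dst = edge_index
        n = x.shape[0]
        for _ in range(n_rounds):
            # Edge update: combine the two incident camera embeddings with the edge state.
            edge_attr = self.edge_mlp(torch.cat([x[src], x[dst], edge_attr], dim=-1))
            # Node update: mean over incoming edge messages.
            agg = torch.zeros(n, edge_attr.shape[-1], device=x.device)
            agg.index_add_(0, dst, edge_attr)
            deg = torch.zeros(n, 1, device=x.device)
            deg.index_add_(0, dst, torch.ones(dst.shape[0], 1, device=x.device))
            x = self.node_mlp(torch.cat([x, agg / deg.clamp(min=1)], dim=-1))
        q = torch.nn.functional.normalize(self.rot_head(edge_attr), dim=-1)
        t = torch.nn.functional.normalize(self.trans_head(edge_attr), dim=-1)
        return q, t
```

In such a sketch, x could be initialised from per-view detection statistics and edge_attr from the initial relative pose concatenated with the geometry of the matched boxes; the real PoserNet differs in its exact inputs, losses and output heads.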
Related papers
- GOReloc: Graph-based Object-Level Relocalization for Visual SLAM [17.608119427712236]
This article introduces a novel method for object-level relocalization of robotic systems.
It determines the pose of a camera sensor by robustly associating the object detections in the current frame with 3D objects in a lightweight object-level map.
arXiv Detail & Related papers (2024-08-15T03:54:33Z) - PRAGO: Differentiable Multi-View Pose Optimization From Objectness Detections [19.211193336526346]
We propose a Pose-refined Rotation Averaging Graph Optimization (PRAGO) method for differentiably estimating camera poses from a set of images.
Our method reconstructs the rotational pose, and in turn, the absolute pose, in a differentiable manner benefiting from the optimization of a sequence of geometrical tasks.
We show that PRAGO is able to outperform non-differentiable solvers on small and sparse scenes extracted from 7-Scenes achieving a relative improvement of 21% for rotations while achieving similar translation estimates.
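To make the rotation-averaging step concrete, here is a toy, fully differentiable version of the idea: given noisy relative rotations R_ij on the edges of a view graph, recover absolute rotations R_i by minimising the chordal residual sum over edges of ||R_ij - R_j R_i^T||_F^2 with gradient descent over quaternion parameters. This is an illustrative sketch under a common world-to-camera convention, not the PRAGO implementation.

```python
# Toy differentiable rotation averaging (illustrative only, not PRAGO's code).
import torch

def quat_to_rot(q):
    """Unit quaternions (N, 4), ordered (w, x, y, z), to rotation matrices (N, 3, 3)."""
    w, x, y, z = q.unbind(-1)
    return torch.stack([
        1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y),
        2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x),
        2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y),
    ], dim=-1).reshape(-1, 3, 3)

def rotation_averaging(edges, rel_R, n_cams, steps=500, lr=1e-2):
    """edges: (E, 2) long tensor of camera pairs (i, j); rel_R: (E, 3, 3) measured R_ij."""
    q = torch.randn(n_cams, 4, requires_grad=True)   # free quaternion parameters
    opt = torch.optim.Adam([q], lr=lr)
    i, j = edges[:, 0], edges[:, 1]
    for _ in range(steps):
        opt.zero_grad()
        R = quat_to_rot(torch.nn.functional.normalize(q, dim=-1))
        pred = R[j] @ R[i].transpose(-1, -2)          # predicted relative rotation
        loss = ((pred - rel_R) ** 2).sum()            # chordal (Frobenius) residual
        loss.backward()
        opt.step()
    return quat_to_rot(torch.nn.functional.normalize(q, dim=-1)).detach()
```

The solution is defined only up to a global rotation (the gauge freedom of the problem), and a robust loss is usually preferred over the plain squared residual when some relative estimates are outliers.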
arXiv Detail & Related papers (2024-03-13T14:42:55Z) - RelPose++: Recovering 6D Poses from Sparse-view Observations [66.6922660401558]
We address the task of estimating 6D camera poses from sparse-view image sets (2-8 images).
We build on the recent RelPose framework, which learns a network that infers distributions over relative rotations between image pairs.
Our final system results in large improvements in 6D pose prediction over prior art on both seen and unseen object categories.
arXiv Detail & Related papers (2023-05-08T17:59:58Z) - PoseMatcher: One-shot 6D Object Pose Estimation by Deep Feature Matching [51.142988196855484]
We propose PoseMatcher, an accurate, model-free, one-shot object pose estimator.
We create a new training pipeline for object to image matching based on a three-view system.
To enable PoseMatcher to attend to distinct input modalities, an image and a point cloud, we introduce the IO-Layer.
arXiv Detail & Related papers (2023-04-03T21:14:59Z) - Rigidity-Aware Detection for 6D Object Pose Estimation [60.88857851869196]
Most recent 6D object pose estimation methods first use object detection to obtain 2D bounding boxes before actually regressing the pose.
We propose a rigidity-aware detection method exploiting the fact that, in 6D pose estimation, the target objects are rigid.
Key to the success of our approach is a visibility map, which we propose to build using a minimum barrier distance between every pixel in the bounding box and the box boundary.
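For readers unfamiliar with the term, a Dijkstra-style sketch of the minimum barrier distance (MBD) transform is given below: the barrier cost of a path is the maximum minus the minimum intensity along it, and the MBD of a pixel is the smallest such cost over all paths to a seed set (here the box boundary). The relaxation used is a standard priority-queue approximation; how the cited paper turns these distances into a visibility map is not reproduced here.

```python
# Approximate minimum barrier distance from every pixel to the box boundary
# (illustrative sketch, not the cited paper's implementation).
import heapq
import numpy as np

def minimum_barrier_distance(crop):
    """crop: 2D float array, the image patch inside a bounding box."""
    h, w = crop.shape
    dist = np.full((h, w), np.inf)
    hi = crop.copy()   # max intensity along the best path found so far
    lo = crop.copy()   # min intensity along the best path found so far
    pq = []
    # Seed with the boundary pixels: a one-pixel path has barrier cost 0.
    for y in range(h):
        for x in range(w):
            if y in (0, h - 1) or x in (0, w - 1):
                dist[y, x] = 0.0
                heapq.heappush(pq, (0.0, y, x))
    while pq:
        d, y, x = heapq.heappop(pq)
        if d > dist[y, x]:
            continue  # stale queue entry
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                new_hi = max(hi[y, x], crop[ny, nx])
                new_lo = min(lo[y, x], crop[ny, nx])
                nd = new_hi - new_lo
                if nd < dist[ny, nx]:
                    dist[ny, nx] = nd
                    hi[ny, nx], lo[ny, nx] = new_hi, new_lo
                    heapq.heappush(pq, (nd, ny, nx))
    return dist
```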
arXiv Detail & Related papers (2023-03-22T09:02:54Z) - High-resolution Iterative Feedback Network for Camouflaged Object Detection [128.893782016078]
Spotting camouflaged objects that are visually assimilated into the background is tricky for object detection algorithms.
We aim to extract the high-resolution texture details to avoid the detail degradation that causes blurred vision in edges and boundaries.
We introduce a novel HitNet to refine the low-resolution representations by high-resolution features in an iterative feedback manner.
arXiv Detail & Related papers (2022-03-22T11:20:21Z) - Relation Regularized Scene Graph Generation [206.76762860019065]
Scene graph generation (SGG) is built on top of detected objects to predict object pairwise visual relations.
We propose a relation regularized network (R2-Net) which can predict whether there is a relationship between two objects.
Our R2-Net can effectively refine object labels and generate scene graphs.
arXiv Detail & Related papers (2022-02-22T11:36:49Z) - Epipolar-Guided Deep Object Matching for Scene Change Detection [23.951526610952765]
This paper describes a viewpoint-robust object-based change detection network (OBJ-CDNet).
Mobile cameras capture images from different viewpoints each time due to differences in camera trajectory and shutter timing.
We introduce a deep graph matching network that establishes object correspondence between an image pair.
arXiv Detail & Related papers (2020-07-30T15:48:40Z) - GeoGraph: Learning graph-based multi-view object detection with geometric cues end-to-end [10.349116753411742]
We propose an end-to-end learnable approach that detects static urban objects from multiple views.
Our method relies on a Graph Neural Network (GNN) to detect all objects and output their geographic positions.
Our GNN simultaneously models relative pose and image evidence, and is further able to deal with an arbitrary number of input views.
arXiv Detail & Related papers (2020-03-23T09:40:35Z) - High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification [84.43394420267794]
We propose a novel framework by learning high-order relation and topology information for discriminative features and robust alignment.
Our framework significantly outperforms the state of the art by 6.5% mAP on the Occluded-Duke dataset.
arXiv Detail & Related papers (2020-03-18T12:18:35Z)