A Large Scale Homography Benchmark
- URL: http://arxiv.org/abs/2302.09997v1
- Date: Mon, 20 Feb 2023 14:18:09 GMT
- Title: A Large Scale Homography Benchmark
- Authors: Daniel Barath, Dmytro Mishkin, Michal Polic, Wolfgang Förstner, Jiri Matas
- Abstract summary: We present a large-scale dataset of Planes in 3D, Pi3D, of roughly 1000 planes observed in 10 000 images from the 1DSfM dataset.
We also present HEB, a large-scale homography estimation benchmark leveraging Pi3D.
- Score: 52.55694707744518
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a large-scale dataset of Planes in 3D, Pi3D, of roughly 1000
planes observed in 10 000 images from the 1DSfM dataset, and HEB, a large-scale
homography estimation benchmark leveraging Pi3D. The applications of the Pi3D
dataset are diverse, e.g. training or evaluating monocular depth, surface
normal estimation and image matching algorithms. The HEB dataset consists of
226 260 homographies and includes roughly 4M correspondences. The homographies
link images that often undergo significant viewpoint and illumination changes.
As applications of HEB, we perform a rigorous evaluation of a wide range of
robust estimators and deep learning-based correspondence filtering methods,
establishing the current state-of-the-art in robust homography estimation. We
also evaluate the uncertainty of SIFT orientations and scales w.r.t. the
ground truth derived from the underlying homographies and provide code for
comparing the uncertainty of custom detectors. The dataset is available at
https://github.com/danini/homography-benchmark.
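Evaluating an estimated homography against a ground-truth correspondence boils down to mapping a point through the 3x3 matrix and measuring the residual. The sketch below is a hypothetical, minimal illustration of that transfer error (it is not the benchmark's official evaluation code, and the function names are our own):

```python
# Minimal sketch: map a 2D point through a 3x3 homography (nested-list
# matrix) and measure the transfer error against its observed match,
# the kind of residual used to score robust estimators.

def apply_homography(H, p):
    """Map p = (x, y) through H in homogeneous coordinates."""
    x, y = p
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w  = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)  # dehomogenize

def transfer_error(H, p, q):
    """Euclidean distance between the mapped point H*p and the match q."""
    qx, qy = apply_homography(H, p)
    return ((qx - q[0]) ** 2 + (qy - q[1]) ** 2) ** 0.5

# A pure translation by (3, 4): the point (0, 0) maps to (3, 4), so its
# transfer error w.r.t. an (incorrect) match at (0, 0) is 5.0.
T = [[1.0, 0.0, 3.0],
     [0.0, 1.0, 4.0],
     [0.0, 0.0, 1.0]]
print(transfer_error(T, (0.0, 0.0), (0.0, 0.0)))  # 5.0
```

In practice such residuals are computed with a library routine (e.g. an OpenCV-style `findHomography` plus reprojection check) rather than by hand; the point here is only the geometry of the error measure.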
Related papers
- Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation [74.28509379811084]
Metric3D v2 is a geometric foundation model for zero-shot metric depth and surface normal estimation from a single image.
We propose solutions for both metric depth estimation and surface normal estimation.
Our method enables the accurate recovery of metric 3D structures on randomly collected internet images.
arXiv Detail & Related papers (2024-03-22T02:30:46Z)
- Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware Homography Estimator [12.415973198004169]
We introduce a novel approach to fine-grained cross-view geo-localization.
Our method aligns a warped ground image with a corresponding GPS-tagged satellite image covering the same area.
Operating at 30 FPS, our method outperforms state-of-the-art techniques.
arXiv Detail & Related papers (2023-08-31T17:59:24Z)
- V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection [73.37781484123536]
We introduce a highly performant 3D object detector for point clouds using the DETR framework.
To address this limitation, we introduce a novel 3D Vertex Relative Position Encoding (3DV-RPE) method.
We show exceptional results on the challenging ScanNetV2 benchmark.
arXiv Detail & Related papers (2023-08-08T17:14:14Z)
- SATR: Zero-Shot Semantic Segmentation of 3D Shapes [74.08209893396271]
We explore the task of zero-shot semantic segmentation of 3D shapes by using large-scale off-the-shelf 2D image recognition models.
We develop the Segmentation Assignment with Topological Reweighting (SATR) algorithm and evaluate it on the ShapeNetPart and our proposed FAUST benchmarks.
SATR achieves state-of-the-art performance, outperforming a baseline algorithm by 1.3% and 4% average mIoU, respectively.
arXiv Detail & Related papers (2023-04-11T00:43:16Z)
- Contour Context: Abstract Structural Distribution for 3D LiDAR Loop Detection and Metric Pose Estimation [31.968749056155467]
This paper proposes a simple, effective, and efficient topological loop closure detection pipeline with accurate 3-DoF metric pose estimation.
We interpret the Cartesian bird's-eye-view (BEV) image projected from 3D LiDAR points as a layered distribution of structures.
A retrieval key is designed to accelerate the search of a database indexed by layered KD-trees.
arXiv Detail & Related papers (2023-02-13T07:18:24Z)
- 360 Depth Estimation in the Wild -- The Depth360 Dataset and the SegFuse Network [35.03201732370496]
Single-view depth estimation from omnidirectional images has gained popularity with its wide range of applications such as autonomous driving and scene reconstruction.
In this work, we first establish a large-scale dataset with varied settings called Depth360 to tackle the training data problem.
We then propose an end-to-end two-branch multi-task learning network, SegFuse, that mimics the human eye to effectively learn from the dataset.
arXiv Detail & Related papers (2022-02-16T11:56:31Z)
- AI-supported Framework of Semi-Automatic Monoplotting for Monocular Oblique Visual Data Analysis [0.0]
We propose and demonstrate a novel semi-automatic monoplotting framework that provides pixel-level correspondence between photos and a Digital Elevation Model (DEM).
A pipeline of analyses was developed, including key-point detection in images and DEMs, retrieval of georeferenced 3D DEMs, regularized pose estimation, gradient-based optimization, and the identification of correspondences between image pixels and real-world coordinates.
arXiv Detail & Related papers (2021-11-28T02:03:43Z)
- Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges [124.48654341780431]
We present a large-scale dataset of Object deTection in Aerial images (DOTA) and comprehensive baselines for ODAI.
The proposed DOTA dataset contains 1,793,658 object instances across 18 categories, annotated with oriented bounding boxes and collected from 11,268 aerial images.
We build baselines covering 10 state-of-the-art algorithms with over 70 configurations, where the speed and accuracy performances of each model have been evaluated.
arXiv Detail & Related papers (2021-02-24T11:20:55Z)
- MVHM: A Large-Scale Multi-View Hand Mesh Benchmark for Accurate 3D Hand Pose Estimation [32.12879364117658]
Estimating 3D hand poses from a single RGB image is challenging because depth ambiguity makes the problem ill-posed.
We design a spin-match algorithm that enables matching a rigid mesh model to any target ground-truth mesh.
We present a multi-view hand pose estimation approach to verify that training a hand pose estimator with our generated dataset greatly enhances its performance.
arXiv Detail & Related papers (2020-12-06T07:55:08Z)
- ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation [52.5465548207648]
ETH-XGaze is a new gaze estimation dataset consisting of over one million high-resolution images of varying gaze under extreme head poses.
We show that our dataset can significantly improve the robustness of gaze estimation methods across different head poses and gaze angles.
arXiv Detail & Related papers (2020-07-31T04:15:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.