Unsupervised Homography Estimation with Coplanarity-Aware GAN
- URL: http://arxiv.org/abs/2205.03821v1
- Date: Sun, 8 May 2022 09:26:47 GMT
- Title: Unsupervised Homography Estimation with Coplanarity-Aware GAN
- Authors: Mingbo Hong, Yuhang Lu, Nianjin Ye, Chunyu Lin, Qijun Zhao, Shuaicheng Liu
- Abstract summary: Estimating homography from an image pair is a fundamental problem in image alignment.
HomoGAN is designed to guide unsupervised homography estimation to focus on the dominant plane.
Results show that our matching error is 22% lower than the previous SOTA method.
- Score: 39.477228263736905
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating homography from an image pair is a fundamental problem in image
alignment. Unsupervised learning methods have received increasing attention in
this field due to their promising performance and label-free training. However,
existing methods do not explicitly consider the problem of plane-induced
parallax, which causes the predicted homography to be a compromise across multiple
planes. In this work, we propose a novel method, HomoGAN, to guide unsupervised
homography estimation to focus on the dominant plane. First, a multi-scale
transformer network is designed to predict homography from the feature pyramids
of input images in a coarse-to-fine fashion. Moreover, we propose an
unsupervised GAN to impose a coplanarity constraint on the predicted homography,
which is realized by using a generator to predict a mask of aligned regions,
and then a discriminator to check if two masked feature maps are induced by a
single homography. To validate the effectiveness of HomoGAN and its components,
we conduct extensive experiments on a large-scale dataset, and the results show
that our matching error is 22% lower than the previous SOTA method. Code is
available at https://github.com/megvii-research/HomoGAN.
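The plane-induced parallax described in the abstract can be illustrated with a small sketch. This is not the HomoGAN code and not necessarily the exact metric used in the paper; it is a generic point matching error computed under assumed, hypothetical values. The matrices `H_dominant` and `H_secondary`, the point sets, and the helper names are all invented for illustration. The sketch shows that a single homography fitted to the dominant plane aligns points on that plane exactly, while points on a second plane keep a residual error.

```python
import math

def apply_homography(H, pt):
    """Map a 2D point through a 3x3 homography H (row-major nested lists)."""
    x, y = pt
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)  # divide by the homogeneous coordinate

def mean_matching_error(H, src_pts, dst_pts):
    """Average Euclidean distance between H-warped source points and targets."""
    total = 0.0
    for s, d in zip(src_pts, dst_pts):
        u, v = apply_homography(H, s)
        total += math.hypot(u - d[0], v - d[1])
    return total / len(src_pts)

# Hypothetical scene: two planes that move differently between the two views.
H_dominant = [[1.0, 0.0, 5.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # shift x by 5
H_secondary = [[1.0, 0.0, 9.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # shift x by 9

plane_a = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]  # dominant plane
plane_b = [(20.0, 20.0), (30.0, 25.0)]                          # second plane

# Ground-truth correspondences: each plane follows its own homography.
dst_a = [apply_homography(H_dominant, p) for p in plane_a]
dst_b = [apply_homography(H_secondary, p) for p in plane_b]

# Evaluate the dominant-plane homography on both sets of points.
err_dominant = mean_matching_error(H_dominant, plane_a, dst_a)
err_parallax = mean_matching_error(H_dominant, plane_b, dst_b)
print(err_dominant, err_parallax)
```

In this toy setup the dominant-plane points align with zero error, while the second-plane points are off by the 4-pixel difference between the two shifts. A homography fitted to both planes at once would instead spread the error over all points, which is the compromise the abstract refers to.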
Related papers
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- AMAE: Adaptation of Pre-Trained Masked Autoencoder for Dual-Distribution Anomaly Detection in Chest X-Rays [17.91123470181453]
We propose AMAE, a two-stage algorithm for adaptation of the pre-trained masked autoencoder (MAE).
AMAE leads to consistent performance gains over competing self-supervised and dual distribution anomaly detection methods.
arXiv Detail & Related papers (2023-07-24T12:03:50Z)
- APRF: Anti-Aliasing Projection Representation Field for Inverse Problem in Imaging [74.9262846410559]
Sparse-view Computed Tomography (SVCT) reconstruction is an ill-posed inverse problem in imaging.
Recent works use Implicit Neural Representations (INRs) to build the coordinate-based mapping between sinograms and CT images.
We propose a self-supervised SVCT reconstruction method, the Anti-Aliasing Projection Representation Field (APRF).
APRF can build the continuous representation between adjacent projection views via the spatial constraints.
arXiv Detail & Related papers (2023-07-11T14:04:12Z)
- Semi-supervised Deep Large-baseline Homography Estimation with Progressive Equivalence Constraint [25.022907946911033]
Homography estimation is erroneous in the case of large-baseline due to the low image overlay and limited receptive field.
We propose a progressive estimation strategy by converting large-baseline homography into multiple intermediate ones.
Our method achieves state-of-the-art performance in large-baseline scenes while keeping competitive performance in small-baseline scenes.
arXiv Detail & Related papers (2022-12-06T05:28:05Z)
- Ground Plane Matters: Picking Up Ground Plane Prior in Monocular 3D Object Detection [92.75961303269548]
The ground plane prior is a very informative geometry clue in monocular 3D object detection (M3OD).
We propose a Ground Plane Enhanced Network (GPENet) which resolves both issues at one go.
Our GPENet can outperform other methods and achieve state-of-the-art performance, well demonstrating the effectiveness and the superiority of the proposed approach.
arXiv Detail & Related papers (2022-11-03T02:21:35Z)
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
- Scalable Semi-supervised Landmark Localization for X-ray Images using Few-shot Deep Adaptive Graph [19.588348005574165]
Based on a fully-supervised graph-based method, DAG, we proposed a semi-supervised extension of it, termed few-shot DAG.
It first trains a DAG model on the labeled data and then fine-tunes the pre-trained model on the unlabeled data with a teacher-student SSL mechanism.
We extensively evaluated our method on pelvis, hand and chest landmark detection tasks.
arXiv Detail & Related papers (2021-04-29T19:46:18Z)
- Perceptual Loss for Robust Unsupervised Homography Estimation [1.2891210250935146]
BiHomE minimizes the distance in the feature space between the warped image from the source viewpoint and the corresponding image from the target viewpoint.
We show that biHomE achieves state-of-the-art performance on the synthetic COCO dataset, comparable to or better than supervised approaches.
arXiv Detail & Related papers (2021-04-20T14:41:54Z)
- Non-Homogeneous Haze Removal via Artificial Scene Prior and Bidimensional Graph Reasoning [52.07698484363237]
We propose a Non-Homogeneous Haze Removal Network (NHRN) via artificial scene prior and bidimensional graph reasoning.
Our method achieves superior performance over many state-of-the-art algorithms for both the single image dehazing and hazy image understanding tasks.
arXiv Detail & Related papers (2021-04-05T13:04:44Z)
- Motion Basis Learning for Unsupervised Deep Homography Estimation with Subspace Projection [27.68752841842823]
We introduce a new framework for unsupervised deep homography estimation.
We show that our approach outperforms the state-of-the-art on the homography benchmark datasets.
arXiv Detail & Related papers (2021-03-29T05:51:34Z)