$\mathbb{X}$Resolution Correspondence Networks
- URL: http://arxiv.org/abs/2012.09842v2
- Date: Wed, 24 Mar 2021 17:27:37 GMT
- Title: $\mathbb{X}$Resolution Correspondence Networks
- Authors: Georgi Tinchev, Shuda Li, Kai Han, David Mitchell, Rigas Kouskouridas
- Abstract summary: In this paper, we aim at establishing accurate dense correspondences between a pair of images with overlapping field of view under challenging illumination variation, viewpoint changes, and style differences.
- Score: 15.214155342197474
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we aim at establishing accurate dense correspondences between
a pair of images with overlapping field of view under challenging illumination
variation, viewpoint changes, and style differences. Through an extensive
ablation study of the state-of-the-art correspondence networks, we surprisingly
discovered that the widely adopted 4D correlation tensor and its related
learning and processing modules could be de-parameterised and removed from
training with merely a minor impact on the final matching accuracy. Disabling
these computationally expensive modules dramatically speeds up the training
procedure and allows a batch size 4 times larger, which in turn
compensates for the accuracy drop. Together with a multi-GPU inference stage,
our method facilitates the systematic investigation of the relationship between
matching accuracy and up-sampling resolution of the native testing images from
1280 to 4K. This leads to the discovery of an optimal resolution
$\mathbb{X}$ that produces accurate matching performance surpassing the
state-of-the-art methods particularly over the lower error band on public
benchmarks for the proposed network.
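For readers unfamiliar with the 4D correlation tensor mentioned in the abstract, the following is a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes two dense feature maps of shape (C, H, W) and uses cosine similarity; the function names `correlation_4d` and `best_matches` are hypothetical. The tensor has Ha x Wa x Hb x Wb entries, so its memory footprint grows quickly with feature-map resolution, which is consistent with the abstract describing the related modules as computationally expensive.

```python
import numpy as np


def correlation_4d(feat_a, feat_b, eps=1e-8):
    """Cosine-similarity correlation between two dense feature maps.

    feat_a: array of shape (C, Ha, Wa)
    feat_b: array of shape (C, Hb, Wb)
    Returns an array of shape (Ha, Wa, Hb, Wb) whose entry [i, j, k, l]
    is the similarity between location (i, j) in A and (k, l) in B.
    """
    # L2-normalise each feature vector along the channel axis.
    a = feat_a / (np.linalg.norm(feat_a, axis=0, keepdims=True) + eps)
    b = feat_b / (np.linalg.norm(feat_b, axis=0, keepdims=True) + eps)
    # Dot product over channels for every pair of locations.
    return np.einsum("cij,ckl->ijkl", a, b)


def best_matches(corr):
    """For each location in A, return the (row, col) of its best match in B."""
    ha, wa, hb, wb = corr.shape
    flat_idx = corr.reshape(ha, wa, hb * wb).argmax(axis=-1)
    rows, cols = np.unravel_index(flat_idx, (hb, wb))
    return np.stack([rows, cols], axis=-1)  # shape (Ha, Wa, 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((16, 4, 5))
    corr = correlation_4d(feats, feats)
    print(corr.shape)
    print(best_matches(corr)[2, 3])
```

Correlating a feature map with itself, as in the usage above, should match every location to itself, which makes a convenient sanity check.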
Related papers
- Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection [106.39544368711427]
We study the problem of generalizable synthetic image detection, aiming to detect forgery images from diverse generative methods.
We present a novel forgery-aware adaptive transformer approach, namely FatFormer.
Our approach, tuned on 4-class ProGAN data, attains an average of 98% accuracy on unseen GANs, and surprisingly generalizes to unseen diffusion models with 95% accuracy.
arXiv Detail & Related papers (2023-12-27T17:36:32Z)
- CalibFormer: A Transformer-based Automatic LiDAR-Camera Calibration Network [11.602943913324653]
CalibFormer is an end-to-end network for automatic LiDAR-camera calibration.
We aggregate multiple layers of camera and LiDAR image features to achieve high-resolution representations.
Our method achieved a mean translation error of $0.8751\,\mathrm{cm}$ and a mean rotation error of $0.0562^{\circ}$ on the KITTI dataset.
arXiv Detail & Related papers (2023-11-26T08:59:30Z)
- ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models [126.35334860896373]
We investigate the capability of generating images from pre-trained diffusion models at much higher resolutions than the training image sizes.
Existing works for higher-resolution generation, such as attention-based and joint-diffusion approaches, cannot address these issues well.
We propose a simple yet effective re-dilation that can dynamically adjust the convolutional perception field during inference.
arXiv Detail & Related papers (2023-10-11T17:52:39Z)
- MMNet: Multi-Collaboration and Multi-Supervision Network for Sequential Deepfake Detection [81.59191603867586]
Sequential deepfake detection aims to identify forged facial regions with the correct sequence for recovery.
The recovery of forged images requires knowledge of the manipulation model to implement inverse transformations.
We propose Multi-Collaboration and Multi-Supervision Network (MMNet) that handles various spatial scales and sequential permutations in forged face images.
arXiv Detail & Related papers (2023-07-06T02:32:08Z)
- Breaking Modality Disparity: Harmonized Representation for Infrared and Visible Image Registration [66.33746403815283]
We propose a scene-adaptive infrared and visible image registration method.
We employ homography to simulate the deformation between different planes.
We propose the first misaligned infrared and visible image dataset with available ground truth.
arXiv Detail & Related papers (2023-04-12T06:49:56Z)
- LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution Homography Estimation [52.63874513999119]
Cross-resolution image alignment is a key problem in multiscale giga photography.
Existing deep homography methods neglect the explicit formulation of correspondences between them, which leads to degraded accuracy in cross-resolution challenges.
We propose a local transformer network embedded within a multiscale structure to explicitly learn correspondences between the multimodal inputs.
arXiv Detail & Related papers (2021-06-08T02:51:45Z)
- Dynamic Resolution Network [40.64164953983429]
The redundancy on the input resolution of modern CNNs has not been fully investigated.
We propose a novel dynamic-resolution network (DRNet) in which the resolution is determined dynamically based on each input sample.
DRNet achieves similar performance with about a 34% computation reduction, and gains a 1.4% accuracy increase with a 10% computation reduction, compared to the original ResNet-50 on ImageNet.
arXiv Detail & Related papers (2021-06-05T13:48:33Z)
- Full Matching on Low Resolution for Disparity Estimation [84.45201205560431]
A Multistage Full Matching disparity estimation scheme (MFM) is proposed in this work.
We demonstrate that decoupling all similarity scores directly from the low-resolution 4D volume step by step, instead of estimating a low-resolution 3D cost volume, leads to more accurate disparity.
Experiment results demonstrate that the proposed method achieves more accurate disparity estimation results and outperforms state-of-the-art methods on Scene Flow, KITTI 2012 and KITTI 2015 datasets.
arXiv Detail & Related papers (2020-12-10T11:11:23Z)
- Resolution Switchable Networks for Runtime Efficient Image Recognition [46.09537029831355]
We propose a general method to train a single convolutional neural network which is capable of switching image resolutions at inference.
Networks trained with the proposed method are named Resolution Switchable Networks (RS-Nets).
arXiv Detail & Related papers (2020-07-19T02:12:59Z)
- Dual-Resolution Correspondence Networks [20.004691262722265]
We introduce Dual-Resolution Correspondence Networks (DualRC-Net), to obtain pixel-wise correspondences in a coarse-to-fine manner.
We evaluate our method on large-scale public benchmarks including HPatches, InLoc, and Aachen Day-Night.
arXiv Detail & Related papers (2020-06-16T00:42:43Z)
- Efficient Neighbourhood Consensus Networks via Submanifold Sparse Convolutions [41.43309123350792]
We adopt the recent Neighbourhood Consensus Networks that have demonstrated promising performance for difficult correspondence problems.
We propose modifications to overcome their main limitations: large memory consumption, large inference time and poorly localised correspondences.
Our proposed modifications can reduce the memory footprint and execution time by more than $10\times$, with equivalent results.
arXiv Detail & Related papers (2020-04-22T13:37:36Z)