Learning Local Implicit Fourier Representation for Image Warping
- URL: http://arxiv.org/abs/2207.01831v1
- Date: Tue, 5 Jul 2022 06:30:17 GMT
- Title: Learning Local Implicit Fourier Representation for Image Warping
- Authors: Jaewon Lee, Kwang Pyo Choi, Kyong Hwan Jin
- Abstract summary: We propose a local texture estimator for image warping (LTEW) followed by an implicit neural representation to deform images into continuous shapes.
Our LTEW-based neural function outperforms existing warping methods for asymmetric-scale SR and homography transform.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image warping aims to reshape images defined on rectangular grids into
arbitrary shapes. Recently, implicit neural functions have shown remarkable
performances in representing images in a continuous manner. However, a
standalone multi-layer perceptron suffers from learning high-frequency Fourier
coefficients. In this paper, we propose a local texture estimator for image
warping (LTEW) followed by an implicit neural representation to deform images
into continuous shapes. Local textures estimated from a deep super-resolution
(SR) backbone are multiplied by locally-varying Jacobian matrices of a
coordinate transformation to predict Fourier responses of a warped image. Our
LTEW-based neural function outperforms existing warping methods for
asymmetric-scale SR and homography transform. Furthermore, our algorithm
generalizes well to arbitrary coordinate transformations not provided during
training, such as homography transforms with large magnification factors and
equirectangular projection (ERP) perspective transforms.
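The core operation described in the abstract, multiplying locally-varying Jacobian matrices of the coordinate transform into estimated Fourier responses, can be illustrated with a minimal numpy sketch. All names here are illustrative, not the authors' code; in LTEW the frequencies, phases, and amplitudes would come from a deep SR backbone rather than being given.

```python
import numpy as np

def ltew_fourier_features(freqs, phases, jac, rel_coord):
    """Jacobian-modulated Fourier features (illustrative sketch).

    freqs:     (K, 2) local frequencies from a texture estimator
    phases:    (K,)   local phases
    jac:       (2, 2) local Jacobian of the coordinate transformation
    rel_coord: (2,)   query offset in the warped domain
    Returns a (2K,) sin/cos feature vector for the implicit MLP.
    """
    # Warping a signal rescales its local frequencies by the Jacobian
    # of the warp, so the estimated frequencies are modulated first.
    mod_freqs = freqs @ jac                          # (K, 2)
    arg = 2 * np.pi * (mod_freqs @ rel_coord) + phases
    return np.concatenate([np.cos(arg), np.sin(arg)])

rng = np.random.default_rng(0)
freqs = rng.normal(size=(8, 2))
phases = rng.normal(size=8)
feat = ltew_fourier_features(freqs, phases, np.eye(2), np.array([0.1, -0.2]))
```

One consequence of this modulation: scaling the Jacobian by a factor is equivalent to scaling the query coordinate by the same factor, which is what lets a single network handle varying magnifications.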
Related papers
- Misalignment-Robust Frequency Distribution Loss for Image Transformation [51.0462138717502]
This paper aims to address a common challenge in deep learning-based image transformation methods, such as image enhancement and super-resolution.
We introduce a novel and simple Frequency Distribution Loss (FDL) for computing distribution distance within the frequency domain.
Our method is empirically proven effective as a training constraint due to the thoughtful utilization of global information in the frequency domain.
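As a toy illustration of why a frequency-domain distribution distance is misalignment-robust, the sketch below compares the distributions of FFT magnitudes with a 1-D Wasserstein distance. This is a hypothetical stand-in, not the paper's exact FDL; a circular shift changes only the phase spectrum, so the magnitude distributions stay identical.

```python
import numpy as np

def frequency_distribution_distance(pred, target):
    """Toy frequency-domain distribution distance (not the paper's exact FDL).

    Compares the *distributions* of FFT magnitudes rather than their
    pixel-wise alignment, making the measure insensitive to small
    spatial misalignments between pred and target.
    """
    mag_p = np.abs(np.fft.fft2(pred)).ravel()
    mag_t = np.abs(np.fft.fft2(target)).ravel()
    # 1-D Wasserstein distance: sort both magnitude samples and
    # average the pointwise gaps.
    return np.mean(np.abs(np.sort(mag_p) - np.sort(mag_t)))

rng = np.random.default_rng(1)
img = rng.random((32, 32))
shifted = np.roll(img, shift=3, axis=1)  # misaligned copy, same content
```

A pixel-wise L1 loss between `img` and `shifted` would be large even though the content matches; the magnitude-distribution distance is zero up to floating point.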
arXiv Detail & Related papers (2024-02-28T09:27:41Z)
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z)
- Entropy Transformer Networks: A Learning Approach via Tangent Bundle Data Manifold [8.893886200299228]
This paper focuses on an accurate and fast approach for image transformation employed in the design of CNN architectures.
A novel Entropy STN (ESTN) is proposed that interpolates on the data manifold distributions.
Experiments on challenging benchmarks show that the proposed ESTN can improve predictive accuracy over a range of computer vision tasks.
arXiv Detail & Related papers (2023-07-24T04:21:51Z)
- RecRecNet: Rectangling Rectified Wide-Angle Images by Thin-Plate Spline Model and DoF-based Curriculum Learning [62.86400614141706]
We propose a new learning model, the Rectangling Rectification Network (RecRecNet).
Our model can flexibly warp the source structure to the target domain and achieves end-to-end unsupervised deformation.
Experiments show the superiority of our solution over the compared methods on both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2023-01-04T15:12:57Z)
- AbHE: All Attention-based Homography Estimation [0.0]
We propose a strong-baseline model based on the Swin Transformer, combining a convolutional neural network for local features with a transformer module for global features.
In the homography regression stage, we adopt an attention layer for the channels of correlation volume, which can drop out some weak correlation feature points.
Experiments show that our method outperforms the state-of-the-art in 8-degree-of-freedom (DOF) homography estimation.
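An 8-DOF homography is a 3x3 matrix with the bottom-right entry fixed to 1. Deep homography estimators commonly predict four corner displacements and recover the matrix with a direct linear solve; the numpy sketch below shows that parameterization. It is an illustrative sketch of the general setup, not this paper's implementation.

```python
import numpy as np

def apply_homography(H, pts):
    """Apply a 3x3 homography (8 DOF, H[2,2] = 1) to an (N, 2) point array."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coordinates
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]             # perspective divide

def homography_from_offsets(corners, offsets):
    """Recover the 8-DOF homography mapping 4 corners to corners + offsets
    via a direct linear solve (8 equations, 8 unknowns)."""
    A, b = [], []
    for (x, y), (u, v) in zip(corners, corners + offsets):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

corners = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
offsets = np.array([[0.1, 0.0], [0.0, -0.1], [0.05, 0.05], [0.0, 0.0]])
H = homography_from_offsets(corners, offsets)
```

Predicting corner offsets instead of the matrix entries directly keeps the regression targets in pixel units, which is why this parameterization is popular in learned homography estimation.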
arXiv Detail & Related papers (2022-12-06T15:00:00Z)
- DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation [56.514462874501675]
We propose a dynamic sparse attention based Transformer model to achieve fine-level matching with favorable efficiency.
The heart of our approach is a novel dynamic-attention unit, dedicated to covering the variation on the optimal number of tokens one position should focus on.
Experiments on three applications, pose-guided person image generation, edge-based face synthesis, and undistorted image style transfer, demonstrate that DynaST achieves superior performance in local details.
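A simplified stand-in for the idea: sparse attention where each query attends only to its top-k keys. DynaST additionally *learns* how many tokens each position should keep; the fixed-k numpy sketch below is illustrative only.

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k):
    """Sparse attention: each query row attends to its k highest-scoring
    keys only (a fixed-k simplification of dynamic token selection)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])           # (nq, nk)
    # Mask out everything below each row's k-th largest score.
    kth = np.sort(scores, axis=-1)[:, -k][:, None]
    scores = np.where(scores >= kth, scores, -np.inf)
    # Softmax over the surviving entries; exp(-inf) contributes 0.
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

rng = np.random.default_rng(2)
Q, K, V = (rng.normal(size=(5, 8)) for _ in range(3))
out = topk_sparse_attention(Q, K, V, k=2)
```

With k equal to the number of keys this reduces to ordinary dense attention; the savings come from keeping k small while the matching stays content-dependent.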
arXiv Detail & Related papers (2022-07-13T11:12:03Z)
- A training-free recursive multiresolution framework for diffeomorphic deformable image registration [6.929709872589039]
We propose a novel training-free approach for diffeomorphic deformable image registration.
The proposed architecture is simple in design. The moving image is warped successively at each resolution and finally aligned to the fixed image.
The entire system is end-to-end and optimized for each pair of images from scratch.
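The operation repeated at every resolution in such schemes, warping the moving image toward the fixed image, is backward warping with bilinear interpolation. A minimal numpy sketch, illustrative rather than the paper's code:

```python
import numpy as np

def warp_bilinear(img, disp):
    """Backward-warp a 2-D image by a per-pixel displacement field,
    sampling with bilinear interpolation.

    img:  (H, W) moving image
    disp: (H, W, 2) displacements; sample location = grid + disp
    """
    H, W = img.shape
    yy, xx = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    ys = np.clip(yy + disp[..., 0], 0, H - 1)
    xs = np.clip(xx + disp[..., 1], 0, W - 1)
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, H - 1), np.minimum(x0 + 1, W - 1)
    wy, wx = ys - y0, xs - x0
    # Weighted sum of the four surrounding pixels.
    return ((1 - wy) * (1 - wx) * img[y0, x0] + (1 - wy) * wx * img[y0, x1]
            + wy * (1 - wx) * img[y1, x0] + wy * wx * img[y1, x1])

rng = np.random.default_rng(3)
img = rng.random((16, 16))
```

In a multiresolution scheme this warp is applied with a coarse displacement field first, then refined at successively finer resolutions.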
arXiv Detail & Related papers (2022-02-01T15:17:17Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the advantages of leveraging detailed spatial information from CNN and the global context provided by transformer for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- Invariant Deep Compressible Covariance Pooling for Aerial Scene Categorization [80.55951673479237]
We propose a novel invariant deep compressible covariance pooling (IDCCP) to solve nuisance variations in aerial scene categorization.
We conduct extensive experiments on the publicly released aerial scene image data sets and demonstrate the superiority of this method compared with state-of-the-art methods.
arXiv Detail & Related papers (2020-11-11T11:13:07Z)
- MDReg-Net: Multi-resolution diffeomorphic image registration using fully convolutional networks with deep self-supervision [2.0178765779788486]
We present a diffeomorphic image registration algorithm that learns spatial transformations between pairs of images using fully convolutional networks (FCNs).
The network is trained to estimate diffeomorphic spatial transformations between pairs of images by maximizing an image-wise similarity metric between fixed and warped moving images.
Experimental results for registering high resolution 3D structural brain magnetic resonance (MR) images have demonstrated that image registration networks trained by our method obtain robust, diffeomorphic image registration results within seconds.
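A typical image-wise similarity metric for this kind of training is normalized cross-correlation; registration papers usually use a local/windowed variant, but a global version shows the idea. Illustrative sketch only:

```python
import numpy as np

def ncc(a, b, eps=1e-8):
    """Global normalized cross-correlation between two images.

    Ranges over [-1, 1]; invariant to affine intensity changes
    (gain and offset), which matters when fixed and moving images
    have different brightness or contrast.
    """
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.sqrt((a * a).sum() * (b * b).sum()) + eps))

rng = np.random.default_rng(4)
x = rng.random((16, 16))
```

Training maximizes this similarity between the fixed image and the warped moving image, typically alongside a smoothness penalty on the displacement field.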
arXiv Detail & Related papers (2020-10-04T02:00:37Z)
- Fast Symmetric Diffeomorphic Image Registration with Convolutional Neural Networks [11.4219428942199]
We present a novel, efficient unsupervised symmetric image registration method.
We evaluate our method on 3D image registration with a large scale brain image dataset.
arXiv Detail & Related papers (2020-03-20T22:07:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.