Misalignment-Robust Frequency Distribution Loss for Image Transformation
- URL: http://arxiv.org/abs/2402.18192v1
- Date: Wed, 28 Feb 2024 09:27:41 GMT
- Title: Misalignment-Robust Frequency Distribution Loss for Image Transformation
- Authors: Zhangkai Ni, Juncheng Wu, Zian Wang, Wenhan Yang, Hanli Wang, Lin Ma
- Abstract summary: This paper aims to address a common challenge in deep learning-based image transformation methods, such as image enhancement and super-resolution.
We introduce a novel and simple Frequency Distribution Loss (FDL) for computing distribution distance within the frequency domain.
Our method is empirically proven effective as a training constraint due to the thoughtful utilization of global information in the frequency domain.
- Score: 51.0462138717502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper aims to address a common challenge in deep learning-based image
transformation methods, such as image enhancement and super-resolution, which
heavily rely on precisely aligned paired datasets with pixel-level alignments.
However, creating precisely aligned paired images presents significant
challenges and hinders the advancement of methods trained on such data. To
overcome this challenge, this paper introduces a novel and simple Frequency
Distribution Loss (FDL) for computing distribution distance within the
frequency domain. Specifically, we transform image features into the frequency
domain using the Discrete Fourier Transform (DFT). Subsequently, frequency
components (amplitude and phase) are processed separately to form the FDL loss
function. Our method is empirically proven effective as a training constraint
due to the thoughtful utilization of global information in the frequency
domain. Extensive experimental evaluations, focusing on image enhancement and
super-resolution tasks, demonstrate that FDL outperforms existing
misalignment-robust loss functions. Furthermore, we explore the potential of
our FDL for image style transfer that relies solely on completely misaligned
data. Our code is available at: https://github.com/eezkni/FDL
Related papers
- FreqINR: Frequency Consistency for Implicit Neural Representation with Adaptive DCT Frequency Loss [5.349799154834945]
This paper introduces Frequency Consistency for Implicit Neural Representation (FreqINR), an innovative Arbitrary-scale Super-resolution method.
During training, we employ Adaptive Discrete Cosine Transform Frequency Loss (ADFL) to minimize the frequency gap between HR and ground-truth images.
During inference, we extend the receptive field to preserve spectral coherence between low-resolution (LR) and ground-truth images.
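The DCT-based frequency loss summarized above can be illustrated with a small NumPy sketch. This is only a hedged approximation of the idea, not FreqINR's actual ADFL: the orthonormal DCT-II matrix is built by hand, and the linear frequency weighting (`wy`, `wx`) is an invented stand-in for the paper's adaptive weighting.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (rows are frequencies)."""
    k = np.arange(n)[:, None]      # frequency index
    i = np.arange(n)[None, :]      # spatial index
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0] /= np.sqrt(2.0)           # DC row scaling makes C orthonormal
    return C

def dct2(x):
    """2D DCT-II via separable matrix multiplication."""
    return dct_matrix(x.shape[0]) @ x @ dct_matrix(x.shape[1]).T

def dct_frequency_loss(pred, target):
    """Weighted L1 gap between DCT spectra; higher frequencies get
    larger (here: fixed, linearly increasing) weights."""
    diff = np.abs(dct2(pred) - dct2(target))
    h, w = diff.shape
    wy = np.linspace(1.0, 2.0, h)[:, None]   # invented weighting scheme
    wx = np.linspace(1.0, 2.0, w)[None, :]
    return np.mean(wy * wx * diff)

np.random.seed(1)
hr, gt = np.random.rand(16, 16), np.random.rand(16, 16)
print(dct_frequency_loss(hr, gt))
```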
arXiv Detail & Related papers (2024-08-25T03:53:17Z)
- Fine-grained Image-to-LiDAR Contrastive Distillation with Visual Foundation Models [55.99654128127689]
Visual Foundation Models (VFMs) are used to enhance 3D representation learning.
VFMs generate semantic labels for weakly-supervised pixel-to-point contrastive distillation.
We adapt sampling probabilities of points to address imbalances in spatial distribution and category frequency.
arXiv Detail & Related papers (2024-05-23T07:48:19Z)
- Low-light Image Enhancement via CLIP-Fourier Guided Wavelet Diffusion [28.049668999586583]
We propose a novel and robust low-light image enhancement method via CLIP-Fourier Guided Wavelet Diffusion, abbreviated as CFWD.
CFWD leverages multimodal visual-language information in the frequency domain space created by multiple wavelet transforms to guide the enhancement process.
Our approach outperforms existing state-of-the-art methods, achieving significant progress in image quality and noise suppression.
arXiv Detail & Related papers (2024-01-08T10:08:48Z)
- Frequency-Aware Transformer for Learned Image Compression [64.28698450919647]
We propose a frequency-aware transformer (FAT) block that, for the first time, achieves multiscale directional analysis for Learned Image Compression (LIC).
The FAT block comprises frequency-decomposition window attention (FDWA) modules to capture multiscale and directional frequency components of natural images.
We also introduce frequency-modulation feed-forward network (FMFFN) to adaptively modulate different frequency components, improving rate-distortion performance.
arXiv Detail & Related papers (2023-10-25T05:59:25Z)
- Efficient Frequency Domain-based Transformers for High-Quality Image Deblurring [39.720032882926176]
We present an effective and efficient method that explores the properties of Transformers in the frequency domain for high-quality image deblurring.
We formulate the proposed frequency domain-based self-attention solver (FSAS) and discriminative frequency domain-based feed-forward network (DFFN) into an asymmetrical network based on an encoder-decoder architecture.
arXiv Detail & Related papers (2022-11-22T13:08:03Z)
- Contextual Learning in Fourier Complex Field for VHR Remote Sensing Images [64.84260544255477]
Transformer-based models have demonstrated outstanding potential for learning high-order contextual relationships from natural images at general resolution (224x224 pixels).
We propose a complex self-attention (CSA) mechanism to model the high-order contextual information with less than half computations of naive SA.
By stacking various layers of CSA blocks, we propose the Fourier Complex Transformer (FCT) model to learn global contextual information from VHR aerial images.
arXiv Detail & Related papers (2022-10-28T08:13:33Z)
- FCL-GAN: A Lightweight and Real-Time Baseline for Unsupervised Blind Image Deblurring [72.43250555622254]
We propose a lightweight and real-time unsupervised BID baseline, termed Frequency-domain Contrastive Loss Constrained Lightweight CycleGAN (FCL-GAN).
FCL-GAN has attractive properties, i.e., no image domain limitation, no image resolution limitation, 25x lighter than SOTA, and 5x faster than SOTA.
Experiments on several image datasets demonstrate the effectiveness of FCL-GAN in terms of performance, model size and inference time.
arXiv Detail & Related papers (2022-04-16T15:08:03Z)
- F-Drop&Match: GANs with a Dead Zone in the High-Frequency Domain [12.290010554180613]
We introduce two novel training techniques called frequency dropping (F-Drop) and frequency matching (F-Match).
F-Drop filters out unnecessary high-frequency components from the input images of the discriminators.
F-Match minimizes the difference between real and fake images in the frequency domain for generating more realistic images.
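A minimal NumPy sketch of the two ideas above, under the assumption that F-Drop acts as a hard radial low-pass mask applied in the shifted FFT domain and F-Match as an L1 difference between spectra; the function names and the `cutoff` parameter are invented here and may not match the paper's exact formulation.

```python
import numpy as np

def f_drop(img, cutoff):
    """Drop high frequencies: zero every FFT coefficient whose radius
    from the (shifted) spectrum center exceeds `cutoff`."""
    spec = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= cutoff ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(spec * mask)))

def f_match(real_img, fake_img):
    """Match spectra: mean absolute difference between the FFTs of
    real and generated images."""
    return np.mean(np.abs(np.fft.fft2(real_img) - np.fft.fft2(fake_img)))

np.random.seed(2)
real, fake = np.random.rand(16, 16), np.random.rand(16, 16)
filtered = f_drop(fake, cutoff=4)      # discriminator would see this
print(filtered.shape, f_match(real, fake))
```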
arXiv Detail & Related papers (2021-06-04T08:51:58Z)
- Frequency Consistent Adaptation for Real World Super Resolution [64.91914552787668]
We propose a novel Frequency Consistent Adaptation (FCA) that ensures the frequency domain consistency when applying Super-Resolution (SR) methods to the real scene.
We estimate degradation kernels from unsupervised images and generate the corresponding Low-Resolution (LR) images.
Based on the domain-consistent LR-HR pairs, we train easy-to-implement Convolutional Neural Network (CNN) SR models.
arXiv Detail & Related papers (2020-12-18T08:25:39Z)
- Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving [36.606114597585396]
We propose a novel frequency domain image translation framework, exploiting frequency information for enhancing the image generation process.
Our key idea is to decompose the image into low-frequency and high-frequency components, where the high-frequency feature captures object structure akin to the identity.
Extensive experiments and ablations show that FDIT effectively preserves the identity of the source image, and produces photo-realistic images.
arXiv Detail & Related papers (2020-11-27T08:58:56Z)
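The low/high-frequency decomposition described in the FDIT summary above can be sketched as follows. This is a generic Gaussian-blur split written for illustration (the kernel construction and `sigma` value are assumptions, not FDIT's exact design), with the guarantee that the two components sum back to the original image.

```python
import numpy as np

def gaussian_kernel1d(sigma, radius):
    """Normalized 1D Gaussian kernel of length 2*radius + 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return k / k.sum()

def low_high_decompose(img, sigma=2.0):
    """Split an image into a blurred low-frequency component and the
    residual high-frequency component, so that low + high == img."""
    r = int(3 * sigma)
    k = gaussian_kernel1d(sigma, r)
    padded = np.pad(img, r, mode="reflect")
    # separable blur: convolve columns, then rows
    low = np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 0, padded)
    low = np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 1, low)
    high = img - low
    return low, high

np.random.seed(3)
image = np.random.rand(24, 24)
low, high = low_high_decompose(image)
print(low.shape, high.shape)
```

Per the summary, the high-frequency residual is the component that carries identity-related structure, so a loss on `high` is what would encourage identity preservation.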
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.