Focal Frequency Loss for Image Reconstruction and Synthesis
- URL: http://arxiv.org/abs/2012.12821v2
- Date: Sun, 4 Apr 2021 09:20:30 GMT
- Title: Focal Frequency Loss for Image Reconstruction and Synthesis
- Authors: Liming Jiang, Bo Dai, Wayne Wu, Chen Change Loy
- Abstract summary: We show that narrowing gaps in the frequency domain can ameliorate image reconstruction and synthesis quality further.
We propose a novel focal frequency loss, which allows a model to adaptively focus on frequency components that are hard to synthesize.
- Score: 125.7135706352493
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image reconstruction and synthesis have witnessed remarkable progress thanks
to the development of generative models. Nonetheless, gaps could still exist
between the real and generated images, especially in the frequency domain. In
this study, we show that narrowing gaps in the frequency domain can ameliorate
image reconstruction and synthesis quality further. We propose a novel focal
frequency loss, which allows a model to adaptively focus on frequency
components that are hard to synthesize by down-weighting the easy ones. This
objective function is complementary to existing spatial losses, offering great
impedance against the loss of important frequency information due to the
inherent bias of neural networks. We demonstrate the versatility and
effectiveness of focal frequency loss to improve popular models, such as VAE,
pix2pix, and SPADE, in both perceptual quality and quantitative performance. We
further show its potential on StyleGAN2.
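The weighting idea described in the abstract (emphasize hard frequencies, down-weight easy ones) can be sketched in NumPy. This is a minimal single-image illustration, not the paper's implementation: the official loss operates on batched tensors, detaches the weight matrix from the gradient, and supports patch-wise FFTs.

```python
import numpy as np

def focal_frequency_loss(real, fake, alpha=1.0):
    """Sketch of a focal frequency loss between two images.

    real, fake: float arrays of shape (H, W).
    alpha: scaling exponent for the spectrum-distance weight.
    """
    # 2D FFT of both images (orthonormal so scale is size-independent)
    f_real = np.fft.fft2(real, norm="ortho")
    f_fake = np.fft.fft2(fake, norm="ortho")

    # Per-frequency distance between the two spectra
    dist = np.abs(f_real - f_fake)

    # "Focal" weight: hard-to-synthesize frequencies (large distance)
    # get weight near 1; easy ones are down-weighted toward 0.
    weight = dist ** alpha
    if weight.max() > 0:
        weight = weight / weight.max()

    # Weighted mean squared spectrum distance
    return float(np.mean(weight * dist ** 2))
```

In training, this term would be added to the usual spatial losses (e.g. pixel L1 or adversarial loss), which is the complementary use the abstract describes.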
Related papers
- Few-shot NeRF by Adaptive Rendering Loss Regularization [78.50710219013301]
Novel view synthesis with sparse inputs poses great challenges to Neural Radiance Fields (NeRF).
Recent works demonstrate that frequency regularization of the positional encoding can achieve promising results for few-shot NeRF.
We propose Adaptive Rendering loss regularization for few-shot NeRF, dubbed AR-NeRF.
arXiv Detail & Related papers (2024-10-23T13:05:26Z)
- Limited-View Photoacoustic Imaging Reconstruction Via High-quality Self-supervised Neural Representation [4.274771298029378]
We introduce a self-supervised network termed HIgh-quality Self-supervised neural representation (HIS)
HIS tackles the inverse problem of photoacoustic imaging to reconstruct high-quality photoacoustic images from sensor data acquired under limited viewpoints.
Results indicate that the proposed HIS model offers superior image reconstruction quality compared to three commonly used methods for photoacoustic image reconstruction.
arXiv Detail & Related papers (2024-07-04T06:07:54Z)
- HFGS: 4D Gaussian Splatting with Emphasis on Spatial and Temporal High-Frequency Components for Endoscopic Scene Reconstruction [13.012536387221669]
Robot-assisted minimally invasive surgery benefits from enhancing dynamic scene reconstruction, as it improves surgical outcomes.
NeRFs have been effective in scene reconstruction, but their slow inference speeds and lengthy training durations limit their applicability.
3D Gaussian Splatting (3D-GS) based methods have emerged as a recent trend, offering rapid inference capabilities and superior 3D quality.
In this paper, we propose HFGS, a novel approach for deformable endoscopic reconstruction that addresses these challenges from spatial and temporal frequency perspectives.
arXiv Detail & Related papers (2024-05-28T06:48:02Z)
- Holistic Dynamic Frequency Transformer for Image Fusion and Exposure Correction [18.014481087171657]
The correction of exposure-related issues is a pivotal component in enhancing the quality of images.
This paper proposes a novel methodology that leverages the frequency domain to improve and unify the handling of exposure correction tasks.
Our proposed method achieves state-of-the-art results, paving the way for more sophisticated and unified solutions in exposure correction.
arXiv Detail & Related papers (2023-09-03T14:09:14Z)
- Spatial-Frequency U-Net for Denoising Diffusion Probabilistic Models [89.76587063609806]
We study the denoising diffusion probabilistic model (DDPM) in wavelet space, instead of pixel space, for visual synthesis.
By explicitly modeling the wavelet signals, we find our model is able to generate images with higher quality on several datasets.
arXiv Detail & Related papers (2023-07-27T06:53:16Z)
- Catch Missing Details: Image Reconstruction with Frequency Augmented Variational Autoencoder [27.149365819904745]
A higher compression rate induces more loss of visual signals in the higher frequency spectrum, which reflect the details in pixel space.
A Frequency Complement Module (FCM) architecture is proposed to capture the missing frequency information for enhancing reconstruction quality.
A Cross-attention Autoregressive Transformer (CAT) is proposed to obtain more precise semantic attributes in texts.
arXiv Detail & Related papers (2023-05-04T04:30:21Z)
- Harnessing Low-Frequency Neural Fields for Few-Shot View Synthesis [82.31272171857623]
We harness low-frequency neural fields to regularize high-frequency neural fields from overfitting.
We propose a simple-yet-effective strategy for tuning the frequency to avoid overfitting few-shot inputs.
arXiv Detail & Related papers (2023-03-15T05:15:21Z)
- DPFNet: A Dual-branch Dilated Network with Phase-aware Fourier Convolution for Low-light Image Enhancement [1.2645663389012574]
Low-light image enhancement is a classical computer vision problem aiming to recover normal-exposure images from low-light images.
Convolutional neural networks commonly used in this field are good at sampling low-frequency local structural features in the spatial domain.
We propose a novel module using the Fourier coefficients, which can recover high-quality texture details under the constraint of semantics in the frequency phase.
arXiv Detail & Related papers (2022-09-16T13:56:09Z)
- WaveFill: A Wavelet-based Generation Network for Image Inpainting [57.012173791320855]
WaveFill is a wavelet-based inpainting network that decomposes images into multiple frequency bands.
WaveFill decomposes images by using discrete wavelet transform (DWT) that preserves spatial information naturally.
It applies an L1 reconstruction loss to the low-frequency bands and an adversarial loss to the high-frequency bands, hence effectively mitigating inter-frequency conflicts.
arXiv Detail & Related papers (2021-07-23T04:44:40Z)
- Fourier Space Losses for Efficient Perceptual Image Super-Resolution [131.50099891772598]
We show that it is possible to improve the performance of a recently introduced efficient generator architecture solely with the application of our proposed loss functions.
We show that our losses' direct emphasis on the frequencies in Fourier-space significantly boosts the perceptual image quality.
The trained generator achieves comparable results with and is 2.4x and 48x faster than state-of-the-art perceptual SR methods RankSRGAN and SRFlow respectively.
arXiv Detail & Related papers (2021-06-01T20:34:52Z)
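Several of the related papers above (WaveFill, the Fourier-space SR losses) supervise low- and high-frequency content with different objectives. A minimal sketch of that band-split idea, using an ideal FFT low-pass mask as a hypothetical stand-in for WaveFill's discrete wavelet transform, and plain L1 in place of its adversarial term:

```python
import numpy as np

def band_split_losses(real, fake, cutoff=0.25):
    """Split two images into low/high frequency bands and score each band.

    real, fake: (H, W) float arrays.
    cutoff: radius in normalized frequency treated as "low frequency".
    Returns (low_band_l1, high_band_l1).
    """
    h, w = real.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    low_mask = np.sqrt(fy ** 2 + fx ** 2) <= cutoff  # ideal low-pass

    def band(img, mask):
        # Keep only the masked frequencies, return to pixel space
        spec = np.fft.fft2(img)
        return np.real(np.fft.ifft2(spec * mask))

    low_l1 = np.mean(np.abs(band(real, low_mask) - band(fake, low_mask)))
    high_l1 = np.mean(np.abs(band(real, ~low_mask) - band(fake, ~low_mask)))
    return low_l1, high_l1
```

The per-band scores can then feed different loss types, mirroring the papers' design of penalizing structural (low-frequency) and detail (high-frequency) errors separately.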
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.