Universal Face Restoration With Memorized Modulation
- URL: http://arxiv.org/abs/2110.01033v1
- Date: Sun, 3 Oct 2021 15:55:07 GMT
- Title: Universal Face Restoration With Memorized Modulation
- Authors: Jia Li, Huaibo Huang, Xiaofei Jia, Ran He
- Abstract summary: This paper proposes a Restoration with Memorized Modulation (RMM) framework for universal Blind Face Restoration (BFR).
We apply random noise as well as unsupervised wavelet memory to adaptively modulate the face-enhancement generator.
Experimental results show the superiority of the proposed method compared with the state-of-the-art methods, and a good generalization in the wild.
- Score: 73.34750780570909
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Blind face restoration (BFR) is a challenging problem because of the
uncertainty of the degradation patterns. This paper proposes a Restoration with
Memorized Modulation (RMM) framework for universal BFR in diverse degraded
scenes and heterogeneous domains. We apply random noise as well as unsupervised
wavelet memory to adaptively modulate the face-enhancement generator,
considering attentional denormalization at both the layer and instance levels.
Specifically, in the training stage, the low-level spatial feature embedding,
the wavelet memory embedding obtained by wavelet transformation of the
high-resolution image, as well as the disentangled high-level noise embeddings
are integrated, with the guidance of attentional maps generated from layer
normalization, instance normalization, and the original feature map. These
three embeddings are respectively associated with the spatial content,
high-frequency texture details, and a learnable universal prior against other
blind image degradation patterns. We store the spatial feature of the
low-resolution image and the corresponding wavelet style code as key and value
in the memory unit, respectively. In the test stage, the wavelet memory value
whose corresponding spatial key best matches that of the inferred image is
retrieved to modulate the generator. Moreover, the universal prior
learned from the random noise is memorized by the trained modulation
network. Experimental results show the superiority of the proposed method
compared with the state-of-the-art methods, and a good generalization in the
wild.
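The memory unit described in the abstract can be pictured as a key-value store: spatial features of low-resolution training images serve as keys, and wavelet style codes extracted from the corresponding high-resolution images serve as values. The following is a minimal illustrative sketch, not the paper's implementation; the class and method names, the plain-list embeddings, and the cosine-similarity matcher are all assumptions for illustration (the paper's embeddings are learned, and its matching criterion is not specified here).

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class WaveletMemory:
    """Hypothetical key-value memory: spatial-feature keys, wavelet-code values."""

    def __init__(self):
        self.keys = []    # spatial feature embeddings of LR training images
        self.values = []  # wavelet style codes from the HR wavelet transform

    def store(self, spatial_key, wavelet_value):
        self.keys.append(spatial_key)
        self.values.append(wavelet_value)

    def retrieve(self, query_key):
        # Return the wavelet code whose spatial key is most similar to the query.
        best = max(range(len(self.keys)),
                   key=lambda i: cosine_similarity(self.keys[i], query_key))
        return self.values[best]

memory = WaveletMemory()
memory.store([1.0, 0.0, 0.0], [0.9, 0.1])  # training pair A
memory.store([0.0, 1.0, 0.0], [0.2, 0.8])  # training pair B
print(memory.retrieve([0.9, 0.1, 0.0]))    # query nearest A -> [0.9, 0.1]
```

At test time, the retrieved wavelet code would modulate the generator in place of the (unavailable) high-resolution ground truth, which is the role the abstract assigns to the memory value.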
Related papers
- Wavelet-based Variational Autoencoders for High-Resolution Image Generation [0.0]
Variational Autoencoders (VAEs) are powerful generative models capable of learning compact latent representations.
In this paper, we explore a novel wavelet-based approach (Wavelet-VAE) in which the latent space is constructed using multi-scale Haar wavelet coefficients.
arXiv Detail & Related papers (2025-04-16T13:51:41Z) - A Hybrid Wavelet-Fourier Method for Next-Generation Conditional Diffusion Models [0.0]
We present a novel generative modeling framework, Wavelet-Fourier-Diffusion, which adapts the diffusion paradigm to hybrid frequency representations.
We show how the hybrid frequency-based representation improves control over global coherence and fine texture synthesis.
arXiv Detail & Related papers (2025-04-04T17:11:04Z) - Feature Alignment with Equivariant Convolutions for Burst Image Super-Resolution [52.55429225242423]
We propose a novel framework for Burst Image Super-Resolution (BISR), featuring an equivariant convolution-based alignment.
This enables the alignment transformation to be learned via explicit supervision in the image domain and easily applied in the feature domain.
Experiments on BISR benchmarks show the superior performance of our approach in both quantitative metrics and visual quality.
arXiv Detail & Related papers (2025-03-11T11:13:10Z) - Physics-informed DeepCT: Sinogram Wavelet Decomposition Meets Masked Diffusion [9.126628956920904]
Diffusion models show remarkable potential for sparse-view computed tomography (SVCT) reconstruction.
We propose a Sinogram-based Wavelet random decomposition And Random mask diffusion Model (SWARM) for SVCT reconstruction.
arXiv Detail & Related papers (2025-01-17T03:16:15Z) - AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation [99.57024606542416]
We propose an adaptive all-in-one image restoration network based on frequency mining and modulation.
Our approach is motivated by the observation that different degradation types impact the image content on different frequency subbands.
The proposed model achieves adaptive reconstruction by accentuating the informative frequency subbands according to different input degradations.
arXiv Detail & Related papers (2024-03-21T17:58:14Z) - BFRFormer: Transformer-based generator for Real-World Blind Face Restoration [37.77996097891398]
We propose a Transformer-based blind face restoration method, named BFRFormer, to reconstruct images with more identity-preserved details in an end-to-end manner.
Our method outperforms state-of-the-art methods on a synthetic dataset and four real-world datasets.
arXiv Detail & Related papers (2024-02-29T02:31:54Z) - Frequency-Adaptive Pan-Sharpening with Mixture of Experts [22.28680499480492]
We propose a novel Frequency Adaptive Mixture of Experts (FAME) learning framework for pan-sharpening.
Our method performs best against other state-of-the-art methods and shows strong generalization to real-world scenes.
arXiv Detail & Related papers (2024-01-04T08:58:25Z) - DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z) - Unified Frequency-Assisted Transformer Framework for Detecting and Grounding Multi-Modal Manipulation [109.1912721224697]
We present the Unified Frequency-Assisted transFormer framework, named UFAFormer, to address the DGM4 problem.
By leveraging the discrete wavelet transform, we decompose images into several frequency sub-bands, capturing rich face forgery artifacts.
Our proposed frequency encoder, incorporating intra-band and inter-band self-attentions, explicitly aggregates forgery features within and across diverse sub-bands.
arXiv Detail & Related papers (2023-09-18T11:06:42Z) - WaveFill: A Wavelet-based Generation Network for Image Inpainting [57.012173791320855]
WaveFill is a wavelet-based inpainting network that decomposes images into multiple frequency bands.
WaveFill decomposes images by using discrete wavelet transform (DWT) that preserves spatial information naturally.
It applies an L1 reconstruction loss to the low-frequency bands and an adversarial loss to the high-frequency bands, effectively mitigating inter-frequency conflicts.
arXiv Detail & Related papers (2021-07-23T04:44:40Z) - Learning Omni-frequency Region-adaptive Representations for Real Image Super-Resolution [37.74756727980146]
The key to solving the real image super-resolution (RealSR) problem lies in learning feature representations that are both informative and content-aware.
In this paper, we propose an Omni-frequency Region-adaptive Network (ORNet) to address both challenges.
arXiv Detail & Related papers (2020-12-11T05:17:38Z) - Non-local Meets Global: An Iterative Paradigm for Hyperspectral Image Restoration [66.68541690283068]
We propose a unified paradigm combining the spatial and spectral properties for hyperspectral image restoration.
The proposed paradigm benefits from non-local spatial denoising while keeping computational complexity low.
Experiments on HSI denoising, compressed reconstruction, and inpainting tasks, with both simulated and real datasets, demonstrate its superiority.
arXiv Detail & Related papers (2020-10-24T15:53:56Z) - Generalized Octave Convolutions for Learned Multi-Frequency Image Compression [20.504561050200365]
We propose the first learned multi-frequency image compression and entropy coding approach.
It is based on the recently developed octave convolutions to factorize the latents into high and low frequency (resolution) components.
We show that the proposed generalized octave convolution can improve the performance of other auto-encoder-based computer vision tasks.
arXiv Detail & Related papers (2020-02-24T01:35:29Z)
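Several of the papers above (the RMM abstract, WaveFill, UFAFormer) rely on the discrete wavelet transform to split an image into one low-frequency and three high-frequency subbands. Below is a minimal single-level 2-D Haar decomposition as an illustrative sketch; the averaging normalization and subband labels are one common convention, and practical systems typically use a library such as PyWavelets rather than this hand-rolled version.

```python
def haar_dwt2(img):
    # Single-level 2-D Haar DWT on a list-of-lists image with even
    # height and width. Returns four half-resolution subbands:
    # LL (local average), LH/HL (horizontal/vertical detail), HH (diagonal).
    h, w = len(img), len(img[0])
    ll = [[0.0] * (w // 2) for _ in range(h // 2)]
    lh = [[0.0] * (w // 2) for _ in range(h // 2)]
    hl = [[0.0] * (w // 2) for _ in range(h // 2)]
    hh = [[0.0] * (w // 2) for _ in range(h // 2)]
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ll[i // 2][j // 2] = (a + b + c + d) / 4.0  # average
            lh[i // 2][j // 2] = (a + b - c - d) / 4.0  # horizontal detail
            hl[i // 2][j // 2] = (a - b + c - d) / 4.0  # vertical detail
            hh[i // 2][j // 2] = (a - b - c + d) / 4.0  # diagonal detail
    return ll, lh, hl, hh

flat = [[8.0] * 4 for _ in range(4)]  # constant image: no texture
ll, lh, hl, hh = haar_dwt2(flat)
print(ll)  # all energy in the low band: [[8.0, 8.0], [8.0, 8.0]]
print(hh)  # no diagonal detail: [[0.0, 0.0], [0.0, 0.0]]
```

This separation is what makes the transform useful in the methods above: smooth content concentrates in LL, while edges and texture land in the high-frequency bands, so losses or memory codes can target each band separately.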
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.