WINE: Wavelet-Guided GAN Inversion and Editing for High-Fidelity Refinement
- URL: http://arxiv.org/abs/2210.09655v2
- Date: Tue, 14 Jan 2025 14:22:05 GMT
- Title: WINE: Wavelet-Guided GAN Inversion and Editing for High-Fidelity Refinement
- Authors: Chaewon Kim, Seung-Jun Moon, Gyeong-Moon Park,
- Abstract summary: WINE is a Wavelet-guided GAN Inversion aNd Editing model, which transfers the high-frequency information through wavelet coefficients.
We show WINE outperforms existing state-of-the-art GAN inversion models with a fine balance between editability and reconstruction quality.
- Score: 9.517232831394459
- License:
- Abstract: Recent advanced GAN inversion models aim to convey high-fidelity information from original images to generators through methods using generator tuning or high-dimensional feature learning. Despite these efforts, accurately reconstructing image-specific details remains as a challenge due to the inherent limitations both in terms of training and structural aspects, leading to a bias towards low-frequency information. In this paper, we look into the widely used pixel loss in GAN inversion, revealing its predominant focus on the reconstruction of low-frequency features. We then propose WINE, a Wavelet-guided GAN Inversion aNd Editing model, which transfers the high-frequency information through wavelet coefficients via newly proposed wavelet loss and wavelet fusion scheme. Notably, WINE is the first attempt to interpret GAN inversion in the frequency domain. Our experimental results showcase the precision of WINE in preserving high-frequency details and enhancing image quality. Even in editing scenarios, WINE outperforms existing state-of-the-art GAN inversion models with a fine balance between editability and reconstruction quality.
Related papers
- Local Implicit Wavelet Transformer for Arbitrary-Scale Super-Resolution [15.610136214020947]
Implicit neural representations have recently demonstrated promising potential in arbitrary-scale Super-Resolution (SR) of images.
Most existing methods predict the pixel in the SR image based on the queried coordinate and ensemble nearby features.
We propose the Local Implicit Wavelet Transformer (LIWT) to enhance the restoration of high-frequency texture details.
arXiv Detail & Related papers (2024-11-10T12:21:14Z) - HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion
Models [56.112302700630806]
We introduce an innovative algorithm named HiFi Tuner to enhance the appearance preservation of objects during personalized image generation.
Key enhancements include the utilization of mask guidance, a novel parameter regularization technique, and the incorporation of step-wise subject representations.
We extend our method to a novel image editing task: substituting the subject in an image through textual manipulations.
arXiv Detail & Related papers (2023-11-30T02:33:29Z) - Reconstruct-and-Generate Diffusion Model for Detail-Preserving Image
Denoising [16.43285056788183]
We propose a novel approach called the Reconstruct-and-Generate Diffusion Model (RnG)
Our method leverages a reconstructive denoising network to recover the majority of the underlying clean signal.
It employs a diffusion algorithm to generate residual high-frequency details, thereby enhancing visual quality.
arXiv Detail & Related papers (2023-09-19T16:01:20Z) - Stage-by-stage Wavelet Optimization Refinement Diffusion Model for
Sparse-View CT Reconstruction [14.037398189132468]
We present an innovative approach named the Stage-by-stage Wavelet Optimization Refinement Diffusion (SWORD) model for sparse-view CT reconstruction.
Specifically, we establish a unified mathematical model integrating low-frequency and high-frequency generative models, achieving the solution with optimization procedure.
Our method rooted in established optimization theory, comprising three distinct stages, including low-frequency generation, high-frequency refinement and domain transform.
arXiv Detail & Related papers (2023-08-30T10:48:53Z) - Multi-stage image denoising with the wavelet transform [125.2251438120701]
Deep convolutional neural networks (CNNs) are used for image denoising via automatically mining accurate structure information.
We propose a multi-stage image denoising CNN with the wavelet transform (MWDCNN) via three stages, i.e., a dynamic convolutional block (DCB), two cascaded wavelet transform and enhancement blocks (WEBs) and residual block (RB)
arXiv Detail & Related papers (2022-09-26T03:28:23Z) - Cross-Modality High-Frequency Transformer for MR Image Super-Resolution [100.50972513285598]
We build an early effort to build a Transformer-based MR image super-resolution framework.
We consider two-fold domain priors including the high-frequency structure prior and the inter-modality context prior.
We establish a novel Transformer architecture, called Cross-modality high-frequency Transformer (Cohf-T), to introduce such priors into super-resolving the low-resolution images.
arXiv Detail & Related papers (2022-03-29T07:56:55Z) - FreqNet: A Frequency-domain Image Super-Resolution Network with Dicrete
Cosine Transform [16.439669339293747]
Single image super-resolution(SISR) is an ill-posed problem that aims to obtain high-resolution (HR) output from low-resolution (LR) input.
Despite the high peak signal-to-noise ratios(PSNR) results, it is difficult to determine whether the model correctly adds desired high-frequency details.
We propose FreqNet, an intuitive pipeline from the frequency domain perspective, to solve this problem.
arXiv Detail & Related papers (2021-11-21T11:49:12Z) - High-Fidelity GAN Inversion for Image Attribute Editing [61.966946442222735]
We present a novel high-fidelity generative adversarial network (GAN) inversion framework that enables attribute editing with image-specific details well-preserved.
With a low bit-rate latent code, previous works have difficulties in preserving high-fidelity details in reconstructed and edited images.
We propose a distortion consultation approach that employs a distortion map as a reference for high-fidelity reconstruction.
arXiv Detail & Related papers (2021-09-14T11:23:48Z) - Progressive Training of Multi-level Wavelet Residual Networks for Image
Denoising [80.10533234415237]
This paper presents a multi-level wavelet residual network (MWRN) architecture as well as a progressive training scheme to improve image denoising performance.
Experiments on both synthetic and real-world noisy images show that our PT-MWRN performs favorably against the state-of-the-art denoising methods.
arXiv Detail & Related papers (2020-10-23T14:14:00Z) - Wavelet Integrated CNNs for Noise-Robust Image Classification [51.18193090255933]
We enhance CNNs by replacing max-pooling, strided-convolution, and average-pooling with Discrete Wavelet Transform (DWT)
WaveCNets, the wavelet integrated versions of VGG, ResNets, and DenseNet, achieve higher accuracy and better noise-robustness than their vanilla versions.
arXiv Detail & Related papers (2020-05-07T09:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.