Related papers: IG-CFAT: An Improved GAN-Based Framework for Effectively Exploiting Transformers in Real-World Image Super-Resolution

URL: http://arxiv.org/abs/2406.13815v2
Date: Mon, 22 Jul 2024 20:50:09 GMT
Title: IG-CFAT: An Improved GAN-Based Framework for Effectively Exploiting Transformers in Real-World Image Super-Resolution
Authors: Alireza Aghelan, Ali Amiryan, Abolfazl Zarghani, Behnoush Hatami, Modjtaba Rouhani
Abstract summary: This paper extends the CFAT model to an improved GAN-based model called IG-CFAT. IG-CFAT incorporates a semantic-aware discriminator to reconstruct fine details more accurately. Our methodology adds wavelet loss to conventional loss functions of GAN-based super-resolution models to recover high-frequency details more efficiently.
Score: 2.009766774844269
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the field of single image super-resolution (SISR), transformer-based models have demonstrated significant advancements. However, the potential and efficiency of these models in applied fields such as real-world image super-resolution have received less attention, and there are substantial opportunities for improvement. Recently, the composite fusion attention transformer (CFAT) outperformed previous state-of-the-art (SOTA) models in classic image super-resolution. This paper extends the CFAT model to an improved GAN-based model called IG-CFAT to effectively exploit the performance of transformers in real-world image super-resolution. IG-CFAT incorporates a semantic-aware discriminator to reconstruct fine details more accurately. Moreover, our model utilizes an adaptive degradation model to better simulate real-world degradations. Our methodology adds wavelet loss to conventional loss functions of GAN-based super-resolution models to recover high-frequency details more efficiently. Empirical results demonstrate that IG-CFAT sets new benchmarks in real-world image super-resolution, outperforming SOTA models in quantitative and qualitative metrics.
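Illustration: the abstract mentions adding a wavelet loss to the usual loss terms so that high-frequency detail is penalized explicitly. The snippet below is a minimal sketch of that general idea, not the authors' implementation. It assumes a one-level Haar transform, an L1 distance per subband, and a hypothetical weight `lambda_wavelet`; the function names are illustrative, and the perceptual and adversarial terms of a GAN-based model are omitted.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2D Haar transform of an (H, W) array with even H and W.

    Returns four subbands: the low-pass average plus row, column and
    diagonal differences (the three high-frequency bands).
    """
    a, b = img[0::2, 0::2], img[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = img[1::2, 0::2], img[1::2, 1::2]  # bottom-left, bottom-right
    ll = (a + b + c + d) / 2.0        # low-frequency approximation
    row_diff = (a + b - c - d) / 2.0  # top row minus bottom row
    col_diff = (a - b + c - d) / 2.0  # left column minus right column
    diag = (a - b - c + d) / 2.0      # diagonal detail
    return ll, row_diff, col_diff, diag

def wavelet_loss(sr, hr, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of mean absolute differences between Haar subbands.

    `weights` is a hypothetical per-subband weighting, not taken from the paper.
    """
    sr_bands = haar_dwt2(np.asarray(sr, dtype=np.float64))
    hr_bands = haar_dwt2(np.asarray(hr, dtype=np.float64))
    return float(sum(w * np.mean(np.abs(s - h))
                     for w, s, h in zip(weights, sr_bands, hr_bands)))

def pixel_plus_wavelet_loss(sr, hr, lambda_wavelet=0.1):
    """Pixel L1 loss plus a weighted wavelet term (assumed combination)."""
    sr = np.asarray(sr, dtype=np.float64)
    hr = np.asarray(hr, dtype=np.float64)
    pixel = float(np.mean(np.abs(sr - hr)))
    return pixel + lambda_wavelet * wavelet_loss(sr, hr)
```

In this sketch, identical super-resolved and ground-truth images give a loss of zero, and raising the weights on the three difference subbands would put more emphasis on high-frequency errors.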

Related papers

Enhanced Semantic Extraction and Guidance for UGC Image Super Resolution [18.058473238611725]
We propose a novel approach to image super-resolution by integrating semantic guidance into a diffusion framework. Our method addresses the inconsistency between degradations in wild and synthetic datasets. Our model won second place in the CVIRE 2025 Short-form Image Super-Resolution Challenge.
arXiv Detail & Related papers (2025-04-14T05:26:24Z)
CTSR: Controllable Fidelity-Realness Trade-off Distillation for Real-World Image Super Resolution [52.93785843453579]
Real-world image super-resolution is a critical image processing task, where two key evaluation criteria are the fidelity to the original image and the visual realness of the generated results. We propose a distillation-based approach that leverages the geometric decomposition of both fidelity and realness, alongside the performance advantages of multiple teacher models. Experiments conducted on several real-world image super-resolution benchmarks demonstrate that our method surpasses existing state-of-the-art approaches.
arXiv Detail & Related papers (2025-03-18T14:06:39Z)
Visual Autoregressive Modeling for Image Super-Resolution [14.935662351654601]
We propose a novel visual autoregressive modeling framework for image super-resolution (ISR) in the form of next-scale prediction. We collect large-scale data and design a training process to obtain robust generative priors.
arXiv Detail & Related papers (2025-01-31T09:53:47Z)
FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration [66.61201445650323]
Existing methods suffer from a generalization bottleneck in real-world scenarios. We contribute a million-scale dataset with two notable advantages over existing training data. We propose a robust model, FoundIR, to better address a broader range of restoration tasks in real-world scenarios.
arXiv Detail & Related papers (2024-12-02T12:08:40Z)
Degradation-Guided One-Step Image Super-Resolution with Diffusion Priors [75.24313405671433]
Diffusion-based image super-resolution (SR) methods have achieved remarkable success by leveraging large pre-trained text-to-image diffusion models as priors. We introduce a novel one-step SR model, which significantly addresses the efficiency issue of diffusion-based SR methods. Unlike existing fine-tuning strategies, we designed a degradation-guided Low-Rank Adaptation (LoRA) module specifically for SR.
arXiv Detail & Related papers (2024-09-25T16:15:21Z)
Towards Realistic Data Generation for Real-World Super-Resolution [58.88039242455039]
RealDGen is an unsupervised learning data generation framework designed for real-world super-resolution. We develop content and degradation extraction strategies, which are integrated into a novel content-degradation decoupled diffusion model. Experiments demonstrate that RealDGen excels in generating large-scale, high-quality paired data that mirrors real-world degradations.
arXiv Detail & Related papers (2024-06-11T13:34:57Z)
Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution [6.367865391518726]
Transformer-based models have achieved remarkable results in low-level vision tasks including image super-resolution (SR). To activate more input pixels globally, hybrid attention models have been proposed. We employ wavelet losses to train Transformer models to improve quantitative and subjective performance.
arXiv Detail & Related papers (2024-04-17T11:25:19Z)
DeeDSR: Towards Real-World Image Super-Resolution via Degradation-Aware Stable Diffusion [27.52552274944687]
We introduce a novel two-stage, degradation-aware framework that enhances the diffusion model's ability to recognize content and degradation in low-resolution images. In the first stage, we employ unsupervised contrastive learning to obtain representations of image degradations. In the second stage, we integrate a degradation-aware module into a simplified ControlNet, enabling flexible adaptation to various degradations.
arXiv Detail & Related papers (2024-03-31T12:07:04Z)
DifAugGAN: A Practical Diffusion-style Data Augmentation for GAN-based Single Image Super-resolution [88.13972071356422]
We propose a diffusion-style data augmentation scheme for GAN-based image super-resolution (SR) methods, known as DifAugGAN. It involves adapting the diffusion process in generative diffusion models for improving the calibration of the discriminator during training. Our DifAugGAN can be a Plug-and-Play strategy for current GAN-based SISR methods to improve the calibration of the discriminator and thus improve SR performance.
arXiv Detail & Related papers (2023-11-30T12:37:53Z)
Implicit Diffusion Models for Continuous Super-Resolution [65.45848137914592]
This paper introduces an Implicit Diffusion Model (IDM) for high-fidelity continuous image super-resolution. IDM integrates an implicit neural representation and a denoising diffusion model in a unified end-to-end framework. The scaling factor regulates the resolution and accordingly modulates the proportion of the LR information and generated features in the final output.
arXiv Detail & Related papers (2023-03-29T07:02:20Z)
Underwater Image Super-Resolution using Generative Adversarial Network-based Model [3.127436744845925]
Single image super-resolution (SISR) models are able to enhance the resolution and visual quality of underwater images. In this paper, we fine-tune the pre-trained Real-ESRGAN model for underwater image super-resolution.
arXiv Detail & Related papers (2022-11-07T13:38:28Z)
Uncovering the Over-smoothing Challenge in Image Super-Resolution: Entropy-based Quantification and Contrastive Optimization [67.99082021804145]
We propose an explicit solution to the COO problem, called Detail Enhanced Contrastive Loss (DECLoss). DECLoss utilizes the clustering property of contrastive learning to directly reduce the variance of the potential high-resolution distribution. We evaluate DECLoss on multiple super-resolution benchmarks and demonstrate that it improves the perceptual quality of PSNR-oriented models.
arXiv Detail & Related papers (2022-01-04T08:30:09Z)
A Generic Approach for Enhancing GANs by Regularized Latent Optimization [79.00740660219256]
We introduce a generic framework called generative-model inference that is capable of enhancing pre-trained GANs effectively and seamlessly. Our basic idea is to efficiently infer the optimal latent distribution for the given requirements using Wasserstein gradient flow techniques.
arXiv Detail & Related papers (2021-12-07T05:22:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.