CoSeR: Bridging Image and Language for Cognitive Super-Resolution
- URL: http://arxiv.org/abs/2311.16512v4
- Date: Thu, 21 Dec 2023 03:03:40 GMT
- Title: CoSeR: Bridging Image and Language for Cognitive Super-Resolution
- Authors: Haoze Sun, Wenbo Li, Jianzhuang Liu, Haoyu Chen, Renjing Pei, Xueyi
Zou, Youliang Yan, Yujiu Yang
- Abstract summary: We introduce the Cognitive Super-Resolution (CoSeR) framework, empowering SR models with the capacity to comprehend low-resolution images.
We achieve this by marrying image appearance and language understanding to generate a cognitive embedding.
To further improve image fidelity, we propose a novel condition injection scheme called "All-in-Attention"
- Score: 74.24752388179992
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing super-resolution (SR) models primarily focus on restoring local
texture details, often neglecting the global semantic information within the
scene. This oversight can lead to the omission of crucial semantic details or
the introduction of inaccurate textures during the recovery process. In our
work, we introduce the Cognitive Super-Resolution (CoSeR) framework, empowering
SR models with the capacity to comprehend low-resolution images. We achieve
this by marrying image appearance and language understanding to generate a
cognitive embedding, which not only activates prior information from large
text-to-image diffusion models but also facilitates the generation of
high-quality reference images to optimize the SR process. To further improve
image fidelity, we propose a novel condition injection scheme called
"All-in-Attention", consolidating all conditional information into a single
module. Consequently, our method successfully restores semantically correct and
photorealistic details, demonstrating state-of-the-art performance across
multiple benchmarks. Code: https://github.com/VINHYU/CoSeR
Related papers
- SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution [16.815468458589635]
We present a semantics-aware approach to better preserve the semantic fidelity of generative real-world image super-resolution.
First, we train a degradation-aware prompt extractor, which can generate accurate soft and hard semantic prompts even under strong degradation.
The experiments show that our method can reproduce more realistic image details and hold better the semantics.
arXiv Detail & Related papers (2023-11-27T18:11:19Z) - Recognition-Guided Diffusion Model for Scene Text Image Super-Resolution [15.391125077873745]
Scene Text Image Super-Resolution (STISR) aims to enhance the resolution and legibility of text within low-resolution (LR) images.
Previous methods predominantly employ discriminative Convolutional Neural Networks (CNNs) augmented with diverse forms of text guidance.
We introduce RGDiffSR, a Recognition-Guided Diffusion model for scene text image Super-Resolution, which exhibits great generative diversity and fidelity even in challenging scenarios.
arXiv Detail & Related papers (2023-11-22T11:10:45Z) - RBSR: Efficient and Flexible Recurrent Network for Burst
Super-Resolution [57.98314517861539]
Burst super-resolution (BurstSR) aims at reconstructing a high-resolution (HR) image from a sequence of low-resolution (LR) and noisy images.
In this paper, we suggest fusing cues frame-by-frame with an efficient and flexible recurrent network.
arXiv Detail & Related papers (2023-06-30T12:14:13Z) - Rethinking Super-Resolution as Text-Guided Details Generation [21.695227836312835]
We propose a Text-Guided Super-Resolution (TGSR) framework, which can effectively utilize the information from the text and image modalities.
The proposed TGSR could generate HR image details that match the text descriptions through a coarse-to-fine process.
arXiv Detail & Related papers (2022-07-14T01:46:38Z) - Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z) - Best-Buddy GANs for Highly Detailed Image Super-Resolution [71.13466303340192]
We consider the single image super-resolution (SISR) problem, where a high-resolution (HR) image is generated based on a low-resolution (LR) input.
Most methods along this line rely on a predefined single-LR-single-HR mapping, which is not flexible enough for the SISR task.
We propose best-buddy GANs (Beby-GAN) for rich-detail SISR. Relaxing the immutable one-to-one constraint, we allow the estimated patches to dynamically seek the best supervision.
arXiv Detail & Related papers (2021-03-29T02:58:27Z) - Interpretable Detail-Fidelity Attention Network for Single Image
Super-Resolution [89.1947690981471]
We propose a purposeful and interpretable detail-fidelity attention network to progressively process smoothes and details in divide-and-conquer manner.
Particularly, we propose a Hessian filtering for interpretable feature representation which is high-profile for detail inference.
Experiments demonstrate that the proposed methods achieve superior performances over the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-28T08:31:23Z) - Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration task.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.