Local-Global Context-Aware and Structure-Preserving Image Super-Resolution
- URL: http://arxiv.org/abs/2510.13649v1
- Date: Sat, 11 Oct 2025 07:17:31 GMT
- Title: Local-Global Context-Aware and Structure-Preserving Image Super-Resolution
- Authors: Sanchar Palit, Subhasis Chaudhuri, Biplab Banerjee,
- Abstract summary: Pretrained text-to-image models, such as Stable Diffusion, have exhibited strong capabilities in synthesizing realistic image content.<n>We propose a contextually precise image super-resolution framework that effectively maintains both local and global pixel relationships.
- Score: 23.87231269881077
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models have recently achieved significant success in various image manipulation tasks, including image super-resolution and perceptual quality enhancement. Pretrained text-to-image models, such as Stable Diffusion, have exhibited strong capabilities in synthesizing realistic image content, which makes them particularly attractive for addressing super-resolution tasks. While some existing approaches leverage these models to achieve state-of-the-art results, they often struggle when applied to diverse and highly degraded images, leading to noise amplification or incorrect content generation. To address these limitations, we propose a contextually precise image super-resolution framework that effectively maintains both local and global pixel relationships through Local-Global Context-Aware Attention, enabling the generation of high-quality images. Furthermore, we propose a distribution- and perceptual-aligned conditioning mechanism in the pixel space to enhance perceptual fidelity. This mechanism captures fine-grained pixel-level representations while progressively preserving and refining structural information, transitioning from local content details to the global structural composition. During inference, our method generates high-quality images that are structurally consistent with the original content, mitigating artifacts and ensuring realistic detail restoration. Extensive experiments on multiple super-resolution benchmarks demonstrate the effectiveness of our approach in producing high-fidelity, perceptually accurate reconstructions.
Related papers
- Edit2Perceive: Image Editing Diffusion Models Are Strong Dense Perceivers [55.15722080205737]
Edit2Perceive is a unified diffusion framework that adapts editing models for depth, normal, and matting.<n>Our single-step deterministic inference yields up to faster runtime while training on relatively small datasets.
arXiv Detail & Related papers (2025-11-24T01:13:51Z) - Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models [33.76031793753807]
We adapt the autoregressive multimodal model Lumina-mGPT into a robust Real-ISR model, namely PURE.<n>PURE Perceives and Understands the input low-quality image, then REstores its high-quality counterpart.<n> Experimental results demonstrate that PURE preserves image content while generating realistic details.
arXiv Detail & Related papers (2025-03-14T04:33:59Z) - Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation [12.588962705218103]
This paper introduces the Multi-Scale Diffusion (MSD), an optimized framework that extends the panoramic image generation framework to multiple resolution levels.<n>Our method leverages gradient descent techniques to incorporate structural information from low-resolution images into high-resolution outputs.
arXiv Detail & Related papers (2024-10-24T15:18:51Z) - DeeDSR: Towards Real-World Image Super-Resolution via Degradation-Aware Stable Diffusion [27.52552274944687]
We introduce a novel two-stage, degradation-aware framework that enhances the diffusion model's ability to recognize content and degradation in low-resolution images.
In the first stage, we employ unsupervised contrastive learning to obtain representations of image degradations.
In the second stage, we integrate a degradation-aware module into a simplified ControlNet, enabling flexible adaptation to various degradations.
arXiv Detail & Related papers (2024-03-31T12:07:04Z) - CoSeR: Bridging Image and Language for Cognitive Super-Resolution [74.24752388179992]
We introduce the Cognitive Super-Resolution (CoSeR) framework, empowering SR models with the capacity to comprehend low-resolution images.
We achieve this by marrying image appearance and language understanding to generate a cognitive embedding.
To further improve image fidelity, we propose a novel condition injection scheme called "All-in-Attention"
arXiv Detail & Related papers (2023-11-27T16:33:29Z) - Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z) - Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z) - High-Quality Pluralistic Image Completion via Code Shared VQGAN [51.7805154545948]
We present a novel framework for pluralistic image completion that can achieve both high quality and diversity at much faster inference speed.
Our framework is able to learn semantically-rich discrete codes efficiently and robustly, resulting in much better image reconstruction quality.
arXiv Detail & Related papers (2022-04-05T01:47:35Z) - Efficient texture-aware multi-GAN for image inpainting [5.33024001730262]
Recent GAN-based (Generative adversarial networks) inpainting methods show remarkable improvements.
We propose a multi-GAN architecture improving both the performance and rendering efficiency.
arXiv Detail & Related papers (2020-09-30T14:58:03Z) - Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration task.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.