InstructRestore: Region-Customized Image Restoration with Human Instructions
- URL: http://arxiv.org/abs/2503.24357v1
- Date: Mon, 31 Mar 2025 17:36:05 GMT
- Title: InstructRestore: Region-Customized Image Restoration with Human Instructions
- Authors: Shuaizheng Liu, Jianqi Ma, Lingchen Sun, Xiangtao Kong, Lei Zhang
- Abstract summary: We propose a new framework, namely InstructRestore, to perform region-adjustable image restoration following human instructions. We first develop a data generation engine to produce training triplets, each consisting of a high-quality image, the target region description, and the corresponding region mask. We then examine how to integrate the low-quality image features under the ControlNet architecture to adjust the degree of image details enhancement.
- Score: 11.32695520392065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the significant progress in diffusion prior-based image restoration, most existing methods apply uniform processing to the entire image, lacking the capability to perform region-customized image restoration according to user instructions. In this work, we propose a new framework, namely InstructRestore, to perform region-adjustable image restoration following human instructions. To achieve this, we first develop a data generation engine to produce training triplets, each consisting of a high-quality image, the target region description, and the corresponding region mask. With this engine and careful data screening, we construct a comprehensive dataset comprising 536,945 triplets to support the training and evaluation of this task. We then examine how to integrate the low-quality image features under the ControlNet architecture to adjust the degree of image details enhancement. Consequently, we develop a ControlNet-like model to identify the target region and allocate different integration scales to the target and surrounding regions, enabling region-customized image restoration that aligns with user instructions. Experimental results demonstrate that our proposed InstructRestore approach enables effective human-instructed image restoration, such as images with bokeh effects and user-instructed local enhancement. Our work advances the investigation of interactive image restoration and enhancement techniques. Data, code, and models will be found at https://github.com/shuaizhengliu/InstructRestore.git.
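The abstract describes allocating different integration scales to the target region and its surroundings. A minimal sketch of that idea follows, assuming a simple additive fusion in which a ControlNet-style residual is added to base features with a per-pixel scale chosen by the region mask. The function name `region_scaled_fusion`, its parameters, and the additive form are illustrative assumptions, not the authors' implementation.

```python
from typing import List

Grid = List[List[float]]


def region_scaled_fusion(
    base: Grid,
    control: Grid,
    mask: Grid,
    target_scale: float,
    background_scale: float,
) -> Grid:
    """Add control residuals to base features with a mask-dependent scale.

    Hypothetical helper: mask values lie in [0, 1], where 1 marks the
    target region and 0 marks the surrounding region.
    """
    fused: Grid = []
    for base_row, control_row, mask_row in zip(base, control, mask):
        row = []
        for b, c, m in zip(base_row, control_row, mask_row):
            # Blend the two scales so soft mask edges transition smoothly.
            scale = m * target_scale + (1.0 - m) * background_scale
            row.append(b + scale * c)
        fused.append(row)
    return fused


# Toy 2x2 feature map: left column is the target, right column the background.
base = [[1.0, 1.0], [1.0, 1.0]]
control = [[0.5, 0.5], [0.5, 0.5]]
mask = [[1.0, 0.0], [1.0, 0.0]]
fused = region_scaled_fusion(base, control, mask, target_scale=1.0, background_scale=0.2)
print(fused)
```

With a larger scale inside the mask, the target region receives more of the control signal than the background. How the scales map to the degree of detail enhancement in the actual model is defined by the paper, not by this sketch.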
Related papers
- PromptFix: You Prompt and We Fix the Photo [84.69812824355269]
Diffusion models equipped with language models demonstrate excellent controllability in image generation tasks.
The lack of diverse instruction-following data hampers the development of models.
We propose PromptFix, a framework that enables diffusion models to follow human instructions.
arXiv Detail & Related papers (2024-05-27T03:13:28Z)
- InstructIR: High-Quality Image Restoration Following Human Instructions [61.1546287323136]
We present the first approach that uses human-written instructions to guide the image restoration model.
Our method, InstructIR, achieves state-of-the-art results on several restoration tasks.
arXiv Detail & Related papers (2024-01-29T18:53:33Z)
- Restoration by Generation with Constrained Priors [25.906981634736795]
We propose a method to adapt a pretrained diffusion model for image restoration by simply adding noise to the input image to be restored and then denoising it.
We show superior performances on multiple real-world restoration datasets in preserving identity and image quality.
This approach allows us to produce results that accurately preserve high-frequency details, which previous works are unable to do.
arXiv Detail & Related papers (2023-12-28T17:50:54Z)
- SPIRE: Semantic Prompt-Driven Image Restoration [66.26165625929747]
We develop SPIRE, a Semantic and restoration Prompt-driven Image Restoration framework.
Our approach is the first framework that supports fine-level instruction through language-based quantitative specification of the restoration strength.
Our experiments demonstrate the superior restoration performance of SPIRE compared to the state of the art.
arXiv Detail & Related papers (2023-12-18T17:02:30Z)
- PRISM: Progressive Restoration for Scene Graph-based Image Manipulation [47.77003316561398]
PRISM is a novel multi-head image manipulation approach to improve the accuracy and quality of the manipulated regions in the scene.
Our results demonstrate the potential of our approach for enhancing the quality and precision of scene graph-based image manipulation.
arXiv Detail & Related papers (2023-11-03T21:30:34Z)
- Revisiting Image Reconstruction for Semi-supervised Semantic Segmentation [16.27277238968567]
We revisit the idea of using image reconstruction as an auxiliary task and incorporate it with a modern semi-supervised semantic segmentation framework.
Surprisingly, we discover that such an old idea in semi-supervised learning can produce results competitive with state-of-the-art semantic segmentation algorithms.
arXiv Detail & Related papers (2023-03-17T06:31:06Z)
- Efficient and Explicit Modelling of Image Hierarchies for Image Restoration [120.35246456398738]
We propose a mechanism to efficiently and explicitly model image hierarchies in the global, regional, and local range for image restoration.
Inspired by that, we propose the anchored stripe self-attention which achieves a good balance between the space and time complexity of self-attention.
Then we propose a new network architecture dubbed GRL to explicitly model image hierarchies in the Global, Regional, and Local range.
arXiv Detail & Related papers (2023-03-01T18:59:29Z)
- Image Restoration using Feature-guidance [43.02281823557039]
We present a new approach suitable for handling the image-specific and spatially-varying nature of degradation in images.
We decompose the restoration task into two stages of degradation localization and degraded region-guided restoration.
We demonstrate that the model trained for this auxiliary task contains vital region knowledge, which can be exploited to guide the restoration network's training.
arXiv Detail & Related papers (2022-01-01T13:10:19Z)
- GLocal: Global Graph Reasoning and Local Structure Transfer for Person Image Generation [2.580765958706854]
We focus on person image generation, namely, generating person images under various conditions, e.g., corrupted texture or different pose.
We present a GLocal framework to improve the occlusion-aware texture estimation by globally reasoning the style inter-correlations among different semantic regions.
For local structural information preservation, we further extract the local structure of the source image and regain it in the generated image via local structure transfer.
arXiv Detail & Related papers (2021-12-01T03:54:30Z)
- Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.