Hybrid Agents for Image Restoration
- URL: http://arxiv.org/abs/2503.10120v1
- Date: Thu, 13 Mar 2025 07:28:33 GMT
- Title: Hybrid Agents for Image Restoration
- Authors: Bingchen Li, Xin Li, Yiting Lu, Zhibo Chen
- Abstract summary: We present HybridAgent, intending to incorporate multiple restoration modes into a unified image restoration model. The fast restoration agent is built on a lightweight large language model (LLM) and uses in-context learning to understand user prompts. We introduce the mixed distortion removal mode for our HybridAgents, which is crucial but has not been addressed in previous agent-based works.
- Score: 16.534263448775103
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing Image Restoration (IR) studies typically focus on task-specific or universal modes individually, relying on the mode selection of users and lacking cooperation between multiple task-specific/universal restoration modes. This leads to insufficient interaction for non-expert users and limits their restoration capability for complicated real-world applications. In this work, we present HybridAgent, intending to incorporate multiple restoration modes into a unified image restoration model and achieve intelligent and efficient user interaction through our proposed hybrid agents. Concretely, we propose the hybrid rule of fast, slow, and feedback restoration agents. Here, the slow restoration agent optimizes a powerful multimodal large language model (MLLM) with our proposed instruction-tuning dataset to identify degradations within images accompanied by ambiguous user prompts and invokes proper restoration tools accordingly. The fast restoration agent is built on a lightweight large language model (LLM) and uses in-context learning to understand user prompts with simple and clear requirements, which avoids the unnecessary time/resource costs of the MLLM. Moreover, we introduce the mixed distortion removal mode for our HybridAgents, which is crucial but has not been addressed in previous agent-based works. It can effectively prevent the error propagation of step-by-step image restoration and largely improve the efficiency of the agent system. We validate the effectiveness of HybridAgent on both synthetic and real-world IR tasks.
Related papers
- Beyond Degradation Redundancy: Contrastive Prompt Learning for All-in-One Image Restoration [109.38288333994407]
Contrastive Prompt Learning (CPL) is a novel framework that fundamentally enhances prompt-task alignment.
Our framework establishes new state-of-the-art performance while maintaining parameter efficiency, offering a principled solution for unified image restoration.
arXiv Detail & Related papers (2025-04-14T08:24:57Z)
- Multi-Agent Image Restoration [9.614197636859435]
We propose MAIR, a novel Multi-Agent approach for complex IR problems.
Built upon a three-stage restoration framework, MAIR emulates a team of collaborative human specialists.
MAIR achieves competitive performance and improved efficiency over the previous agentic IR system.
arXiv Detail & Related papers (2025-03-12T13:53:57Z)
- UniRestore: Unified Perceptual and Task-Oriented Image Restoration Model Using Diffusion Prior [56.35236964617809]
Image restoration aims to recover content from inputs degraded by various factors, such as adverse weather, blur, and noise.
This paper introduces UniRestore, a unified image restoration model that bridges the gap between PIR and TIR.
We propose a Complementary Feature Restoration Module (CFRM) to reconstruct degraded encoder features and a Task Feature Adapter (TFA) module to facilitate adaptive feature fusion in the decoder.
arXiv Detail & Related papers (2025-01-22T08:06:48Z)
- An Intelligent Agentic System for Complex Image Restoration Problems [39.93819777300997]
AgenticIR mimics the human approach to image processing by following five key stages: Perception, Scheduling, Execution, Reflection, and Rescheduling.
We employ large language models (LLMs) and vision-language models (VLMs) that interact via text generation to operate a toolbox of IR models.
Experiments demonstrate AgenticIR's potential in handling complex IR tasks, representing a promising path toward achieving general intelligence in visual processing.
arXiv Detail & Related papers (2024-10-23T12:11:26Z)
- LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration [62.3751291442432]
We propose LoRA-IR, a flexible framework that dynamically leverages compact low-rank experts to facilitate efficient all-in-one image restoration.
LoRA-IR consists of two training stages: degradation-guided pre-training and parameter-efficient fine-tuning.
Experiments demonstrate that LoRA-IR achieves SOTA performance across 14 IR tasks and 29 benchmarks, while maintaining computational efficiency.
arXiv Detail & Related papers (2024-10-20T13:00:24Z)
- UIR-LoRA: Achieving Universal Image Restoration through Multiple Low-Rank Adaptation [50.27688690379488]
Existing unified methods treat multi-degradation image restoration as a multi-task learning problem.
We propose a universal image restoration framework based on multiple low-rank adapters (LoRA) from multi-domain transfer learning.
Our framework leverages the pre-trained generative model as the shared component for multi-degradation restoration and transfers it to specific degradation image restoration tasks.
arXiv Detail & Related papers (2024-09-30T11:16:56Z)
- RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models [45.88103575837924]
We introduce RestoreAgent, an intelligent image restoration system leveraging multimodal large language models.
RestoreAgent autonomously assesses the type and extent of degradation in input images and performs restoration through (1) determining the appropriate restoration tasks, (2) optimizing the task sequence, (3) selecting the most suitable models, and (4) executing the restoration.
Experimental results demonstrate the superior performance of RestoreAgent in handling complex degradation, surpassing human experts.
arXiv Detail & Related papers (2024-07-25T13:29:37Z)
- Restorer: Removing Multi-Degradation with All-Axis Attention and Prompt Guidance [12.066756224383827]
Restorer is a novel Transformer-based all-in-one image restoration model.
It can handle composite degradation in real-world scenarios without requiring additional training.
It is efficient during inference, suggesting the potential in real-world applications.
arXiv Detail & Related papers (2024-06-18T13:18:32Z)
- Unified-Width Adaptive Dynamic Network for All-In-One Image Restoration [50.81374327480445]
We introduce a novel concept positing that intricate image degradation can be represented in terms of elementary degradation.
We propose the Unified-Width Adaptive Dynamic Network (U-WADN), consisting of two pivotal components: a Width Adaptive Backbone (WAB) and a Width Selector (WS).
The proposed U-WADN achieves better performance while simultaneously reducing up to 32.3% of FLOPs and providing approximately 15.7% real-time acceleration.
arXiv Detail & Related papers (2024-01-24T04:25:12Z)
- Multimodal Prompt Perceiver: Empower Adaptiveness, Generalizability and Fidelity for All-in-One Image Restoration [58.11518043688793]
MPerceiver is a novel approach to enhance adaptiveness, generalizability and fidelity for all-in-one image restoration.
MPerceiver is trained on 9 tasks for all-in-one IR and outperforms state-of-the-art task-specific methods across most tasks.
arXiv Detail & Related papers (2023-12-05T17:47:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.