FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration
- URL: http://arxiv.org/abs/2412.01427v1
- Date: Mon, 02 Dec 2024 12:08:40 GMT
- Title: FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration
- Authors: Hao Li, Xiang Chen, Jiangxin Dong, Jinhui Tang, Jinshan Pan,
- Abstract summary: Existing methods suffer from a generalization bottleneck in real-world scenarios.
We contribute a million-scale dataset with two notable advantages over existing training data.
We propose a robust model, FoundIR, to better address a broader range of restoration tasks in real-world scenarios.
- Score: 66.61201445650323
- License:
- Abstract: Despite the significant progress made by all-in-one models in universal image restoration, existing methods suffer from a generalization bottleneck in real-world scenarios, as they are mostly trained on small-scale synthetic datasets with limited degradations. Therefore, large-scale high-quality real-world training data is urgently needed to facilitate the emergence of foundational models for image restoration. To advance this field, we spare no effort in contributing a million-scale dataset with two notable advantages over existing training data: real-world samples with larger-scale, and degradation types with higher diversity. By adjusting internal camera settings and external imaging conditions, we can capture aligned image pairs using our well-designed data acquisition system over multiple rounds and our data alignment criterion. Moreover, we propose a robust model, FoundIR, to better address a broader range of restoration tasks in real-world scenarios, taking a further step toward foundation models. Specifically, we first utilize a diffusion-based generalist model to remove degradations by learning the degradation-agnostic common representations from diverse inputs, where incremental learning strategy is adopted to better guide model training. To refine the model's restoration capability in complex scenarios, we introduce degradation-aware specialist models for achieving final high-quality results. Extensive experiments show the value of our dataset and the effectiveness of our method.
Related papers
- UniDemoiré: Towards Universal Image Demoiréing with Data Generation and Synthesis [17.930454451440944]
Image demoir'eing poses one of the most formidable challenges in image restoration.
We propose a universal image demoir'eing solution, UniDemoir'e, which has superior generalization capability.
arXiv Detail & Related papers (2025-02-10T10:20:11Z) - Adaptive Blind All-in-One Image Restoration [15.726917603679716]
Blind all-in-one image restoration models aim to recover a high-quality image from an input degraded with unknown distortions.
These models require all the possible degradation types to be defined during the training stage while showing limited generalization to unseen degradations.
We propose a simple but effective adaptive blind all-in-one restoration model, which can address multiple degradations, generalizes well to unseen degradations, and efficiently incorporate new degradations by training a small fraction of parameters.
arXiv Detail & Related papers (2024-11-27T14:58:08Z) - DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation [46.22939360256696]
We present a dual strategy: GenIR, an innovative data curation pipeline, and DreamClear, a cutting-edge Diffusion Transformer (DiT)-based image restoration model.
GenIR, our pioneering contribution, is a dual-prompt learning pipeline that overcomes the limitations of existing datasets.
DreamClear, is a DiT-based image restoration model. It utilizes the generative priors of text-to-image (T2I) diffusion models and the robust perceptual capabilities of multi-modal large language models (MLLMs) to achieve restoration.
arXiv Detail & Related papers (2024-10-24T11:57:20Z) - Towards Realistic Data Generation for Real-World Super-Resolution [58.99206459754721]
RealDGen is an unsupervised learning data generation framework designed for real-world super-resolution.
We develop content and degradation extraction strategies, which are integrated into a novel content-degradation decoupled diffusion model.
Experiments demonstrate that RealDGen excels in generating large-scale, high-quality paired data that mirrors real-world degradations.
arXiv Detail & Related papers (2024-06-11T13:34:57Z) - Fantastic Gains and Where to Find Them: On the Existence and Prospect of
General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z) - Diffusion Models for Image Restoration and Enhancement -- A
Comprehensive Survey [96.99328714941657]
We present a comprehensive review of recent diffusion model-based methods on image restoration.
We classify and emphasize the innovative designs using diffusion models for both IR and blind/real-world IR.
We propose five potential and challenging directions for the future research of diffusion model-based IR.
arXiv Detail & Related papers (2023-08-18T08:40:38Z) - DINOv2: Learning Robust Visual Features without Supervision [75.42921276202522]
This work shows that existing pretraining methods, especially self-supervised methods, can produce such features if trained on enough curated data from diverse sources.
Most of the technical contributions aim at accelerating and stabilizing the training at scale.
In terms of data, we propose an automatic pipeline to build a dedicated, diverse, and curated image dataset instead of uncurated data, as typically done in the self-supervised literature.
arXiv Detail & Related papers (2023-04-14T15:12:19Z) - Real-World Image Super-Resolution by Exclusionary Dual-Learning [98.36096041099906]
Real-world image super-resolution is a practical image restoration problem that aims to obtain high-quality images from in-the-wild input.
Deep learning-based methods have achieved promising restoration quality on real-world image super-resolution datasets.
We propose Real-World image Super-Resolution by Exclusionary Dual-Learning (RWSR-EDL) to address the feature diversity in perceptual- and L1-based cooperative learning.
arXiv Detail & Related papers (2022-06-06T13:28:15Z) - Single Image Internal Distribution Measurement Using Non-Local
Variational Autoencoder [11.985083962982909]
This paper proposes a novel image-specific solution, namely non-local variational autoencoder (textttNLVAE)
textttNLVAE is introduced as a self-supervised strategy that reconstructs high-resolution images using disentangled information from the non-local neighbourhood.
Experimental results from seven benchmark datasets demonstrate the effectiveness of the textttNLVAE model.
arXiv Detail & Related papers (2022-04-02T18:43:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.