Multi-modal Datasets for Super-resolution
- URL: http://arxiv.org/abs/2004.05804v1
- Date: Mon, 13 Apr 2020 07:39:52 GMT
- Title: Multi-modal Datasets for Super-resolution
- Authors: Haoran Li, Weihong Quan, Meijun Yan, Jin Zhang, Xiaoli Gong and Jin Zhou
- Abstract summary: We propose real-world black-and-white old photo datasets for super-resolution (OID-RW)
The dataset contains 82 groups of images, including 22 groups of character type and 60 groups of landscape and architecture.
We also propose a multi-modal degradation dataset (MDD400) to solve the super-resolution reconstruction in real-life image degradation scenarios.
- Score: 12.079245552387361
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, most datasets used to train and evaluate super-resolution
models are single-modal simulation datasets. However, due to the variety of
image degradation types in the real world, models trained on single-modal
simulation datasets do not always have good robustness and generalization
ability in different degradation scenarios. Previous work tended to focus only
on true-color images. In contrast, we first propose a real-world
black-and-white old photo dataset for super-resolution (OID-RW), which is
constructed using two methods: manually filling pixels and shooting with
different cameras. The dataset contains 82 groups of images, including 22
groups of character type and 60 groups of landscape and architecture. At the
same time, we also propose a multi-modal degradation dataset (MDD400) to
address super-resolution reconstruction in real-life image degradation
scenarios. We simulate the process of generating degraded images with the
following four methods: interpolation algorithms, CNNs, GANs, and capturing
videos at different bit rates. Our experiments demonstrate that models trained
on our datasets not only have better generalization capability and robustness,
but also reconstruct images that maintain better edge contours and texture
features.
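The abstract names four degradation-generation methods but gives no implementation details. As a concrete illustration of the simplest one, interpolation-based degradation, here is a minimal Python sketch assuming OpenCV; the scale factor, interpolation kernel, and file names are illustrative choices, not the settings used to build MDD400.

```python
import cv2


def degrade_by_interpolation(img, scale=4, interp=cv2.INTER_CUBIC):
    """Produce a low-resolution counterpart of a high-resolution image
    by interpolation-based downsampling. The scale factor and kernel
    are illustrative, not MDD400's actual settings."""
    h, w = img.shape[:2]
    return cv2.resize(img, (w // scale, h // scale), interpolation=interp)


if __name__ == "__main__":
    hr = cv2.imread("example_hr.png")  # hypothetical input path
    lr = degrade_by_interpolation(hr, scale=4)
    cv2.imwrite("example_lr.png", lr)
```

The CNN-, GAN-, and video-based variants described in the abstract would replace the resize call with a learned degradation model or a frame re-encoded at a lower bit rate.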
Related papers
- FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration [66.61201445650323]
Existing methods suffer from a generalization bottleneck in real-world scenarios.
We contribute a million-scale dataset with two notable advantages over existing training data.
We propose a robust model, FoundIR, to better address a broader range of restoration tasks in real-world scenarios.
arXiv Detail & Related papers (2024-12-02T12:08:40Z)
- Community Forensics: Using Thousands of Generators to Train Fake Image Detectors [15.166026536032142]
One of the key challenges of detecting AI-generated images is spotting images that have been created by previously unseen generative models.
We propose a new dataset that is significantly larger and more diverse than prior work.
The resulting dataset contains 2.7M images that have been sampled from 4803 different models.
arXiv Detail & Related papers (2024-11-06T18:59:41Z)
- DreamClear: High-Capacity Real-World Image Restoration with Privacy-Safe Dataset Curation [46.22939360256696]
We present a dual strategy: GenIR, an innovative data curation pipeline, and DreamClear, a cutting-edge Diffusion Transformer (DiT)-based image restoration model.
GenIR, our pioneering contribution, is a dual-prompt learning pipeline that overcomes the limitations of existing datasets.
DreamClear is a DiT-based image restoration model. It utilizes the generative priors of text-to-image (T2I) diffusion models and the robust perceptual capabilities of multi-modal large language models (MLLMs) to achieve restoration.
arXiv Detail & Related papers (2024-10-24T11:57:20Z)
- PixelBytes: Catching Unified Representation for Multimodal Generation [0.0]
PixelBytes is an approach for unified multimodal representation learning.
We explore integrating text, audio, action-state, and pixelated images (sprites) into a cohesive representation.
We conducted experiments on a PixelBytes Pokemon dataset and an Optimal-Control dataset.
arXiv Detail & Related papers (2024-09-16T09:20:13Z)
- Exposure Bracketing is All You Need for Unifying Image Restoration and Enhancement Tasks [50.822601495422916]
We propose to utilize exposure bracketing photography to unify image restoration and enhancement tasks.
Due to the difficulty in collecting real-world pairs, we suggest a solution that first pre-trains the model with synthetic paired data.
In particular, a temporally modulated recurrent network (TMRNet) and self-supervised adaptation method are proposed.
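The summary mentions pre-training on synthetic paired data. One generic way to obtain such pairs is to synthesize an exposure-bracketed stack from a single clean image, as in the sketch below; the gamma, exposure ratios, and noise level are illustrative assumptions, not the paper's pipeline or TMRNet.

```python
import numpy as np


def synthesize_bracket(clean, ratios=(1 / 8, 1 / 4, 1 / 2, 1, 2),
                       read_noise=0.01, rng=None):
    """Build a synthetic exposure-bracketed stack from one clean image.
    `clean` is float32 in [0, 1]; constants are illustrative only."""
    rng = rng if rng is not None else np.random.default_rng(0)
    linear = clean.astype(np.float32) ** 2.2          # undo display gamma
    stack = []
    for r in ratios:
        exposed = np.clip(linear * r, 0.0, 1.0)       # simulate exposure time
        noisy = exposed + rng.normal(0.0, read_noise, exposed.shape)
        stack.append(np.clip(noisy, 0.0, 1.0) ** (1 / 2.2))
    return stack
```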
arXiv Detail & Related papers (2024-01-01T14:14:35Z)
- Towards Real-World Burst Image Super-Resolution: Benchmark and Method [93.73429028287038]
In this paper, we establish a large-scale real-world burst super-resolution dataset, i.e., RealBSR, to explore the faithful reconstruction of image details from multiple frames.
We also introduce a Federated Burst Affinity network (FBAnet) to investigate non-trivial pixel-wise displacement among images under real-world image degradation.
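As a rough illustration of the displacement problem this benchmark targets, the sketch below estimates a single global sub-pixel shift per burst frame with OpenCV's phase correlation; FBAnet itself models pixel-wise displacement, which a global translation only coarsely approximates.

```python
import cv2
import numpy as np


def estimate_burst_shifts(frames):
    """Estimate a global sub-pixel (dx, dy) shift of each BGR frame
    relative to the first one. A single translation per frame is only
    a coarse stand-in for true pixel-wise displacement."""
    ref = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY).astype(np.float32)
    shifts = []
    for f in frames[1:]:
        cur = cv2.cvtColor(f, cv2.COLOR_BGR2GRAY).astype(np.float32)
        (dx, dy), _response = cv2.phaseCorrelate(ref, cur)
        shifts.append((dx, dy))
    return shifts
```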
arXiv Detail & Related papers (2023-09-09T14:11:37Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
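As a toy illustration of combining multi-scale context while preserving full-resolution detail, here is a minimal PyTorch block with one full-resolution stream and one half-resolution stream; it conveys the idea only and is not the paper's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleFusion(nn.Module):
    """Two-stream toy block: a full-resolution stream keeps spatial
    detail while a half-resolution stream contributes context."""

    def __init__(self, ch=64):
        super().__init__()
        self.full = nn.Conv2d(ch, ch, 3, padding=1)
        self.half = nn.Conv2d(ch, ch, 3, padding=1)
        self.fuse = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, x):
        hi = F.relu(self.full(x))                    # full-res detail
        lo = F.relu(self.half(F.avg_pool2d(x, 2)))   # half-res context
        lo = F.interpolate(lo, size=hi.shape[-2:], mode="bilinear",
                           align_corners=False)
        return self.fuse(torch.cat([hi, lo], dim=1)) + x  # residual fusion


# usage: y = MultiScaleFusion(64)(torch.randn(1, 64, 32, 32))
```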
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
- Any-resolution Training for High-resolution Image Synthesis [55.19874755679901]
Generative models operate at fixed resolution, even though natural images come in a variety of sizes.
We argue that every pixel matters and create datasets with variable-size images, collected at their native resolutions.
We introduce continuous-scale training, a process that samples patches at random scales to train a new generator with variable output resolutions.
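The sampling step described above can be sketched in a few lines: resize a native-resolution image by a random factor, then crop a fixed-size patch. The scale range and patch size below are illustrative assumptions, not the paper's settings.

```python
import random

from PIL import Image


def sample_scale_patch(img, patch=256, min_scale=0.25):
    """Sample a fixed-size patch at a random scale from a
    native-resolution image (continuous-scale sampling sketch)."""
    s = random.uniform(min_scale, 1.0)
    w, h = img.size
    scaled = img.resize((max(patch, int(w * s)), max(patch, int(h * s))),
                        Image.BICUBIC)
    sw, sh = scaled.size
    x = random.randint(0, sw - patch)
    y = random.randint(0, sh - patch)
    return scaled.crop((x, y, x + patch, y + patch))
```

During training, the sampled scale would additionally condition the generator so it learns variable output resolutions.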
arXiv Detail & Related papers (2022-04-14T17:59:31Z)
- Exploiting Raw Images for Real-Scene Super-Resolution [105.18021110372133]
We study the problem of real-scene single image super-resolution to bridge the gap between synthetic data and real captured images.
We propose a method to generate more realistic training data by mimicking the imaging process of digital cameras.
We also develop a two-branch convolutional neural network to exploit the radiance information originally-recorded in raw images.
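A rough stand-in for "mimicking the imaging process" is sketched below: linearize sRGB, bin pixels as a proxy for sensor sampling, add shot and read noise, and re-apply gamma. All constants are illustrative; the paper's pipeline works with actual raw data and is more detailed.

```python
import numpy as np


def simulate_camera_lr(rgb, rng=None):
    """Degrade a clean sRGB image (float32 in [0, 1]) with a crude
    camera-pipeline approximation. Constants are illustrative only."""
    rng = rng if rng is not None else np.random.default_rng(0)
    lin = rgb.astype(np.float32) ** 2.2                   # undo sRGB gamma
    h, w = lin.shape[:2]
    lin = lin[: h // 2 * 2, : w // 2 * 2]
    binned = lin.reshape(h // 2, 2, w // 2, 2, 3).mean(axis=(1, 3))
    noisy = binned + rng.normal(0.0, 0.02, binned.shape)  # read noise
    noisy += rng.poisson(binned * 100.0) / 100.0 - binned  # shot noise
    return np.clip(noisy, 0.0, 1.0) ** (1 / 2.2)          # re-apply gamma
```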
arXiv Detail & Related papers (2021-02-02T16:10:15Z)
- Bayesian Image Reconstruction using Deep Generative Models [7.012708932320081]
In this work, we leverage state-of-the-art (SOTA) generative models for building powerful image priors.
Our method, called Bayesian Reconstruction through Generative Models (BRGM), uses a single pre-trained generator model to solve different image restoration tasks.
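The core mechanism, a fixed pre-trained generator serving as an image prior, can be sketched as latent optimization: find a code whose generated image, after the known degradation, matches the observation. `G`, `degrade`, and `y` are assumed inputs, and the simple Gaussian prior term below stands in for BRGM's full Bayesian formulation.

```python
import torch


def restore_with_generator(G, degrade, y, z_dim=512, steps=300, lr=0.05):
    """Generator-prior restoration sketch: optimize a latent code so the
    degraded generator output matches observation `y`. Generic
    GAN-inversion idea, not BRGM's exact loss."""
    z = torch.randn(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(degrade(G(z)), y)
        loss = loss + 1e-3 * z.square().mean()  # Gaussian prior on z
        loss.backward()
        opt.step()
    with torch.no_grad():
        return G(z)
```

Because only the latent code changes per task, swapping `degrade` (e.g., downsampling vs. masking) reuses the same pre-trained generator for different restoration problems.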
arXiv Detail & Related papers (2020-12-08T17:11:26Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.