SIDAR: Synthetic Image Dataset for Alignment & Restoration
- URL: http://arxiv.org/abs/2305.12036v1
- Date: Fri, 19 May 2023 23:32:06 GMT
- Title: SIDAR: Synthetic Image Dataset for Alignment & Restoration
- Authors: Monika Kwiatkowski, Simon Matern, Olaf Hellwich
- Abstract summary: There is a lack of datasets that provide enough data to train and evaluate end-to-end deep learning models.
Our proposed data augmentation helps to overcome the issue of data scarcity by using 3D rendering.
The resulting dataset can serve as a training and evaluation set for a multitude of tasks involving image alignment and artifact removal.
- Score: 2.9649783577150837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image alignment and image restoration are classical computer vision tasks.
However, there is still a lack of datasets that provide enough data to train
and evaluate end-to-end deep learning models. Obtaining ground-truth data for
image alignment requires sophisticated structure-from-motion methods or optical
flow systems, which often do not provide enough data variance: they typically
yield a high number of image correspondences while introducing only a few
changes of scenery within the underlying image sequences. Alternative
approaches utilize random perspective distortions on existing image data.
However, this only provides trivial distortions, lacking the complexity and
variance of real-world scenarios. Instead, our proposed data augmentation helps
to overcome the issue of data scarcity by using 3D rendering: images are added
as textures onto a plane, then varying lighting conditions, shadows, and
occlusions are added to the scene. The scene is rendered from multiple
viewpoints, generating perspective distortions more consistent with real-world
scenarios, with homographies closely resembling those of camera projections
rather than randomized homographies. For each scene, we provide a sequence of
distorted images with corresponding occlusion masks, homographies, and
ground-truth labels. The resulting dataset can serve as a training and
evaluation set for a multitude of tasks involving image alignment and artifact
removal, such as deep homography estimation, dense image matching, 2D bundle
adjustment, inpainting, shadow removal, denoising, content retrieval, and
background subtraction. Our data generation pipeline is customizable and can be
applied to any existing dataset, serving as a data augmentation to further
improve the feature learning of any existing method.
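Because the distorted views are true camera projections of a textured plane, the resulting homographies follow the classical plane-induced homography relation H = K2 (R + t nᵀ / d) K1⁻¹. The following numpy sketch illustrates that textbook relation for intuition; it is not code from the SIDAR pipeline, and the function name, camera intrinsics, and pose values are all illustrative assumptions.

```python
import numpy as np

def plane_induced_homography(K1, K2, R, t, n, d):
    """Homography mapping pixels in view 1 to view 2 for points on the
    plane {X : n . X = d}, expressed in camera-1 coordinates.
    (R, t) takes camera-1 coordinates to camera-2 coordinates
    (X' = R @ X + t). Textbook result: H = K2 (R + t n^T / d) K1^-1."""
    H = K2 @ (R + np.outer(t, n) / d) @ np.linalg.inv(K1)
    return H / H[2, 2]  # scale so that H[2, 2] == 1

# Illustrative values (not from the paper): a 640x480 pinhole camera
# viewing a textured plane 2 units away; the second view is rotated
# 5 degrees about the y-axis and shifted 0.1 units sideways.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
theta = np.deg2rad(5.0)
R = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
              [ 0.0,           1.0, 0.0          ],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.1, 0.0, 0.0])
n = np.array([0.0, 0.0, 1.0])  # plane normal, facing camera 1
d = 2.0                        # plane distance along the optical axis

H = plane_induced_homography(K, K, R, t, n, d)
p1 = np.array([320.0, 240.0, 1.0])  # image centre, homogeneous pixels
p2 = H @ p1
print(p2[:2] / p2[2])  # where that pixel lands in the second view
```

Randomized-homography augmentation instead perturbs the eight parameters of H directly, which need not correspond to any physically realizable camera motion; sampling a pose (R, t) keeps the distortions consistent with camera projections, which is the point the abstract makes.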
Related papers
- MegaScenes: Scene-Level View Synthesis at Scale [69.21293001231993]
Scene-level novel view synthesis (NVS) is fundamental to many vision and graphics applications.
We create a large-scale scene-level dataset from Internet photo collections, called MegaScenes, which contains over 100K structure-from-motion (SfM) reconstructions from around the world.
We analyze failure cases of state-of-the-art NVS methods and significantly improve generation consistency.
arXiv Detail & Related papers (2024-06-17T17:55:55Z)
- Deep Image Composition Meets Image Forgery [0.0]
Image forgery has been studied for many years.
Deep learning models require large amounts of labeled data for training.
We use state-of-the-art deep image composition models to generate spliced images close in quality to real-life manipulations.
arXiv Detail & Related papers (2024-04-03T17:54:37Z)
- An evaluation of Deep Learning based stereo dense matching dataset shift from aerial images and a large scale stereo dataset [2.048226951354646]
We present a method for generating ground-truth disparity maps directly from Light Detection and Ranging (LiDAR) data and images.
We evaluate 11 dense matching methods across datasets with diverse scene types, image resolutions, and geometric configurations.
arXiv Detail & Related papers (2024-02-19T20:33:46Z)
- Exposure Bracketing is All You Need for Unifying Image Restoration and Enhancement Tasks [50.822601495422916]
We propose to utilize exposure bracketing photography to unify image restoration and enhancement tasks.
Due to the difficulty in collecting real-world pairs, we suggest a solution that first pre-trains the model with synthetic paired data.
In particular, a temporally modulated recurrent network (TMRNet) and self-supervised adaptation method are proposed.
arXiv Detail & Related papers (2024-01-01T14:14:35Z)
- DIAR: Deep Image Alignment and Reconstruction using Swin Transformers [3.1000291317724993]
We create a dataset of images with perspective distortions and corresponding ground-truth homographies as labels.
We use our dataset to train Swin transformer models to analyze sequential image data.
arXiv Detail & Related papers (2023-10-17T21:59:45Z)
- iEdit: Localised Text-guided Image Editing with Weak Supervision [53.082196061014734]
We propose a novel learning method for text-guided image editing.
It generates images conditioned on a source image and a textual edit prompt.
It shows favourable results against its counterparts in terms of image fidelity and CLIP alignment score, as well as qualitatively when editing both generated and real images.
arXiv Detail & Related papers (2023-05-10T07:39:14Z)
- Diffusion-Based Scene Graph to Image Generation with Masked Contrastive Pre-Training [112.94542676251133]
We propose to learn scene graph embeddings by directly optimizing their alignment with images.
Specifically, we pre-train an encoder to extract both global and local information from scene graphs.
The resulting method, called SGDiff, allows for the semantic manipulation of generated images by modifying scene graph nodes and connections.
arXiv Detail & Related papers (2022-11-21T01:11:19Z)
- Enhancing Low-Light Images in Real World via Cross-Image Disentanglement [58.754943762945864]
We propose a new low-light image enhancement dataset consisting of misaligned training images with real-world corruptions.
Our model achieves state-of-the-art performance on both the newly proposed dataset and other popular low-light datasets.
arXiv Detail & Related papers (2022-01-10T03:12:52Z)
- Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-06-29T12:53:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.