Decomposer: Semi-supervised Learning of Image Restoration and Image Decomposition
- URL: http://arxiv.org/abs/2311.16829v1
- Date: Tue, 28 Nov 2023 14:48:22 GMT
- Title: Decomposer: Semi-supervised Learning of Image Restoration and Image Decomposition
- Authors: Boris Meinardus, Mariusz Trzeciakiewicz, Tim Herzig, Monika Kwiatkowski, Simon Matern, Olaf Hellwich
- Abstract summary: We present a semi-supervised reconstruction model that decomposes distorted image sequences into their fundamental building blocks.
We use the SIDAR dataset that provides a large number of distorted image sequences.
Each distortion changes the original signal in different ways, e.g., additive or multiplicative noise.
- Score: 2.702990676892003
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Decomposer, a semi-supervised reconstruction model that decomposes
distorted image sequences into their fundamental building blocks - the original
image and the applied augmentations, i.e., shadow, light, and occlusions. To
solve this problem, we use the SIDAR dataset that provides a large number of
distorted image sequences: each sequence contains images with shadows,
lighting, and occlusions applied to an undistorted version. Each distortion
changes the original signal in different ways, e.g., additive or multiplicative
noise. We propose a transformer-based model to explicitly learn this
decomposition. The sequential model uses 3D Swin-Transformers for
spatio-temporal encoding and 3D U-Nets as prediction heads for individual parts
of the decomposition. We demonstrate that by separately pre-training our model
on weakly supervised pseudo labels, we can steer it to optimize for our
ambiguous problem definition and learn to differentiate between the different
image distortions.
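As a reading aid, here is a minimal PyTorch-style sketch of this setup. It is not the authors' code: the Conv3d stacks are simplified stand-ins for the paper's 3D Swin-Transformer encoder and 3D U-Net heads, the recomposition formula (multiplicative light, additive shadow, alpha-masked occluders) is an assumption based on the abstract's description of the distortions, and names like DecomposerSketch and recompose are ours.

```python
# Minimal sketch, not the authors' implementation. Conv3d stacks stand in
# for the 3D Swin-Transformer encoder and the 3D U-Net prediction heads;
# the composition formula below is assumed from the abstract.
import torch
import torch.nn as nn


class DecomposerSketch(nn.Module):
    def __init__(self, channels: int = 3, hidden: int = 32):
        super().__init__()
        # Stand-in for the 3D Swin-Transformer spatio-temporal encoder.
        self.encoder = nn.Sequential(
            nn.Conv3d(channels, hidden, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv3d(hidden, hidden, kernel_size=3, padding=1),
            nn.GELU(),
        )
        # Stand-ins for the 3D U-Net heads, one per part of the decomposition.
        # Clean image (shared across the sequence in the paper; per frame here).
        self.original_head = nn.Conv3d(hidden, channels, 3, padding=1)
        self.light_head = nn.Conv3d(hidden, channels, 3, padding=1)      # multiplicative
        self.shadow_head = nn.Conv3d(hidden, channels, 3, padding=1)     # additive
        self.occlusion_head = nn.Conv3d(hidden, channels + 1, 3, padding=1)  # RGB + alpha

    def forward(self, x):
        # x: (batch, channels, time, height, width), a distorted image sequence.
        feats = self.encoder(x)
        original = torch.sigmoid(self.original_head(feats))
        light = torch.sigmoid(self.light_head(feats))
        shadow = torch.tanh(self.shadow_head(feats))
        occ = self.occlusion_head(feats)
        return original, light, shadow, torch.sigmoid(occ[:, :-1]), torch.sigmoid(occ[:, -1:])


def recompose(original, light, shadow, occ_rgb, occ_mask):
    # Assumed forward model: multiplicative lighting, additive shadow,
    # then occluders pasted on top via an alpha mask.
    base = (original * light + shadow).clamp(0, 1)
    return occ_mask * occ_rgb + (1 - occ_mask) * base
```

Under this reading, training would minimize a reconstruction loss between recompose(*model(x)) and the input sequence, with the weakly supervised pseudo labels mentioned above pre-training the individual heads.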
Related papers
- Iteratively Refined Image Reconstruction with Learned Attentive Regularizers [14.93489065234423]
We propose a regularization scheme for image reconstruction that leverages the power of deep learning.
Our scheme is interpretable because it corresponds to the minimization of a series of convex problems.
We offer a promising balance between interpretability, theoretical guarantees, reliability, and performance.
arXiv Detail & Related papers (2024-07-09T07:22:48Z)
- Factorized Diffusion: Perceptual Illusions by Noise Decomposition [15.977340635967018]
We present a zero-shot method to control each individual component of the decomposition through diffusion model sampling.
For certain decompositions, our method recovers prior approaches to compositional generation and spatial control.
We show that we can extend our approach to generate hybrid images from real images.
arXiv Detail & Related papers (2024-04-17T17:59:59Z)
- 3DMiner: Discovering Shapes from Large-Scale Unannotated Image Datasets [34.610546020800236]
3DMiner is a pipeline for mining 3D shapes from challenging datasets.
Our method is capable of producing significantly better results than state-of-the-art unsupervised 3D reconstruction techniques.
We show how 3DMiner can be applied to in-the-wild data by reconstructing shapes present in images from the LAION-5B dataset.
arXiv Detail & Related papers (2023-10-29T23:08:19Z)
- Improved Cryo-EM Pose Estimation and 3D Classification through Latent-Space Disentanglement [14.973360669658561]
We propose a self-supervised variational autoencoder architecture called "HetACUMN" based on amortized inference.
Results on simulated datasets show that HetACUMN generated more accurate conformational classifications than other amortized or non-amortized methods.
arXiv Detail & Related papers (2023-08-09T13:41:30Z)
- Zero-1-to-3: Zero-shot One Image to 3D Object [30.455300183998247]
We introduce Zero-1-to-3, a framework for changing the camera viewpoint of an object given just a single RGB image.
Our conditional diffusion model uses a synthetic dataset to learn controls of the relative camera viewpoint.
Our method significantly outperforms state-of-the-art single-view 3D reconstruction and novel view synthesis models by leveraging Internet-scale pre-training.
arXiv Detail & Related papers (2023-03-20T17:59:50Z)
- Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z)
- Invertible Rescaling Network and Its Extensions [118.72015270085535]
In this work, we propose a novel invertible framework to model the bidirectional degradation and restoration from a new perspective.
We develop invertible models to generate valid degraded images and transform the distribution of lost contents.
Then restoration is made tractable by applying the inverse transformation on the generated degraded image together with a randomly-drawn latent variable; a toy sketch of this inversion mechanic follows the list below.
arXiv Detail & Related papers (2022-10-09T06:58:58Z)
- NeuralReshaper: Single-image Human-body Retouching with Deep Neural Networks [50.40798258968408]
We present NeuralReshaper, a novel method for semantic reshaping of human bodies in single images using deep generative networks.
Our approach follows a fit-then-reshape pipeline, which first fits a parametric 3D human model to a source human image.
To deal with the lack of paired training data, we introduce a novel self-supervised strategy to train our network.
arXiv Detail & Related papers (2022-03-20T09:02:13Z)
- Differentiable Rendering with Perturbed Optimizers [85.66675707599782]
Reasoning about 3D scenes from their 2D image projections is one of the core problems in computer vision.
Our work highlights the link between some well-known differentiable formulations and randomly smoothed renderings.
We apply our method to 3D scene reconstruction and demonstrate its advantages on the tasks of 6D pose estimation and 3D mesh reconstruction.
arXiv Detail & Related papers (2021-10-18T08:56:23Z)
- Shelf-Supervised Mesh Prediction in the Wild [54.01373263260449]
We propose a learning-based approach to infer the 3D shape and pose of an object from a single image.
We first infer a volumetric representation in a canonical frame, along with the camera pose.
The coarse volumetric prediction is then converted to a mesh-based representation, which is further refined in the predicted camera frame.
arXiv Detail & Related papers (2021-02-11T18:57:10Z)
- Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering [101.56891506498755]
Differentiable rendering has paved the way to training neural networks to perform "inverse graphics" tasks.
We show that our approach significantly outperforms state-of-the-art inverse graphics networks trained on existing datasets.
arXiv Detail & Related papers (2020-10-18T22:29:07Z)
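The inversion mechanic referenced in the Invertible Rescaling Network entry above can be made concrete with a toy sketch: an exactly invertible 2x downscale that splits an image into a coarse version y and a detail latent z, so that restoration is the literal inverse applied to y together with a latent (randomly drawn when the true one is unavailable). This shows only the mechanics, not IRN's model: IRN trains invertible coupling blocks so that z follows a case-agnostic normal distribution, which this sketch omits, so with a random z it reconstructs noisy detail. The function names degrade and restore are ours.

```python
# Toy illustration of invertible degradation/restoration (not IRN's model).
import torch
import torch.nn.functional as F


def degrade(x):
    # Exactly invertible 2x downscale: each 2x2 block becomes 4 channels.
    s = F.pixel_unshuffle(x, 2)                     # (B, 4C, H/2, W/2)
    y = s[:, 0::4]                                  # one sample per block: a crude LR image
    z = torch.cat([s[:, 1::4], s[:, 2::4], s[:, 3::4]], dim=1)  # the "lost" detail
    return y, z


def restore(y, z=None):
    # Inverse transform; drawing z at random mimics IRN's sampling step.
    b, c, h, w = y.shape
    if z is None:
        z = torch.randn(b, 3 * c, h, w)
    z1, z2, z3 = z.chunk(3, dim=1)
    # Re-interleave channels into pixel_shuffle's expected ordering.
    s = torch.stack([y, z1, z2, z3], dim=2).reshape(b, 4 * c, h, w)
    return F.pixel_shuffle(s, 2)


x = torch.rand(1, 3, 8, 8)
y, z = degrade(x)
assert torch.allclose(restore(y, z), x)             # exact inversion with the true latent
```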
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.