Bayesian Image Reconstruction using Deep Generative Models
- URL: http://arxiv.org/abs/2012.04567v3
- Date: Sun, 21 Feb 2021 21:44:29 GMT
- Title: Bayesian Image Reconstruction using Deep Generative Models
- Authors: Razvan V Marinescu, Daniel Moyer, Polina Golland
- Abstract summary: In this work, we leverage state-of-the-art (SOTA) generative models for building powerful image priors.
Our method, called Bayesian Reconstruction through Generative Models (BRGM), uses a single pre-trained generator model to solve different image restoration tasks.
- Score: 7.012708932320081
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning models are commonly trained end-to-end and in a supervised
setting, using paired (input, output) data. Classical examples include recent
super-resolution methods that train on pairs of (low-resolution,
high-resolution) images. However, these end-to-end approaches require
re-training every time there is a distribution shift in the inputs (e.g., night
images vs daylight) or relevant latent variables (e.g., camera blur or hand
motion). In this work, we leverage state-of-the-art (SOTA) generative models
(here StyleGAN2) for building powerful image priors, which enable application
of Bayes' theorem for many downstream reconstruction tasks. Our method, called
Bayesian Reconstruction through Generative Models (BRGM), uses a single
pre-trained generator model to solve different image restoration tasks, i.e.,
super-resolution and in-painting, by combining it with different forward
corruption models. We demonstrate BRGM on three large, yet diverse, datasets
that enable us to build powerful priors: (i) 60,000 images from the Flick Faces
High Quality dataset (ii) 240,000 chest X-rays from MIMIC III and (iii) a
combined collection of 5 brain MRI datasets with 7,329 scans. Across all three
datasets and without any dataset-specific hyperparameter tuning, our approach
yields state-of-the-art performance on super-resolution, particularly at
low-resolution levels, as well as on inpainting, compared to state-of-the-art
methods that are specific to each reconstruction task. Our source code and all
pre-trained models are available online:
https://razvanmarinescu.github.io/brgm/.
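The abstract sketches the core recipe: treat a frozen pre-trained generator G as an image prior, compose it with a known forward corruption model f, and recover the clean image by maximizing the posterior p(w | y) ∝ p(y | f(G(w))) p(w) over the latent w. Below is a minimal PyTorch sketch of that idea, assuming hypothetical `generator` and `forward_op` callables; it is an illustration only, not the authors' implementation (BRGM optimizes in StyleGAN2's extended latent space and adds perceptual terms).

```python
import torch

def map_reconstruct(generator, forward_op, y, latent_dim=512,
                    sigma=0.1, steps=500, lr=0.05):
    """Hypothetical MAP reconstruction with a generative prior:
    argmax_w  log p(y | f(G(w))) + log p(w).

    generator  -- frozen pre-trained G mapping latent w -> image
    forward_op -- known corruption model f (downsampling for
                  super-resolution, masking for inpainting)
    y          -- observed corrupted image
    """
    w = torch.zeros(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = generator(w)                          # candidate clean image
        # Gaussian likelihood -> squared-error data term
        data = ((forward_op(x) - y) ** 2).sum() / (2 * sigma ** 2)
        prior = 0.5 * (w ** 2).sum()              # standard-normal latent prior
        (data + prior).backward()
        opt.step()
    return generator(w).detach()

# e.g. 8x super-resolution:
#   forward_op = lambda x: torch.nn.functional.avg_pool2d(x, 8)
# e.g. inpainting with a binary mask m:
#   forward_op = lambda x: x * m
```

Swapping only `forward_op` while keeping the same generator is what lets one pre-trained model serve multiple restoration tasks.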
Related papers
- EnsIR: An Ensemble Algorithm for Image Restoration via Gaussian Mixture Models [70.60381055741391]
Image restoration is challenged by ill-posed problems, which cause deviations between single-model predictions and ground truths.
Ensemble learning aims to address these deviations by combining the predictions of multiple base models.
We employ an expectation-maximization (EM)-based algorithm to estimate ensemble weights for prediction candidates.
Our algorithm is model-agnostic and training-free, allowing seamless integration and enhancement of various pre-trained image restoration models.
arXiv Detail & Related papers (2024-10-30T12:16:35Z)
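The EM step can be made concrete with a short sketch. This is not EnsIR's published algorithm, just a generic EM update for the weights of a fixed-variance Gaussian mixture whose components sit on the base-model predictions; `preds`, `target`, and `sigma` are hypothetical inputs (e.g., from a small calibration set).

```python
import numpy as np

def em_ensemble_weights(preds, target, sigma=1.0, iters=50):
    """Illustrative EM for mixture weights over K base models.

    preds  -- (K, N) array: K base-model restorations of N pixels
    target -- (N,) reference pixels from a calibration set
    """
    K = preds.shape[0]
    w = np.full(K, 1.0 / K)                       # uniform initialization
    for _ in range(iters):
        # E-step: responsibility of model k for each pixel
        log_lik = -0.5 * ((preds - target) ** 2) / sigma ** 2
        resp = w[:, None] * np.exp(log_lik - log_lik.max(axis=0))
        resp /= resp.sum(axis=0, keepdims=True)
        # M-step: weights become the average responsibilities
        w = resp.mean(axis=1)
    return w

# Fused restoration: weighted average of the candidates
#   fused = em_ensemble_weights(preds, target) @ preds
```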
- SCube: Instant Large-Scale Scene Reconstruction using VoxSplats [55.383993296042526]
We present SCube, a novel method for reconstructing large-scale 3D scenes (geometry, appearance, and semantics) from a sparse set of posed images.
Our method encodes reconstructed scenes using a novel representation VoxSplat, which is a set of 3D Gaussians supported on a high-resolution sparse-voxel scaffold.
arXiv Detail & Related papers (2024-10-26T00:52:46Z)
- MVGamba: Unify 3D Content Generation as State Space Sequence Modeling [150.80564081817786]
We introduce MVGamba, a general and lightweight Gaussian reconstruction model featuring a multi-view Gaussian reconstructor.
With off-the-shelf multi-view diffusion models integrated, MVGamba unifies 3D generation tasks from a single image, sparse images, or text prompts.
Experiments demonstrate that MVGamba outperforms state-of-the-art baselines in all 3D content generation scenarios with only approximately $0.1\times$ of the model size.
arXiv Detail & Related papers (2024-06-10T15:26:48Z)
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z)
- Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling [23.164631160130092]
We extend the success of BERT-style pre-training, i.e., masked image modeling, to convolutional networks (convnets).
We treat unmasked pixels as sparse voxels of 3D point clouds and use sparse convolution to encode them.
This is the first use of sparse convolution for 2D masked modeling.
arXiv Detail & Related papers (2023-01-09T18:59:50Z)
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
- Locally Masked Convolution for Autoregressive Models [107.4635841204146]
LMConv is a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image.
We learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation.
arXiv Detail & Related papers (2020-06-22T17:59:07Z)
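A compact way to see what LMConv changes relative to a standard convolution is to write the convolution as unfold (im2col) followed by a matrix product, so a different mask can be applied to the kernel taps at every spatial location. The sketch below follows that idea under assumed stride-1, odd-kernel settings; it is a simplified illustration, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def locally_masked_conv2d(x, weight, mask):
    """Sketch of a locally masked convolution.

    x      -- (B, C_in, H, W) input
    weight -- (C_out, C_in, k, k) shared kernel, k odd
    mask   -- (B, C_in * k * k, H * W) 0/1 mask: one entry per kernel
              tap per location (a plain conv2d shares one mask globally)
    """
    B, C_in, H, W = x.shape
    C_out, _, k, _ = weight.shape
    patches = F.unfold(x, k, padding=k // 2)      # (B, C_in*k*k, H*W)
    patches = patches * mask                      # location-dependent masking
    out = weight.view(C_out, -1) @ patches        # (B, C_out, H*W)
    return out.view(B, C_out, H, W)
```

Varying `mask` across calls realizes different generation orders while the kernel weights stay shared, which is how an ensemble of distribution estimators can differ only in ordering.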
This list is automatically generated from the titles and abstracts of the papers on this site.