Learning multi-scale local conditional probability models of images
- URL: http://arxiv.org/abs/2303.02984v1
- Date: Mon, 6 Mar 2023 09:23:14 GMT
- Title: Learning multi-scale local conditional probability models of images
- Authors: Zahra Kadkhodaie, Florentin Guth, Stéphane Mallat, and Eero P.
Simoncelli
- Abstract summary: Deep neural networks can learn powerful prior probability models for images, as evidenced by the high-quality generations obtained with recent score-based diffusion methods.
But the means by which these networks capture complex global statistical structure, apparently without suffering from the curse of dimensionality, remain a mystery.
We incorporate diffusion methods into a multi-scale decomposition, reducing dimensionality by assuming a stationary local Markov model for wavelet coefficients conditioned on coarser-scale coefficients.
- Score: 7.07848787073901
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep neural networks can learn powerful prior probability models for images,
as evidenced by the high-quality generations obtained with recent score-based
diffusion methods. But the means by which these networks capture complex global
statistical structure, apparently without suffering from the curse of
dimensionality, remain a mystery. To study this, we incorporate diffusion
methods into a multi-scale decomposition, reducing dimensionality by assuming a
stationary local Markov model for wavelet coefficients conditioned on
coarser-scale coefficients. We instantiate this model using convolutional
neural networks (CNNs) with local receptive fields, which enforce both the
stationarity and Markov properties. Global structures are captured using a CNN
with receptive fields covering the entire (but small) low-pass image. We test
this model on a dataset of face images, which are highly non-stationary and
contain large-scale geometric structures. Remarkably, denoising,
super-resolution, and image synthesis results all demonstrate that these
structures can be captured with significantly smaller conditioning
neighborhoods than required by a Markov model implemented in the pixel domain.
Our results show that score estimation for large complex images can be reduced
to low-dimensional Markov conditional models across scales, alleviating the
curse of dimensionality.
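The multi-scale decomposition central to the abstract can be illustrated with a minimal sketch. The block below uses a one-level 2-D Haar transform as a stand-in assumption (the paper does not specify the Haar basis, and its actual model uses CNN score estimators rather than this hand-written transform) to split an image into the coarse low-pass band and the three detail bands on which a local conditional Markov model would operate.

```python
import numpy as np

def haar_decompose(img):
    """One level of a 2-D Haar wavelet transform.

    Splits an image into a coarse low-pass band and three detail
    bands (horizontal, vertical, diagonal), each at half resolution.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 4.0   # coarse approximation
    lh  = (a - b + c - d) / 4.0   # horizontal detail
    hl  = (a + b - c - d) / 4.0   # vertical detail
    hh  = (a - b - c + d) / 4.0   # diagonal detail
    return low, (lh, hl, hh)

def haar_reconstruct(low, details):
    """Invert haar_decompose exactly (the Haar transform is orthogonal
    up to scaling, so reconstruction is a linear recombination)."""
    lh, hl, hh = details
    a = low + lh + hl + hh
    b = low - lh + hl - hh
    c = low + lh - hl - hh
    d = low - lh - hl + hh
    h, w = low.shape
    img = np.empty((2 * h, 2 * w))
    img[0::2, 0::2] = a
    img[0::2, 1::2] = b
    img[1::2, 0::2] = c
    img[1::2, 1::2] = d
    return img

# Demo: decompose a small image and verify perfect reconstruction.
img = np.arange(16, dtype=float).reshape(4, 4)
low, details = haar_decompose(img)
err = np.max(np.abs(haar_reconstruct(low, details) - img))
```

In the paper's framework, a CNN with a small receptive field would model the conditional score of the detail bands given a local neighborhood of `low`, and the recursion would repeat until the low-pass image is small enough for a global model.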
Related papers
- The Convex Landscape of Neural Networks: Characterizing Global Optima
and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are widely used in machine learning.
In this paper we examine the use of convex neural recovery models.
We show that the stationary points of the non-convex objective can be characterized as the global optima of a subsampled convex program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z) - Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z) - Deep Networks as Denoising Algorithms: Sample-Efficient Learning of
Diffusion Models in High-Dimensional Graphical Models [22.353510613540564]
We investigate the approximation efficiency of score functions by deep neural networks in generative modeling.
We observe score functions can often be well-approximated in graphical models through variational inference denoising algorithms.
We provide an efficient sample complexity bound for diffusion-based generative modeling when the score function is learned by deep neural networks.
arXiv Detail & Related papers (2023-09-20T15:51:10Z) - Traditional Classification Neural Networks are Good Generators: They are
Competitive with DDPMs and GANs [104.72108627191041]
We show that conventional neural network classifiers can generate high-quality images comparable to state-of-the-art generative models.
We propose a mask-based reconstruction module that exploits semantic gradients to synthesize plausible images.
We show that our method is also applicable to text-to-image generation by leveraging image-text foundation models.
arXiv Detail & Related papers (2022-11-27T11:25:35Z) - PRANC: Pseudo RAndom Networks for Compacting deep models [22.793523211040682]
PRANC enables significant compaction of a deep model.
In this study, we employ PRANC to condense image classification models and compress images by compacting their associated implicit neural networks.
arXiv Detail & Related papers (2022-06-16T22:03:35Z) - Joint Global and Local Hierarchical Priors for Learned Image Compression [30.44884350320053]
Recently, learned image compression methods have shown superior performance compared to the traditional hand-crafted image codecs.
We propose a novel entropy model called Information Transformer (Informer) that exploits both local and global information in a content-dependent manner.
Our experiments demonstrate that Informer improves rate-distortion performance over the state-of-the-art methods on the Kodak and Tecnick datasets.
arXiv Detail & Related papers (2021-12-08T06:17:37Z) - Anomaly Detection on Attributed Networks via Contrastive Self-Supervised
Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z) - Stochastic Segmentation Networks: Modelling Spatially Correlated
Aleatoric Uncertainty [32.33791302617957]
We introduce stochastic segmentation networks (SSNs), an efficient probabilistic method for modelling aleatoric uncertainty with any image segmentation network architecture.
SSNs can generate multiple spatially coherent hypotheses for a single image.
We tested our method on the segmentation of real-world medical data, including lung nodules in 2D CT and brain tumours in 3D multimodal MRI scans.
arXiv Detail & Related papers (2020-06-10T18:06:41Z) - Node Embeddings and Exact Low-Rank Representations of Complex Networks [30.869784223109832]
Recent work by Seshadhri et al. suggests that low-dimensional node embeddings cannot capture the local structure arising in complex networks.
We show that the results of Seshadhri et al. are intimately connected to the model they use rather than the low-dimensional structure of complex networks.
arXiv Detail & Related papers (2020-06-10T01:09:03Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs)
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z) - Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.