U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models
- URL: http://arxiv.org/abs/2404.18444v2
- Date: Wed, 1 May 2024 16:49:57 GMT
- Title: U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models
- Authors: Song Mei
- Abstract summary: We study certain generative hierarchical models, which are tree-structured graphical models extensively utilized in both language and image domains.
With their encoder-decoder structure, long skip connections, and pooling and up-sampling layers, we demonstrate how U-Nets can naturally implement the belief propagation denoising algorithm.
We also demonstrate that the conventional architecture of convolutional neural networks (ConvNets) is ideally suited for classification tasks within these models.
- Score: 21.212179660694098
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: U-Nets are among the most widely used architectures in computer vision, renowned for their exceptional performance in applications such as image segmentation, denoising, and diffusion modeling. However, a theoretical explanation of the U-Net architecture design has not yet been fully established. This paper introduces a novel interpretation of the U-Net architecture by studying certain generative hierarchical models, which are tree-structured graphical models extensively utilized in both language and image domains. With their encoder-decoder structure, long skip connections, and pooling and up-sampling layers, we demonstrate how U-Nets can naturally implement the belief propagation denoising algorithm in such generative hierarchical models, thereby efficiently approximating the denoising functions. This leads to an efficient sample complexity bound for learning the denoising function using U-Nets within these models. Additionally, we discuss the broader implications of these findings for diffusion models in generative hierarchical models. We also demonstrate that the conventional architecture of convolutional neural networks (ConvNets) is ideally suited for classification tasks within these models. This offers a unified view of the roles of ConvNets and U-Nets, highlighting the versatility of generative hierarchical models in modeling complex data distributions across language and image domains.
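To make the encoder-decoder correspondence concrete, here is a minimal PyTorch sketch, an illustration under assumed layer sizes rather than the paper's construction, in which the pooling path plays the role of the upward (leaves-to-root) belief-propagation messages, the up-sampling path plays the role of the downward messages, and the long skip connections carry the per-level upward messages that the downward pass is combined with:
```python
# Minimal U-Net sketch (PyTorch). Encoder pooling ~ upward BP messages,
# decoder up-sampling ~ downward BP messages, skip connections ~ the
# per-level fusion of the two passes. All sizes are illustrative.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(ch, 2 * ch, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)                             # coarsen: one tree level up
        self.mid = nn.Sequential(nn.Conv2d(2 * ch, 2 * ch, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(2 * ch, ch, 2, stride=2)   # refine: one level down
        self.dec1 = nn.Sequential(nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU())
        self.out = nn.Conv2d(ch, 1, 1)

    def forward(self, x):
        u1 = self.enc1(x)                           # upward message, finest level
        u2 = self.mid(self.pool(self.enc2(u1)))     # upward message, coarse level
        d1 = self.up(u2)                            # downward message back to fine level
        # skip connection: fuse upward and downward messages, as in the
        # two-pass belief-propagation update at each tree node
        return self.out(self.dec1(torch.cat([u1, d1], dim=1)))

denoiser = TinyUNet()
noisy = torch.randn(1, 1, 32, 32)
print(denoiser(noisy).shape)  # torch.Size([1, 1, 32, 32])
```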
Related papers
- SENetV2: Aggregated dense layer for channelwise and global representations [0.0]
We introduce a novel aggregated multilayer perceptron, a multi-branch dense layer, within the Squeeze residual module.
This fusion enhances the network's ability to capture channel-wise patterns and incorporate global knowledge.
We conduct extensive experiments on benchmark datasets to validate the model and compare it with established architectures.
arXiv Detail & Related papers (2023-11-17T14:10:57Z)
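As a rough sketch of the aggregated multi-branch excitation described above (not the authors' SENetV2 code; branch count, reduction ratio, and fusion by concatenation are all assumptions):
```python
# Hedged sketch of a squeeze-and-excitation block whose excitation MLP is
# replaced by several parallel dense branches whose outputs are aggregated.
import torch
import torch.nn as nn

class MultiBranchSE(nn.Module):
    def __init__(self, channels, branches=4, reduction=16):
        super().__init__()
        hidden = max(channels // reduction, 4)
        # several parallel "excitation" MLP branches instead of a single one
        self.branches = nn.ModuleList(
            nn.Sequential(nn.Linear(channels, hidden), nn.ReLU()) for _ in range(branches)
        )
        self.excite = nn.Sequential(nn.Linear(branches * hidden, channels), nn.Sigmoid())

    def forward(self, x):                                   # x: (N, C, H, W)
        s = x.mean(dim=(2, 3))                              # squeeze: global average pool
        z = torch.cat([b(s) for b in self.branches], dim=1) # aggregate branch outputs
        w = self.excite(z)                                  # per-channel gates in (0, 1)
        return x * w.unsqueeze(-1).unsqueeze(-1)            # recalibrate channels

block = MultiBranchSE(64)
print(block(torch.randn(2, 64, 8, 8)).shape)  # torch.Size([2, 64, 8, 8])
```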
- Deep Networks as Denoising Algorithms: Sample-Efficient Learning of Diffusion Models in High-Dimensional Graphical Models [22.353510613540564]
We investigate how efficiently deep neural networks can approximate score functions in generative modeling.
We observe that score functions can often be well approximated in graphical models through variational inference denoising algorithms.
We provide an efficient sample complexity bound for diffusion-based generative modeling when the score function is learned by deep neural networks.
arXiv Detail & Related papers (2023-09-20T15:51:10Z)
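The denoiser-to-score connection invoked above can be made concrete with Tweedie's formula: for y = x + σε with Gaussian ε, the score satisfies ∇_y log p(y) = (E[x|y] − y)/σ², so a network trained to regress the posterior mean yields a plug-in score estimate. A minimal sketch with a placeholder network and placeholder data:
```python
# Minimal denoising-score-matching sketch (assumed setup, not the paper's).
# A network trained to denoise y = x + sigma * eps gives a score estimate
# via Tweedie's formula: score(y) ~ (denoiser(y) - y) / sigma**2.
import torch
import torch.nn as nn

sigma = 0.5
net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 8))  # toy denoiser
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(200):
    x = torch.randn(128, 8)              # placeholder "clean" data
    y = x + sigma * torch.randn_like(x)  # noisy observation
    loss = ((net(y) - x) ** 2).mean()    # regression on the posterior mean E[x | y]
    opt.zero_grad(); loss.backward(); opt.step()

y = torch.randn(1, 8)
score = (net(y) - y) / sigma**2          # plug-in score estimate at noise level sigma
```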
- A Unified Framework for U-Net Design and Analysis [32.48954087522557]
We provide a framework for designing and analysing general U-Net architectures.
We propose Multi-ResNets, U-Nets with a simplified, wavelet-based encoder without learnable parameters.
In diffusion models, our framework enables us to identify that high-frequency information is dominated by noise exponentially faster than low-frequency information.
arXiv Detail & Related papers (2023-05-31T08:07:44Z)
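A parameter-free, wavelet-based encoder of the kind used in the Multi-ResNets above can be sketched with a single Haar decomposition step; this is an assumed minimal version, not the paper's implementation:
```python
# One level of a parameter-free Haar wavelet encoder (numpy sketch).
# The average sub-band plays the role of the pooled encoder feature; the
# detail sub-bands hold what a learnable encoder would otherwise extract.
import numpy as np

def haar_step_2d(x):
    """Split an (H, W) image, H and W even, into 4 half-resolution sub-bands."""
    a = (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2]) / 4  # average
    h = (x[0::2, 0::2] - x[0::2, 1::2] + x[1::2, 0::2] - x[1::2, 1::2]) / 4  # horizontal detail
    v = (x[0::2, 0::2] + x[0::2, 1::2] - x[1::2, 0::2] - x[1::2, 1::2]) / 4  # vertical detail
    d = (x[0::2, 0::2] - x[0::2, 1::2] - x[1::2, 0::2] + x[1::2, 1::2]) / 4  # diagonal detail
    return a, h, v, d

img = np.random.rand(32, 32)
avg, *details = haar_step_2d(img)
print(avg.shape)  # (16, 16): one encoder level, no learnable parameters
```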
- Language Models are General-Purpose Interfaces [109.45478241369655]
We propose to use language models as a general-purpose interface to various foundation models.
A collection of pretrained encoders perceives diverse modalities (such as vision and language).
We propose a semi-causal language modeling objective to jointly pretrain the interface and the modular encoders.
arXiv Detail & Related papers (2022-06-13T17:34:22Z)
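A toy version of this interface wiring (an assumption for illustration, not the paper's implementation) prepends a frozen modality encoder's outputs to a causal language model and lets the prefix attend bidirectionally, which is the "semi-causal" part:
```python
# Toy sketch: a language model as an interface over a modality encoder.
# Encoder outputs form a bidirectional prefix; text tokens remain causal.
import torch
import torch.nn as nn

dim = 32
vision_encoder = nn.Linear(64, dim)            # placeholder pretrained encoder
lm = nn.TransformerEncoder(                    # LM body; causality via the mask below
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True), num_layers=2
)
token_emb = nn.Embedding(1000, dim)

image_feats = torch.randn(1, 5, 64)                  # 5 image patches
prefix = vision_encoder(image_feats)                 # (1, 5, dim): modality prefix
tokens = token_emb(torch.randint(0, 1000, (1, 7)))   # 7 text tokens
seq = torch.cat([prefix, tokens], dim=1)             # (1, 12, dim)
mask = nn.Transformer.generate_square_subsequent_mask(12)
mask[:5, :5] = 0.0   # semi-causal: the encoder prefix attends bidirectionally
out = lm(seq, mask=mask)
print(out.shape)  # torch.Size([1, 12, 32])
```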
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
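A toy rendering of the routing idea above uses soft routing weights over a small set of MLP modules; the real Neural Interpreters use attention-based routing, so the gate design and module form here are assumptions:
```python
# Toy sketch of learned routing over a set of function modules: each input
# is processed by a soft, end-to-end-learned mixture of modules.
import torch
import torch.nn as nn

class SoftRouter(nn.Module):
    def __init__(self, dim=16, n_modules=4):
        super().__init__()
        self.modules_list = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(n_modules)
        )
        self.gate = nn.Linear(dim, n_modules)        # learned routing scores

    def forward(self, x):                            # x: (N, dim)
        w = torch.softmax(self.gate(x), dim=-1)      # (N, n_modules) routing weights
        outs = torch.stack([m(x) for m in self.modules_list], dim=1)  # (N, M, dim)
        return (w.unsqueeze(-1) * outs).sum(dim=1)   # weighted mixture of modules

layer = SoftRouter()
print(layer(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```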
- Sparse Flows: Pruning Continuous-depth Models [107.98191032466544]
We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% fewer parameters than the original network, without loss of accuracy.
arXiv Detail & Related papers (2021-06-24T01:40:17Z)
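A minimal magnitude-pruning sketch for the drift network of a neural ODE (an assumed generic procedure, not the Sparse Flows algorithm itself):
```python
# Zero out the smallest-magnitude weights of an ODE drift network dx/dt = f(x).
import torch
import torch.nn as nn

drift = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))

def prune_by_magnitude(model, sparsity=0.9):
    """Keep only the largest-magnitude (1 - sparsity) fraction of weights."""
    for p in model.parameters():
        if p.dim() > 1:                                   # prune weight matrices only
            k = int(sparsity * p.numel())
            if k > 0:
                thresh = p.abs().flatten().kthvalue(k).values
                p.data.mul_((p.abs() > thresh).float())   # zero the small weights

prune_by_magnitude(drift, sparsity=0.98)                  # ~98% fewer active parameters
nonzero = sum((p != 0).sum().item() for p in drift.parameters() if p.dim() > 1)
print(nonzero)
```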
- Polynomial Networks in Deep Classifiers [55.90321402256631]
We cast the study of deep neural networks under a unifying framework.
Our framework provides insights on the inductive biases of each model.
The efficacy of the proposed models is evaluated on standard image and audio classification benchmarks.
arXiv Detail & Related papers (2021-04-16T06:41:20Z)
- Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling [79.15521784128102]
We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs).
In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way.
We show that augmenting the decoder of a hierarchical VAE by spatial dependency layers considerably improves density estimation.
arXiv Detail & Related papers (2021-03-16T07:01:08Z)
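One simple way to compute features "in a spatially coherent way" is a learned recurrent sweep in which each position depends on its top and left neighbours; this toy form is an assumption for illustration, not the SDN layer itself:
```python
# Toy spatially coherent feature update: a raster-order recurrent sweep.
import torch
import torch.nn as nn

class SpatialSweep(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.mix = nn.Linear(3 * ch, ch)   # combines input, top, and left features

    def forward(self, f):                  # f: (H, W, C)
        h, w, c = f.shape
        out = torch.zeros_like(f)
        for i in range(h):
            for j in range(w):
                top = out[i - 1, j] if i > 0 else torch.zeros(c)
                left = out[i, j - 1] if j > 0 else torch.zeros(c)
                out[i, j] = torch.tanh(self.mix(torch.cat([f[i, j], top, left])))
        return out                         # each feature depends on its causal neighbours

layer = SpatialSweep(8)
print(layer(torch.randn(4, 4, 8)).shape)  # torch.Size([4, 4, 8])
```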
- CDLNet: Robust and Interpretable Denoising Through Deep Convolutional Dictionary Learning [6.6234935958112295]
Unrolled optimization networks offer an interpretable alternative to conventionally constructed deep neural networks.
We show that the proposed model outperforms state-of-the-art denoising models when scaled to a similar parameter count.
arXiv Detail & Related papers (2021-03-05T01:15:59Z)
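Unrolled dictionary-learning denoisers follow the ISTA template: each "layer" takes a gradient step on the sparse code and applies soft-thresholding. A hedged 1-D numpy sketch, with a fixed random dictionary standing in for CDLNet's learned convolutional one:
```python
# Unrolled ISTA denoiser (1-D sketch). D, the step size, and the threshold
# are fixed placeholders so the unrolled structure itself is visible.
import numpy as np

rng = np.random.default_rng(0)
D = rng.standard_normal((32, 64))
D /= np.linalg.norm(D, axis=0)             # dictionary with unit-norm atoms

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def unrolled_ista_denoise(y, iters=10, step=0.1, thresh=0.05):
    """Each iteration is one 'layer' of the unrolled network."""
    z = np.zeros(D.shape[1])
    for _ in range(iters):
        z = soft_threshold(z + step * D.T @ (y - D @ z), thresh)
    return D @ z                           # reconstruction = denoised signal

y = rng.standard_normal(32)                # placeholder noisy signal
print(unrolled_ista_denoise(y).shape)      # (32,)
```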
- Obtaining Faithful Interpretations from Compositional Neural Networks [72.41100663462191]
We evaluate the intermediate outputs of neural module networks (NMNs) on the NLVR2 and DROP datasets.
We find that the intermediate outputs differ from the expected output, illustrating that the network structure does not provide a faithful explanation of model behaviour.
arXiv Detail & Related papers (2020-05-02T06:50:35Z)
- Spectrally Consistent UNet for High Fidelity Image Transformations [5.494315657902533]
Convolutional Neural Networks (CNNs) are the current de-facto models used for many imaging tasks.
This work presents a method for assessing the structural biases of UNets and the effects these have on their outputs.
A new upsampling module is proposed, based on a novel use of the Guided Image Filter, that provides spectrally consistent outputs.
arXiv Detail & Related papers (2020-04-22T17:04:02Z)
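The Guided Image Filter mentioned above has a simple closed form; a minimal grayscale version is sketched below (the window radius and regularizer are assumed values, and the paper embeds this primitive in an upsampling module rather than using it standalone):
```python
# Minimal grayscale guided image filter (numpy/scipy sketch): the output is
# constrained to be locally a linear function of the guide image.
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(guide, src, r=4, eps=1e-3):
    size = 2 * r + 1
    mean_g = uniform_filter(guide, size)
    mean_s = uniform_filter(src, size)
    cov_gs = uniform_filter(guide * src, size) - mean_g * mean_s
    var_g = uniform_filter(guide * guide, size) - mean_g * mean_g
    a = cov_gs / (var_g + eps)             # local linear coefficients
    b = mean_s - a * mean_g
    return uniform_filter(a, size) * guide + uniform_filter(b, size)

guide = np.random.rand(64, 64)             # e.g. a high-resolution feature map
src = np.random.rand(64, 64)               # e.g. a naively up-sampled low-res map
print(guided_filter(guide, src).shape)     # (64, 64)
```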