Neural Residual Diffusion Models for Deep Scalable Vision Generation
- URL: http://arxiv.org/abs/2406.13215v2
- Date: Sun, 21 Jul 2024 15:55:34 GMT
- Title: Neural Residual Diffusion Models for Deep Scalable Vision Generation
- Authors: Zhiyuan Ma, Liangliang Zhao, Biqing Qi, Bowen Zhou,
- Abstract summary: We propose a unified and massively scalable Neural Residual Diffusion Models framework (Neural-RDM)
The proposed neural residual models obtain state-of-the-art scores on image's and video's generative benchmarks.
- Score: 17.931568104324985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The most advanced diffusion models have recently adopted increasingly deep stacked networks (e.g., U-Net or Transformer) to promote the generative emergence capabilities of vision generation models similar to large language models (LLMs). However, progressively deeper stacked networks will intuitively cause numerical propagation errors and reduce noisy prediction capabilities on generative data, which hinders massively deep scalable training of vision generation models. In this paper, we first uncover the nature that neural networks being able to effectively perform generative denoising lies in the fact that the intrinsic residual unit has consistent dynamic property with the input signal's reverse diffusion process, thus supporting excellent generative abilities. Afterwards, we stand on the shoulders of two common types of deep stacked networks to propose a unified and massively scalable Neural Residual Diffusion Models framework (Neural-RDM for short), which is a simple yet meaningful change to the common architecture of deep generative networks by introducing a series of learnable gated residual parameters that conform to the generative dynamics. Experimental results on various generative tasks show that the proposed neural residual models obtain state-of-the-art scores on image's and video's generative benchmarks. Rigorous theoretical proofs and extensive experiments also demonstrate the advantages of this simple gated residual mechanism consistent with dynamic modeling in improving the fidelity and consistency of generated content and supporting large-scale scalable training. Code is available at https://github.com/Anonymous/Neural-RDM.
Related papers
- U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation [48.40120035775506]
Kolmogorov-Arnold Networks (KANs) reshape the neural network learning via the stack of non-linear learnable activation functions.
We investigate, modify and re-design the established U-Net pipeline by integrating the dedicated KAN layers on the tokenized intermediate representation, termed U-KAN.
We further delved into the potential of U-KAN as an alternative U-Net noise predictor in diffusion models, demonstrating its applicability in generating task-oriented model architectures.
arXiv Detail & Related papers (2024-06-05T04:13:03Z) - Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - Neural Network Parameter Diffusion [50.85251415173792]
Diffusion models have achieved remarkable success in image and video generation.
In this work, we demonstrate that diffusion models can also.
generate high-performing neural network parameters.
arXiv Detail & Related papers (2024-02-20T16:59:03Z) - DDMI: Domain-Agnostic Latent Diffusion Models for Synthesizing High-Quality Implicit Neural Representations [13.357094648241839]
Domain-agnostic Latent Diffusion Model for INRs generates adaptive positional embeddings instead of neural networks' weights.
We develop a decomposed-to-continuous space Variational AutoEncoder (D2C-VAE), which seamlessly connects discrete data and the continuous signal functions.
Experiments across four modalities, e.g., 2D images, 3D shapes, Neural Radiance Fields, and videos, with seven benchmark datasets, demonstrate the versatility of DDMI.
arXiv Detail & Related papers (2024-01-23T06:21:34Z) - Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective: to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural Network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z) - Do We Need an Encoder-Decoder to Model Dynamical Systems on Networks? [18.92828441607381]
We show that embeddings induce a model that fits observations well but simultaneously has incorrect dynamical behaviours.
We propose a simple embedding-free alternative based on parametrising two additive vector-field components.
arXiv Detail & Related papers (2023-05-20T12:41:47Z) - EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce a new class of physics-informed neural networks-EINN-crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models as well as the data-driven expressability afforded by AI models.
arXiv Detail & Related papers (2022-02-21T18:59:03Z) - Sparse Flows: Pruning Continuous-depth Models [107.98191032466544]
We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% less parameters compared to the original network, without loss of accuracy.
arXiv Detail & Related papers (2021-06-24T01:40:17Z) - Stochastic Recurrent Neural Network for Multistep Time Series
Forecasting [0.0]
We leverage advances in deep generative models and the concept of state space models to propose an adaptation of the recurrent neural network for time series forecasting.
Our model preserves the architectural workings of a recurrent neural network for which all relevant information is encapsulated in its hidden states, and this flexibility allows our model to be easily integrated into any deep architecture for sequential modelling.
arXiv Detail & Related papers (2021-04-26T01:43:43Z) - Sobolev training of thermodynamic-informed neural networks for smoothed
elasto-plasticity models with level set hardening [0.0]
We introduce a deep learning framework designed to train smoothed elastoplasticity models with interpretable components.
By recasting the yield function as an evolving level set, we introduce a machine learning approach to predict the solutions of the Hamilton-Jacobi equation.
arXiv Detail & Related papers (2020-10-15T22:43:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.