Convection-Diffusion Equation: A Theoretically Certified Framework for Neural Networks
- URL: http://arxiv.org/abs/2403.15726v1
- Date: Sat, 23 Mar 2024 05:26:36 GMT
- Title: Convection-Diffusion Equation: A Theoretically Certified Framework for Neural Networks
- Authors: Tangjun Wang, Chenglong Bao, Zuoqiang Shi,
- Abstract summary: We study the partial differential equation models of neural networks.
We show that this map can be formulated by a convection-diffusion equation.
We design a novel network structure, which incorporates diffusion mechanism into network architecture.
- Score: 14.01268607317875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study the partial differential equation models of neural networks. Neural network can be viewed as a map from a simple base model to a complicate function. Based on solid analysis, we show that this map can be formulated by a convection-diffusion equation. This theoretically certified framework gives mathematical foundation and more understanding of neural networks. Moreover, based on the convection-diffusion equation model, we design a novel network structure, which incorporates diffusion mechanism into network architecture. Extensive experiments on both benchmark datasets and real-world applications validate the performance of the proposed model.
Related papers
- Neural Network Parameter Diffusion [50.85251415173792]
Diffusion models have achieved remarkable success in image and video generation.
In this work, we demonstrate that diffusion models can also.
generate high-performing neural network parameters.
arXiv Detail & Related papers (2024-02-20T16:59:03Z) - Analyzing Neural Network-Based Generative Diffusion Models through Convex Optimization [45.72323731094864]
We present a theoretical framework to analyze two-layer neural network-based diffusion models.
We prove that training shallow neural networks for score prediction can be done by solving a single convex program.
Our results provide a precise characterization of what neural network-based diffusion models learn in non-asymptotic settings.
arXiv Detail & Related papers (2024-02-03T00:20:25Z) - An axiomatized PDE model of deep neural networks [12.82710074674]
Inspired by relation between deep neural network (DNN) and partial differential equations (PDEs), we study the general form of the PDE models of deep neural networks.
We prove that the evolution operator is actually determined by convection-diffusion equation.
We show that the convection-diffusion equation model improves the robustness and reduces the Rademacher complexity.
arXiv Detail & Related papers (2023-07-23T14:00:33Z) - Simple initialization and parametrization of sinusoidal networks via
their kernel bandwidth [92.25666446274188]
sinusoidal neural networks with activations have been proposed as an alternative to networks with traditional activation functions.
We first propose a simplified version of such sinusoidal neural networks, which allows both for easier practical implementation and simpler theoretical analysis.
We then analyze the behavior of these networks from the neural tangent kernel perspective and demonstrate that their kernel approximates a low-pass filter with an adjustable bandwidth.
arXiv Detail & Related papers (2022-11-26T07:41:48Z) - Quiver neural networks [5.076419064097734]
We develop a uniform theoretical approach towards the analysis of various neural network connectivity architectures.
Inspired by quiver representation theory in mathematics, this approach gives a compact way to capture elaborate data flows.
arXiv Detail & Related papers (2022-07-26T09:42:45Z) - Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design.
Thanks to their invertibility and the tractability of Jacobian, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
arXiv Detail & Related papers (2022-04-15T10:45:26Z) - Diffusion Mechanism in Residual Neural Network: Theory and Applications [12.573746641284849]
In many learning tasks with limited training samples, the diffusion connects the labeled and unlabeled data points.
We propose a novel diffusion residual network (Diff-ResNet) internally introduces diffusion into the architectures of neural networks.
Under the structured data assumption, it is proved that the proposed diffusion block can increase the distance-diameter ratio that improves the separability of inter-class points.
arXiv Detail & Related papers (2021-05-07T10:42:59Z) - Combining Differentiable PDE Solvers and Graph Neural Networks for Fluid
Flow Prediction [79.81193813215872]
We develop a hybrid (graph) neural network that combines a traditional graph convolutional network with an embedded differentiable fluid dynamics simulator inside the network itself.
We show that we can both generalize well to new situations and benefit from the substantial speedup of neural network CFD predictions.
arXiv Detail & Related papers (2020-07-08T21:23:19Z) - The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network
Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z) - Network Diffusions via Neural Mean-Field Dynamics [52.091487866968286]
We propose a novel learning framework for inference and estimation problems of diffusion on networks.
Our framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution of the node infection probabilities.
Our approach is versatile and robust to variations of the underlying diffusion network models.
arXiv Detail & Related papers (2020-06-16T18:45:20Z) - Learning Queuing Networks by Recurrent Neural Networks [0.0]
We propose a machine-learning approach to derive performance models from data.
We exploit a deterministic approximation of their average dynamics in terms of a compact system of ordinary differential equations.
This allows for an interpretable structure of the neural network, which can be trained from system measurements to yield a white-box parameterized model.
arXiv Detail & Related papers (2020-02-25T10:56:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.