PDE+: Enhancing Generalization via PDE with Adaptive Distributional
Diffusion
- URL: http://arxiv.org/abs/2305.15835v2
- Date: Fri, 15 Dec 2023 05:46:52 GMT
- Title: PDE+: Enhancing Generalization via PDE with Adaptive Distributional
Diffusion
- Authors: Yige Yuan, Bingbing Xu, Bo Lin, Liang Hou, Fei Sun, Huawei Shen, Xueqi
Cheng
- Abstract summary: The generalization of neural networks is a central challenge in machine learning.
We propose to enhance it directly through the underlying function of neural networks, rather than focusing on adjusting input data.
We put this theoretical framework into practice as $\textbf{PDE+}$ ($\textbf{PDE}$ with $\textbf{A}$daptive $\textbf{D}$istributional $\textbf{D}$iffusion).
- Score: 66.95761172711073
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The generalization of neural networks is a central challenge in machine
learning, especially concerning the performance under distributions that differ
from the training ones. Current methods, mainly based on data-driven paradigms
such as data augmentation, adversarial training, and noise injection, may
encounter limited generalization due to model non-smoothness. In this paper, we
propose to investigate generalization from a Partial Differential Equation
(PDE) perspective, aiming to enhance it directly through the underlying
function of neural networks, rather than focusing on adjusting input data.
Specifically, we first establish the connection between neural network
generalization and the smoothness of the solution to a specific PDE, namely
the "transport equation". Building upon this, we propose a general framework
that introduces adaptive distributional diffusion into the transport equation to enhance
the smoothness of its solution, thereby improving generalization. In the
context of neural networks, we put this theoretical framework into practice as
$\textbf{PDE+}$ ($\textbf{PDE}$ with $\textbf{A}$daptive
$\textbf{D}$istributional $\textbf{D}$iffusion) which diffuses each sample into
a distribution covering semantically similar inputs. This enables better
coverage of potentially unobserved distributions in training, thus improving
generalization beyond merely data-driven methods. The effectiveness of PDE+ is
validated through extensive experimental settings, demonstrating its superior
performance compared to SOTA methods.
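To make the core idea concrete, below is a minimal PyTorch sketch of adaptive distributional diffusion. It is not the authors' released implementation: the names (`AdaptiveDiffusion`, `sigma_net`), the sigmoid bound on the noise scale, and the log-sigma regularizer are illustrative assumptions; only the underlying idea, diffusing each training sample into a learned per-sample Gaussian neighborhood, comes from the abstract.

```python
import torch
import torch.nn as nn

class AdaptiveDiffusion(nn.Module):
    """Hypothetical module: replaces each sample x by a draw from
    N(x, sigma(x)^2 I), with the per-sample scale sigma(x) learned
    end to end alongside the classifier."""
    def __init__(self, in_dim, max_sigma=0.5):
        super().__init__()
        self.sigma_net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, 1))
        self.max_sigma = max_sigma

    def forward(self, x):
        # Per-sample noise scale bounded in (0, max_sigma).
        sigma = torch.sigmoid(self.sigma_net(x)) * self.max_sigma
        return x + sigma * torch.randn_like(x), sigma

model = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
diffuser = AdaptiveDiffusion(in_dim=2)
opt = torch.optim.Adam(
    list(model.parameters()) + list(diffuser.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(128, 2)      # toy batch
y = (x[:, 0] > 0).long()     # toy labels

for step in range(100):
    x_tilde, sigma = diffuser(x)
    # Fit the network on the diffused neighborhood; the log-sigma penalty
    # discourages collapsing sigma to zero (one possible regularizer,
    # an assumption of this sketch).
    loss = loss_fn(model(x_tilde), y) - 1e-2 * sigma.log().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this toy setting, training on draws from $N(x, \sigma(x)^2 I)$ rather than on $x$ alone is what "covering semantically similar inputs" amounts to.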
Related papers
- EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification [10.334396596691048]
We propose EntAugment, a tuning-free and adaptive data augmentation (DA) framework.
It dynamically assesses and adjusts the augmentation magnitudes for each sample during training.
We also introduce a novel entropy regularization term, EntLoss, which complements the EntAugment approach.
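One plausible reading of that mechanism, sketched below: map each sample's prediction entropy to a per-sample augmentation magnitude, so confident (low-entropy) samples receive stronger augmentation. The direction and scaling here are assumptions for illustration, not EntAugment's published schedule.

```python
import torch
import torch.nn.functional as F

def augmentation_magnitude(logits, max_magnitude=1.0):
    # Normalized prediction entropy in [0, 1]; low entropy = confident.
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    max_entropy = torch.log(torch.tensor(float(logits.shape[-1])))
    confidence = 1.0 - entropy / max_entropy
    # Assumed mapping: more confident samples get stronger augmentation.
    return confidence * max_magnitude

logits = torch.randn(8, 10)               # toy model outputs for a batch
print(augmentation_magnitude(logits))     # one magnitude per sample, in [0, 1]
```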
arXiv Detail & Related papers (2024-09-10T07:42:47Z)
- DiffSG: A Generative Solver for Network Optimization with Diffusion Model [75.27274046562806]
Diffusion generative models can consider a broader range of solutions and exhibit stronger generalization by learning the underlying distribution of solutions.
We propose a new framework, which leverages intrinsic distribution learning of diffusion generative models to learn high-quality solutions.
arXiv Detail & Related papers (2024-08-13T07:56:21Z)
- DiffusionPDE: Generative PDE-Solving Under Partial Observation [10.87702379899977]
We introduce a general framework for solving partial differential equations (PDEs) using generative diffusion models.
We show that the learned generative priors lead to a versatile framework for accurately solving a wide range of PDEs under partial observation.
arXiv Detail & Related papers (2024-06-25T17:48:24Z)
- Amortizing intractable inference in diffusion models for vision, language, and control [89.65631572949702]
This paper studies amortized sampling of the posterior over data, $\mathbf{x} \sim p^{\rm post}(\mathbf{x}) \propto p(\mathbf{x}) r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or function $r(\mathbf{x})$.
We prove the correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from this posterior.
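The target distribution itself is easy to illustrate without the paper's training objective: the sketch below estimates posterior expectations under $p^{\rm post}(\mathbf{x}) \propto p(\mathbf{x}) r(\mathbf{x})$ by self-normalized importance sampling over prior draws. The paper instead trains a diffusion sampler with relative trajectory balance; all names below are hypothetical.

```python
import torch

def posterior_expectation(prior_sampler, r, f, n=10_000):
    """Self-normalized importance sampling estimate of E_{p^post}[f(x)],
    where p^post(x) is proportional to p(x) * r(x). This only illustrates
    the target distribution, not the paper's diffusion-based sampler."""
    x = prior_sampler(n)          # draws from the prior p(x)
    w = r(x)                      # unnormalized weights r(x) >= 0
    w = w / w.sum()
    return (w.unsqueeze(-1) * f(x)).sum(dim=0)

# Toy example: standard normal prior, constraint favoring x near 2.
prior = lambda n: torch.randn(n, 1)
r = lambda x: torch.exp(-0.5 * ((x - 2.0) ** 2).sum(dim=-1))
mean = posterior_expectation(prior, r, lambda x: x)
print(mean)   # ~1.0: product of N(0,1) and N(2,1) is a Gaussian with mean 1
```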
arXiv Detail & Related papers (2024-05-31T16:18:46Z)
- Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
- DSCom: A Data-Driven Self-Adaptive Community-Based Framework for Influence Maximization in Social Networks [3.97535858363999]
We reformulate the problem on the attributed network and leverage the node attributes to estimate the closeness between connected nodes.
Specifically, we propose a machine learning-based framework, named DSCom, to address this problem.
Compared to previous theoretical works, we carefully design empirical experiments with parameterized diffusion models based on real-world social networks.
arXiv Detail & Related papers (2023-11-18T14:03:43Z)
- Distributed Variational Inference for Online Supervised Learning [15.038649101409804]
This paper develops a scalable distributed probabilistic inference algorithm.
It applies to continuous variables, intractable posteriors and large-scale real-time data in sensor networks.
arXiv Detail & Related papers (2023-09-05T22:33:02Z)
- Re-parameterizing VAEs for stability [1.90365714903665]
We propose a theoretical approach to the numerical stability of training Variational AutoEncoders (VAEs).
Our work is motivated by recent studies empowering VAEs to reach state-of-the-art generative results on complex image datasets.
We show that by making small changes to the way the Normal distributions they rely on are parameterized, VAEs can be trained stably.
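A common stabilization in this spirit, sketched below, parameterizes the standard deviation through a softplus with a floor and a ceiling so it stays strictly positive and bounded. This is an assumption for illustration and not necessarily the authors' exact re-parameterization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StableGaussianHead(nn.Module):
    """Encoder head producing N(mu, sigma^2) with a bounded, strictly
    positive sigma, avoiding the exploding/vanishing variances that
    destabilize VAE training."""
    def __init__(self, hidden_dim, latent_dim, min_sigma=1e-4, max_sigma=5.0):
        super().__init__()
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.pre_sigma = nn.Linear(hidden_dim, latent_dim)
        self.min_sigma, self.max_sigma = min_sigma, max_sigma

    def forward(self, h):
        mu = self.mu(h)
        # softplus keeps sigma positive; the clamp bounds it away from
        # the numerically dangerous extremes.
        sigma = F.softplus(self.pre_sigma(h)).clamp(self.min_sigma, self.max_sigma)
        return mu, sigma

head = StableGaussianHead(hidden_dim=32, latent_dim=8)
mu, sigma = head(torch.randn(4, 32))
z = mu + sigma * torch.randn_like(sigma)   # reparameterization trick
```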
arXiv Detail & Related papers (2021-06-25T16:19:09Z)
- Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with problem data that is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
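For reference, the basic extra-gradient update that this line of work builds on is shown below, in its simplest centralized and deterministic form; the paper's contribution is a decentralized, stochastic variant with local steps, which this sketch does not attempt to reproduce.

```python
import torch

def extragradient_step(x, F, gamma):
    """One deterministic extra-gradient step for a variational inequality
    with operator F: extrapolate to a look-ahead point, then update using
    F evaluated at that point."""
    x_half = x - gamma * F(x)      # extrapolation (look-ahead) step
    return x - gamma * F(x_half)   # update with the look-ahead operator

# Toy bilinear saddle point min_u max_v u*v, i.e. VI operator F(u, v) = (v, -u).
# Plain gradient descent-ascent spirals outward on this problem, while
# extra-gradient converges to the solution (0, 0) for step sizes gamma < 1.
x = torch.tensor([1.0, 1.0])                    # (u, v)
F_op = lambda z: torch.stack((z[1], -z[0]))
for _ in range(100):
    x = extragradient_step(x, F_op, gamma=0.5)
print(x)   # close to (0, 0)
```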
arXiv Detail & Related papers (2021-06-15T17:45:51Z)
- Network Diffusions via Neural Mean-Field Dynamics [52.091487866968286]
We propose a novel learning framework for inference and estimation problems of diffusion on networks.
Our framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution of the node infection probabilities.
Our approach is versatile and robust to variations of the underlying diffusion network models.
arXiv Detail & Related papers (2020-06-16T18:45:20Z)