KO: Kinetics-inspired Neural Optimizer with PDE Simulation Approaches
- URL: http://arxiv.org/abs/2505.14777v1
- Date: Tue, 20 May 2025 18:00:01 GMT
- Title: KO: Kinetics-inspired Neural Optimizer with PDE Simulation Approaches
- Authors: Mingquan Feng, Yixin Huang, Yifan Fu, Shaobo Wang, Junchi Yan
- Abstract summary: This paper introduces KO, a novel neural optimizer inspired by kinetic theory and partial differential equation (PDE) simulations. We reimagine the dynamics of network parameters as the evolution of a particle system governed by kinetic principles. This physics-driven approach inherently promotes parameter diversity during optimization, mitigating the phenomenon of parameter condensation.
- Score: 45.173398806932376
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The design of optimization algorithms for neural networks remains a critical challenge, with most existing methods relying on heuristic adaptations of gradient-based approaches. This paper introduces KO (Kinetics-inspired Optimizer), a novel neural optimizer inspired by kinetic theory and partial differential equation (PDE) simulations. We reimagine the training dynamics of network parameters as the evolution of a particle system governed by kinetic principles, where parameter updates are simulated via a numerical scheme for the Boltzmann transport equation (BTE) that models stochastic particle collisions. This physics-driven approach inherently promotes parameter diversity during optimization, mitigating the phenomenon of parameter condensation, i.e., the collapse of network parameters into low-dimensional subspaces, through mechanisms analogous to thermal diffusion in physical systems. We analyze this property, establishing both a mathematical proof and a physical interpretation. Extensive experiments on image classification (CIFAR-10/100, ImageNet) and text classification (IMDB, Snips) tasks demonstrate that KO consistently outperforms baseline optimizers (e.g., Adam, SGD), achieving accuracy improvements at comparable computational cost.
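To make the idea tangible, here is a rough PyTorch sketch of what a collision-augmented SGD step could look like. This is NOT the authors' released KO implementation: the random pairing scheme, the collision term, and the collision_rate hyperparameter are illustrative stand-ins for the paper's BTE-based numerical scheme.

```python
import torch


class KineticSGD(torch.optim.Optimizer):
    """Illustrative sketch only, not the paper's KO optimizer: plain SGD
    plus a collision-like term that nudges randomly paired parameter
    entries apart, a crude stand-in for the BTE collision operator and
    the thermal-diffusion mechanism the abstract describes."""

    def __init__(self, params, lr=1e-2, collision_rate=1e-3):
        defaults = dict(lr=lr, collision_rate=collision_rate)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():
                loss = closure()
        for group in self.param_groups:
            lr, cr = group["lr"], group["collision_rate"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                p.add_(p.grad, alpha=-lr)  # ordinary gradient descent step
                flat = p.view(-1)
                perm = torch.randperm(flat.numel(), device=flat.device)
                # push randomly paired entries slightly apart, promoting
                # parameter diversity (i.e., counteracting condensation)
                flat.add_(flat - flat[perm], alpha=cr)
        return loss
```

A hypothetical drop-in usage would be `opt = KineticSGD(model.parameters(), lr=1e-2)`, after which the training loop proceeds as with any other PyTorch optimizer.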
Related papers
- Self-Supervised Coarsening of Unstructured Grid with Automatic Differentiation [55.88862563823878]
In this work, we present an original algorithm to coarsen an unstructured grid based on the concepts of differentiable physics. We demonstrate the performance of the algorithm on two PDEs: a linear equation governing slightly compressible fluid flow in porous media, and the wave equation. Our results show that in the considered scenarios, we reduced the number of grid points by up to a factor of 10 while preserving the dynamics of the modeled variable at the points of interest.
arXiv Detail & Related papers (2025-07-24T11:02:13Z)
- Structure and asymptotic preserving deep neural surrogates for uncertainty quantification in multiscale kinetic equations [5.181697052513637]
The high dimensionality of parametrized kinetic equations poses computational challenges for uncertainty quantification (UQ). Traditional Monte Carlo (MC) sampling methods suffer from slow convergence and high variance, which become increasingly severe as the dimensionality of the parameter space grows. We introduce surrogate models based on structure and asymptotic preserving neural networks (SAPNNs). SAPNNs are specifically designed to satisfy key physical properties, including positivity, conservation laws, entropy dissipation, and asymptotic parameter limits.
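Some of the structural constraints named above can be hard-wired into a surrogate's output layer. Below is a minimal PyTorch sketch, assuming a discrete-velocity distribution with known total mass; the layer name and interface are illustrative, not from the paper, and it enforces only positivity and mass conservation.

```python
import torch
import torch.nn as nn


class PositiveConservingHead(nn.Module):
    """Hypothetical output head enforcing two structural properties:
    positivity of the predicted distribution and conservation of a
    known total mass over the discrete velocity bins."""

    def __init__(self, hidden_dim, n_velocity_bins):
        super().__init__()
        self.linear = nn.Linear(hidden_dim, n_velocity_bins)

    def forward(self, h, total_mass):
        f = nn.functional.softplus(self.linear(h))  # positivity: f >= 0
        # rescale so the discrete velocity sum matches the known mass
        return f * (total_mass / f.sum(dim=-1, keepdim=True))
```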
arXiv Detail & Related papers (2025-06-12T12:20:53Z)
- KITINet: Kinetics Theory Inspired Network Architectures with PDE Simulation Approaches [43.872190335490515]
This paper introduces KITINet, a novel architecture that reinterprets feature propagation through the lens of non-equilibrium particle dynamics. At its core, we propose a residual module that models the feature update as the evolution of a particle system. This formulation mimics particle collisions and energy exchange, enabling adaptive feature refinement via physics-informed interactions.
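A minimal sketch of such a residual block follows (not the authors' code; the transport branch, the pairing scheme, and collision_rate are assumptions):

```python
import torch
import torch.nn as nn


class CollisionResidualBlock(nn.Module):
    """Illustrative residual block in the spirit of KITINet: the usual
    residual branch acts as a transport term, plus a stochastic
    'collision' term that relaxes randomly paired features toward
    their pairwise mean, mimicking binary particle collisions."""

    def __init__(self, dim, collision_rate=0.1):
        super().__init__()
        self.transport = nn.Sequential(
            nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )
        self.collision_rate = collision_rate

    def forward(self, x):
        drift = self.transport(x)
        if self.training:
            # pair samples via a random permutation; move each member
            # of a pair halfway toward the other (energy exchange)
            perm = torch.randperm(x.size(0), device=x.device)
            collision = 0.5 * (x[perm] - x)
        else:
            collision = torch.zeros_like(x)
        return x + drift + self.collision_rate * collision
```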
arXiv Detail & Related papers (2025-05-23T13:58:29Z)
- Gradual Optimization Learning for Conformational Energy Minimization [69.36925478047682]
The Gradual Optimization Learning Framework (GOLF) for energy minimization with neural networks significantly reduces the amount of additional data required.
Our results demonstrate that the neural network trained with GOLF performs on par with the oracle on a benchmark of diverse drug-like molecules.
arXiv Detail & Related papers (2023-11-05T11:48:08Z)
- An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
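As a purely illustrative reading of that pipeline, a proximal-gradient-style fixed-point iteration might look as follows; the function names, the A/At callables, and the stopping rule are assumptions, not the paper's exact scheme.

```python
import torch


def deq_deconvolve(y, A, At, prox_net, n_iter=50, step=0.5, tol=1e-4):
    """Sketch of a Deep-Equilibrium-style solver: iterate
    x <- prox_net(x - step * At(A(x) - y)) toward a fixed point.
    A/At are the blur operator and its adjoint (assumed given);
    prox_net is the learnable regularizer acting as a proximal map."""
    x = y.clone()
    for _ in range(n_iter):
        x_next = prox_net(x - step * At(A(x) - y))
        if (x_next - x).norm() <= tol * (x.norm() + 1e-12):
            return x_next  # approximate equilibrium reached
        x = x_next
    return x
```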
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
- NAG-GS: Semi-Implicit, Accelerated and Robust Stochastic Optimizer [45.47667026025716]
We propose a novel, robust and accelerated iteration that relies on two key elements.
The convergence and stability of the obtained method, referred to as NAG-GS, are first studied extensively.
We show that NAG-GS is competitive with state-of-the-art methods such as momentum SGD with weight decay and AdamW for the training of machine learning models.
arXiv Detail & Related papers (2022-09-29T16:54:53Z)
- Neural Operator with Regularity Structure for Modeling Dynamics Driven by SPDEs [70.51212431290611]
Stochastic partial differential equations (SPDEs) are significant tools for modeling dynamics in many areas, including atmospheric sciences and physics.
We propose the Neural Operator with Regularity Structure (NORS), which incorporates feature vectors derived from the theory of regularity structures for modeling dynamics driven by SPDEs.
We conduct experiments on various SPDEs, including the dynamic Phi^4_1 model and the 2d Navier-Stokes equation.
arXiv Detail & Related papers (2022-04-13T08:53:41Z)
- Equivariant vector field network for many-body system modeling [65.22203086172019]
The Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
- Physics informed machine learning with Smoothed Particle Hydrodynamics: Hierarchy of reduced Lagrangian models of turbulence [0.6542219246821327]
This manuscript develops a hierarchy of parameterized reduced Lagrangian models for turbulent flows.
It investigates the effects of enforcing physical structure through Smoothed Particle Hydrodynamics (SPH) versus relying on neural networks (NNs) as universal function approximators.
arXiv Detail & Related papers (2021-10-25T22:57:40Z)
- Physics-Informed Neural Network Method for Solving One-Dimensional Advection Equation Using PyTorch [0.0]
The PINN approach allows training neural networks while respecting the PDEs as a strong constraint in the optimization. In standard small-scale circulation simulations, it is shown that the conventional approach incorporates a pseudo-diffusive effect that is almost as large as the effect of the turbulent diffusion model. Of all the schemes tested, only the PINN approximation accurately predicted the outcome.
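To make the setup concrete, here is a minimal PINN sketch in PyTorch for the 1D advection equation u_t + c * u_x = 0. It is not the paper's code: it uses the common residual-penalty formulation, and the initial profile, domain, and network size are assumptions.

```python
import torch

c = 1.0  # advection speed (assumed)
net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def pde_residual(x, t):
    """Residual u_t + c * u_x at collocation points via autograd."""
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = net(torch.stack([x, t], dim=-1)).squeeze(-1)
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    return u_t + c * u_x

for _ in range(2000):
    x, t = torch.rand(256), torch.rand(256)  # interior collocation points
    x0 = torch.rand(256)                     # initial-condition points
    u0_pred = net(torch.stack([x0, torch.zeros_like(x0)], dim=-1)).squeeze(-1)
    u0_true = torch.sin(2 * torch.pi * x0)   # assumed initial profile
    loss = pde_residual(x, t).pow(2).mean() + (u0_pred - u0_true).pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```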
arXiv Detail & Related papers (2021-03-15T05:39:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.