Learning via nonlinear conjugate gradients and depth-varying neural ODEs
- URL: http://arxiv.org/abs/2202.05766v1
- Date: Fri, 11 Feb 2022 17:00:48 GMT
- Title: Learning via nonlinear conjugate gradients and depth-varying neural ODEs
- Authors: George Baravdish, Gabriel Eilertsen, Rym Jaroudi, B. Tomas Johansson, Lukáš Malý and Jonas Unger
- Abstract summary: The inverse problem of supervised reconstruction of depth-variable parameters in a neural ordinary differential equation (NODE) is considered.
The proposed parameter reconstruction is done for a general first order differential equation by minimizing a cost functional.
The sensitivity problem can estimate changes in the network output under perturbation of the trained parameters.
- Score: 5.565364597145568
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The inverse problem of supervised reconstruction of depth-variable
(time-dependent) parameters in a neural ordinary differential equation (NODE)
is considered; this amounts to finding the weights of a residual network with
time-continuous layers. The NODE is treated as an isolated entity describing the
full network as opposed to earlier research, which embedded it between pre- and
post-appended layers trained by conventional methods. The proposed parameter
reconstruction is done for a general first order differential equation by
minimizing a cost functional covering a variety of loss functions and penalty
terms. A nonlinear conjugate gradient method (NCG) is derived for the
minimization. Mathematical properties are stated for the differential equation
and the cost functional. The adjoint problem needed for computing the gradient is
derived, together with a sensitivity problem that estimates changes in the
network output under perturbation of the trained parameters. To preserve
smoothness during the iterations, the Sobolev gradient is calculated and
incorporated. As a proof-of-concept, numerical results are included for a NODE
and two synthetic datasets, and compared with standard gradient approaches (not
based on NODEs). The results show that the proposed method works well for deep
learning with an infinite number of layers, and has built-in stability and
smoothness.
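A rough numerical sketch of the ingredients named in the abstract is given below. It is not the authors' implementation: it assumes a toy tanh vector field, a forward Euler discretization in depth, a terminal squared loss, a discrete adjoint recursion for the gradient, an H^1-in-depth (Sobolev) smoothing of that gradient, and a Polak-Ribière nonlinear conjugate gradient update with a crude step control. All dimensions, data, and hyperparameters are illustrative assumptions.

```python
# Sketch only: depth-varying NODE x'(t) = tanh(W(t) x + b(t)) trained by NCG
# with a Sobolev-smoothed adjoint gradient. Not the paper's code.
import numpy as np

rng = np.random.default_rng(0)
d, N, T = 2, 20, 1.0                  # state dim, Euler steps (depth), final time
dt = T / N

# Toy supervised pairs (x0 -> y): a fixed 90-degree rotation of the inputs.
X0 = rng.normal(size=(100, d))
R = np.array([[0.0, -1.0], [1.0, 0.0]])
Y = X0 @ R.T

W = rng.normal(scale=0.1, size=(N, d, d))   # depth-varying weights W(t_k)
b = np.zeros((N, d))                        # depth-varying biases  b(t_k)

def forward(W, b, X0):
    """Euler discretization of x' = tanh(W(t) x + b(t)); returns all states."""
    xs = [X0]
    for k in range(N):
        z = xs[-1] @ W[k].T + b[k]
        xs.append(xs[-1] + dt * np.tanh(z))
    return xs

def loss_and_grad(W, b, X0, Y):
    """Terminal squared loss and its gradient via the discrete adjoint recursion."""
    xs = forward(W, b, X0)
    r = xs[-1] - Y
    loss = 0.5 * np.mean(np.sum(r**2, axis=1))
    lam = r / X0.shape[0]                   # adjoint state at the final time
    gW, gb = np.zeros_like(W), np.zeros_like(b)
    for k in reversed(range(N)):
        z = xs[k] @ W[k].T + b[k]
        s = (1.0 - np.tanh(z)**2) * lam     # chain rule through tanh
        gW[k] = dt * s.T @ xs[k]
        gb[k] = dt * s.sum(axis=0)
        lam = lam + dt * s @ W[k]           # adjoint step backwards in depth
    return loss, gW, gb

def sobolev_smooth(g, alpha=1.0):
    """Sobolev gradient in depth: solve (I - alpha*D2) g_s = g along the depth axis."""
    A = np.eye(N)
    for k in range(N):
        A[k, k] += 2 * alpha
        if k > 0: A[k, k - 1] -= alpha
        if k < N - 1: A[k, k + 1] -= alpha
    return np.linalg.solve(A, g.reshape(N, -1)).reshape(g.shape)

# Polak-Ribiere NCG on the stacked parameters (W, b).
step, g_prev, d_prev = 0.5, None, None
for it in range(200):
    loss, gW, gb = loss_and_grad(W, b, X0, Y)
    g = np.concatenate([sobolev_smooth(gW).ravel(), sobolev_smooth(gb).ravel()])
    if d_prev is None:
        direction = -g
    else:
        beta = max(0.0, g @ (g - g_prev) / (g_prev @ g_prev))  # PR+ formula
        direction = -g + beta * d_prev
    dW = direction[:W.size].reshape(W.shape)
    db = direction[W.size:].reshape(b.shape)
    # Crude substitute for a line search: shrink the step and restart the
    # direction whenever the trial point does not decrease the loss.
    trial, _, _ = loss_and_grad(W + step * dW, b + step * db, X0, Y)
    if trial >= loss:
        step, d_prev = 0.5 * step, None
        continue
    W, b = W + step * dW, b + step * db
    g_prev, d_prev = g, direction
    if it % 50 == 0:
        print(f"iter {it:3d}  loss {loss:.4f}")
```

The Sobolev smoothing step is what keeps the reconstructed W(t), b(t) smooth across depth; dropping it reduces the loop to a plain NCG on the raw (L^2) gradient.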
Related papers
- FEM-based Neural Networks for Solving Incompressible Fluid Flows and Related Inverse Problems [41.94295877935867]
The numerical simulation and optimization of technical systems described by partial differential equations is expensive.
A comparatively new approach in this context is to combine the good approximation properties of neural networks with the classical finite element method.
In this paper, we extend this approach to saddle-point problems and nonlinear fluid dynamics problems.
arXiv Detail & Related papers (2024-09-06T07:17:01Z) - A Nonoverlapping Domain Decomposition Method for Extreme Learning Machines: Elliptic Problems [0.0]
Extreme learning machine (ELM) is a methodology for solving partial differential equations (PDEs) using a single hidden layer feed-forward neural network.
In this paper, we propose a nonoverlapping domain decomposition method (DDM) for ELMs that not only reduces the training time of ELMs, but is also suitable for parallel computation.
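Below is a minimal sketch of the extreme-learning-machine idea this entry builds on: a random, frozen hidden layer with the output weights obtained from a single linear least-squares solve, here applied by collocation to a 1-D Poisson problem. The domain decomposition part of the paper is not reproduced, and every numeric choice is an assumption.

```python
# ELM collocation sketch for -u'' = f on (0, 1), u(0) = u(1) = 0.
import numpy as np

rng = np.random.default_rng(1)
m, n = 200, 60                              # collocation points, hidden neurons
x = np.linspace(0.0, 1.0, m)[:, None]
f = (np.pi**2) * np.sin(np.pi * x)          # manufactured source, exact u = sin(pi x)

w = rng.normal(scale=4.0, size=(1, n))      # random (frozen) hidden weights
c = rng.uniform(-4.0, 4.0, size=(1, n))     # random (frozen) hidden biases
t = np.tanh(x @ w + c)                      # hidden features phi_j(x_i)
phi_xx = (w**2) * (-2.0 * t * (1.0 - t**2)) # d^2/dx^2 of tanh(w x + c)

# Stack the interior PDE residual rows and the two Dirichlet boundary rows,
# then solve for the output weights in the least-squares sense.
A = np.vstack([-phi_xx, np.tanh(np.array([[0.0], [1.0]]) @ w + c)])
rhs = np.vstack([f, np.zeros((2, 1))])
beta, *_ = np.linalg.lstsq(A, rhs, rcond=None)

u = np.tanh(x @ w + c) @ beta               # ELM approximation of u
print("max error vs sin(pi x):", np.abs(u - np.sin(np.pi * x)).max())
```

Only beta is "trained", which is why ELMs are cheap and why partitioning the domain (as the paper proposes) parallelizes naturally.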
arXiv Detail & Related papers (2024-06-22T23:25:54Z) - Learning Discretized Neural Networks under Ricci Flow [51.36292559262042]
We study Discretized Neural Networks (DNNs) composed of low-precision weights and activations.
DNNs suffer from either infinite or zero gradients due to the non-differentiable discrete function during training.
arXiv Detail & Related papers (2023-02-07T10:51:53Z) - Message Passing Neural PDE Solvers [60.77761603258397]
We build a neural message passing solver, replacing all heuristically designed components in the computation graph with backprop-optimized neural function approximators.
We show that neural message passing solvers representationally contain some classical methods, such as finite differences, finite volumes, and WENO schemes.
We validate our method on various fluid-like flow problems, demonstrating fast, stable, and accurate performance across different domain topologies, equation parameters, discretizations, etc., in 1D and 2D.
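The following is an assumed, heavily simplified illustration of one message-passing step on a periodic 1-D grid (not the paper's exact architecture): edge messages are computed by a small MLP from both endpoint states and the node offset, summed per node, and fed to a node-update MLP. The weights here are random, untrained placeholders.

```python
# One message-passing "solver step" over a periodic 1-D grid (structure only).
import numpy as np

rng = np.random.default_rng(2)
n, h = 64, 16                                # grid nodes, hidden width
x = np.linspace(0.0, 1.0, n, endpoint=False)
u = np.sin(2 * np.pi * x)[:, None]           # current solution field, shape (n, 1)

def random_mlp(n_in, n_hid, n_out):
    """Random two-layer MLP parameters (stand-ins for trained weights)."""
    return (rng.normal(scale=0.3, size=(n_in, n_hid)),
            rng.normal(scale=0.3, size=(n_hid, n_out)))

def apply_mlp(params, z):
    W1, W2 = params
    return np.tanh(z @ W1) @ W2

msg_net = random_mlp(3, h, h)        # message from (u_i, u_j, x_j - x_i)
upd_net = random_mlp(1 + h, h, 1)    # node update from (u_i, aggregated message)

def mp_step(u, dx=1.0 / n):
    left, right = np.roll(u, 1, axis=0), np.roll(u, -1, axis=0)
    m_left = apply_mlp(msg_net, np.hstack([u, left, np.full_like(u, -dx)]))
    m_right = apply_mlp(msg_net, np.hstack([u, right, np.full_like(u, dx)]))
    agg = m_left + m_right                                # sum over the two neighbours
    return u + apply_mlp(upd_net, np.hstack([u, agg]))    # residual update of u

u_next = mp_step(u)
print(u_next.shape)   # (64, 1): one learned step advances the whole field
```

Because each node only sees local offsets, the same step applies unchanged to finer or coarser discretizations, which is part of why such solvers transfer across domain topologies.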
arXiv Detail & Related papers (2022-02-07T17:47:46Z) - Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks [1.14219428942199]
We study the overparametrization bounds required for the global convergence of the gradient descent algorithm for a class of one-hidden-layer feed-forward neural networks.
arXiv Detail & Related papers (2022-01-28T11:30:06Z) - Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via gradient descent.
We show that SGD is biased towards a simple solution.
We also provide empirical evidence that knots at locations distinct from the data points might occur.
arXiv Detail & Related papers (2021-11-03T15:14:20Z) - Cogradient Descent for Dependable Learning [64.02052988844301]
We propose a dependable learning method based on the Cogradient Descent (CoGD) algorithm to address the bilinear optimization problem.
CoGD is introduced to solve bilinear problems when one variable is subject to a sparsity constraint.
It can also be used to decompose the association of features and weights, which further generalizes our method to better train convolutional neural networks (CNNs).
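For orientation, here is a toy bilinear problem of the kind these entries target, solved with a plain simultaneous proximal-gradient baseline rather than CoGD's coupled update rule; the factorization, sparsity level, and step sizes are all assumptions.

```python
# Baseline sketch: minimize 0.5*||Y - U V||_F^2 + lam*||V||_1 over U and V.
import numpy as np

rng = np.random.default_rng(3)
m, r, n = 30, 5, 40
U_true = rng.normal(size=(m, r))
V_true = rng.normal(size=(r, n)) * (rng.random((r, n)) < 0.2)   # sparse factor
Y = U_true @ V_true

U = rng.normal(size=(m, r))
V = np.zeros((r, n))
lam, step = 0.01, 0.01

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

for it in range(2000):
    R = U @ V - Y                       # shared residual couples both factors
    gU, gV = R @ V.T, U.T @ R           # gradients of the smooth part
    U = U - step * gU                   # gradient step in the dense factor
    V = soft_threshold(V - step * gV, step * lam)   # proximal step in the sparse factor
    if it % 500 == 0:
        print(it, 0.5 * np.sum(R**2))
```

CoGD's contribution, as summarized above, is to exploit the coupling between the two variables instead of updating them independently as this baseline does.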
arXiv Detail & Related papers (2021-06-20T04:28:20Z) - Solving PDEs on Unknown Manifolds with Machine Learning [8.220217498103315]
This paper presents a mesh-free computational framework and machine learning theory for solving elliptic PDEs on unknown manifolds.
We show that the proposed NN solver can robustly generalize the PDE solution on new data points, with generalization errors that are almost identical to the errors on the training data points.
arXiv Detail & Related papers (2021-06-12T03:55:15Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
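To make the min-max structure concrete, here is an assumed toy version in which linear function classes replace the neural networks and the moment condition is ordinary least squares; it only illustrates the gradient descent-ascent scheme, not the paper's estimator.

```python
# Gradient descent-ascent on min_f max_g E[(y - f(x)) g(x) - 0.5 g(x)^2],
# with linear f(x) = x @ theta (learner) and g(x) = x @ w (critic).
import numpy as np

rng = np.random.default_rng(4)
m, d = 500, 3
X = rng.normal(size=(m, d))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true + 0.1 * rng.normal(size=m)

theta = np.zeros(d)   # descent player
w = np.zeros(d)       # ascent player
lr = 0.5

for it in range(2000):
    f, g = X @ theta, X @ w
    grad_theta = -X.T @ g / m                   # d/d theta of E[(y - f) g]
    grad_w = X.T @ (y - f) / m - X.T @ g / m    # d/d w of E[(y - f) g - 0.5 g^2]
    theta -= lr * grad_theta                    # descent on the learner
    w += lr * grad_w                            # ascent on the critic

print("estimated theta:", np.round(theta, 3))  # close to the least-squares fit
```

At the saddle point the critic's optimal response reproduces the residual, so the learner ends up solving the underlying operator equation; replacing both linear models with NNs gives the adversarial formulation the entry describes.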
arXiv Detail & Related papers (2020-07-02T17:55:47Z) - Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve for one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under a sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z) - Solving inverse-PDE problems with physics-aware neural networks [0.0]
We propose a novel framework to find unknown fields in the context of inverse problems for partial differential equations.
We blend the high expressibility of deep neural networks as universal function estimators with the accuracy and reliability of existing numerical algorithms.
arXiv Detail & Related papers (2020-01-10T18:46:50Z)