A Two-Stage Training Method for Modeling Constrained Systems With Neural
Networks
- URL: http://arxiv.org/abs/2403.02730v1
- Date: Tue, 5 Mar 2024 07:37:47 GMT
- Title: A Two-Stage Training Method for Modeling Constrained Systems With Neural
Networks
- Authors: C. Coelho, M. Fernanda P. Costa, L.L. Ferrás
- Abstract summary: This paper describes in detail the two-stage training method for Neural ODEs.
The first stage aims at finding feasible NN parameters by minimizing a measure of constraint violation.
The second stage aims to find the optimal NN parameters by minimizing the loss function while remaining inside the feasible region.
- Score: 3.072340427031969
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Real-world systems are often formulated as constrained optimization problems.
Techniques to incorporate constraints into Neural Networks (NN), such as Neural
Ordinary Differential Equations (Neural ODEs), have been used. However, these
introduce hyperparameters that require manual tuning through trial and error,
raising doubts about the successful incorporation of constraints into the
generated model. This paper describes in detail the two-stage training method
for Neural ODEs, a simple, effective, and penalty parameter-free approach to
model constrained systems. In this approach, the constrained optimization
problem is rewritten as two unconstrained sub-problems that are solved in two
stages. The first stage aims at finding feasible NN parameters by minimizing a
measure of constraint violation. The second stage aims to find the optimal NN
parameters by minimizing the loss function while remaining inside the feasible
region. We experimentally demonstrate that our method produces models that
satisfy the constraints and also improves their predictive performance, thus
ensuring compliance with critical system properties and contributing to
reducing the amount of data required. Furthermore, we show that the proposed
method improves convergence to an optimal solution and the explainability of
Neural ODE models. Our proposed two-stage training method can be used with any
NN architecture.
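Although the paper targets Neural ODEs, the two-stage idea itself is easy to sketch. Below is a minimal PyTorch illustration under stated assumptions: the non-negativity constraint inside `violation`, the tolerance `tol`, and the rollback rule in stage 2 are hypothetical stand-ins for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def violation(model, x):
    # Illustrative constraint measure: penalize negative outputs.
    # In practice this is any differentiable measure of how far the
    # model is from satisfying the system's constraints.
    return torch.relu(-model(x)).pow(2).mean()

def two_stage_train(model, x, y, tol=1e-6, lr=1e-3, steps=(500, 500)):
    opt = torch.optim.Adam(model.parameters(), lr=lr)

    # Stage 1: find feasible parameters by minimizing constraint violation.
    for _ in range(steps[0]):
        opt.zero_grad()
        v = violation(model, x)
        if v.item() <= tol:
            break
        v.backward()
        opt.step()

    # Stage 2: minimize the data loss while staying (approximately) feasible.
    for _ in range(steps[1]):
        opt.zero_grad()
        loss = F.mse_loss(model(x), y)
        loss.backward()
        backup = [p.detach().clone() for p in model.parameters()]
        opt.step()
        # Roll back any step that leaves the feasible region.
        # (The optimizer's internal state is not rolled back in this sketch.)
        if violation(model, x).item() > tol:
            with torch.no_grad():
                for p, b in zip(model.parameters(), backup):
                    p.copy_(b)
    return model
```

Note that no penalty parameter appears anywhere: stage 1 minimizes only the violation measure and stage 2 minimizes only the loss, so no weight balancing the two terms ever needs tuning.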
Related papers
- WANCO: Weak Adversarial Networks for Constrained Optimization problems [5.257895611010853]
We first transform constrained optimization problems into minimax problems using the augmented Lagrangian method.
We then use two (or several) deep neural networks to represent the primal and dual variables respectively.
The parameters in the neural networks are then trained by an adversarial process.
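For reference, the generic augmented Lagrangian reformulation behind this construction reads as follows (notation illustrative; $E$ is the objective, $C$ the constraint residual, $\beta$ a penalty weight):

```latex
\min_{u}\,\max_{\lambda}\; \mathcal{L}_{\beta}(u,\lambda)
  = E(u) + \langle \lambda,\, C(u) \rangle + \tfrac{\beta}{2}\,\|C(u)\|^{2}
```

In WANCO, $u$ and $\lambda$ are each represented by a neural network, and the min and max are taken over the respective network parameters, which yields the adversarial training process mentioned above.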
arXiv Detail & Related papers (2024-07-04T05:37:48Z)
- Neural Parameter Regression for Explicit Representations of PDE Solution Operators [22.355460388065964]
We introduce Neural Parameter Regression (NPR), a novel framework specifically developed for learning solution operators of Partial Differential Equations (PDEs).
NPR employs Physics-Informed Neural Network (PINN, Raissi et al., 2021) techniques to regress Neural Network (NN) parameters.
The framework shows remarkable adaptability to new initial and boundary conditions, allowing for rapid fine-tuning and inference.
arXiv Detail & Related papers (2024-03-19T14:30:56Z)
- Solving Inverse Problems with Model Mismatch using Untrained Neural Networks within Model-based Architectures [14.551812310439004]
We introduce an untrained forward model residual block within the model-based architecture to match the data consistency in the measurement domain for each instance.
Our approach offers a unified solution that is less parameter-sensitive, requires no additional data, and enables simultaneous fitting of the forward model and reconstruction in a single pass.
arXiv Detail & Related papers (2024-03-07T19:02:13Z)
- Learning Constrained Optimization with Deep Augmented Lagrangian Methods [54.22290715244502]
A machine learning (ML) model is trained to emulate a constrained optimization solver.
This paper proposes an alternative approach, in which the ML model is trained to predict dual solution estimates directly.
This enables an end-to-end training scheme in which the dual objective serves as a loss function and solution estimates are driven toward primal feasibility, emulating a Dual Ascent method.
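For reference, the classical Dual Ascent iteration being emulated has the generic form below ($L$ is the Lagrangian, $g$ the constraint residual, $\alpha^{k}$ a step size; notation illustrative):

```latex
x^{k+1} = \arg\min_{x}\; L(x, \lambda^{k}), \qquad
\lambda^{k+1} = \lambda^{k} + \alpha^{k}\, g(x^{k+1})
```

Training the ML model with the dual objective as the loss plays the role of the maximization over $\lambda$.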
arXiv Detail & Related papers (2024-03-06T04:43:22Z)
- Neural Fields with Hard Constraints of Arbitrary Differential Order [61.49418682745144]
We develop a series of approaches for enforcing hard constraints on neural fields.
The constraints can be specified as a linear operator applied to the neural field and its derivatives.
Our approaches are demonstrated in a wide range of real-world applications.
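In symbols, a constraint of this kind can be written as (generic form, not the paper's notation):

```latex
\mathcal{A}[f_{\theta}](x) = b(x), \qquad x \in \Omega
```

where $f_{\theta}$ is the neural field and $\mathcal{A}$ is a linear operator that may involve derivatives of $f_{\theta}$ of arbitrary order.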
arXiv Detail & Related papers (2023-06-15T08:33:52Z)
- An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
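The fixed-point formulation is the defining feature of the Deep Equilibrium framework: rather than unrolling a fixed number of solver iterations, the reconstruction is the equilibrium of a learned update (generic form; $y$ denotes the degraded observation):

```latex
z^{\ast} = f_{\theta}(z^{\ast};\, y)
```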
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
- A Stable and Scalable Method for Solving Initial Value PDEs with Neural Networks [52.5899851000193]
We show that current methods based on this approach suffer from two key issues.
First, following the ODE produces an uncontrolled growth in the conditioning of the problem, ultimately leading to unacceptably large numerical errors.
We develop an ODE-based IVP solver which prevents the network from becoming ill-conditioned and runs in time linear in the number of parameters.
arXiv Detail & Related papers (2023-04-28T17:28:18Z)
- TAOTF: A Two-stage Approximately Orthogonal Training Framework in Deep Neural Networks [8.663152066918821]
We propose a novel two-stage approximately orthogonal training framework (TAOTF) to solve the problem in noisy data scenarios.
We evaluate the proposed model-agnostic framework on both natural image and medical image datasets, showing that our method achieves stable and superior performance compared to existing methods.
arXiv Detail & Related papers (2022-11-25T05:22:43Z)
- Characterizing possible failure modes in physics-informed neural networks [55.83255669840384]
Recent work in scientific machine learning has developed so-called physics-informed neural network (PINN) models.
We demonstrate that, while existing PINN methodologies can learn good models for relatively trivial problems, they can easily fail to learn relevant physical phenomena even for simple PDEs.
We show that these possible failure modes are not due to the lack of expressivity in the NN architecture, but that the PINN's setup makes the loss landscape very hard to optimize.
arXiv Detail & Related papers (2021-09-02T16:06:45Z)
- Accelerating Neural ODEs Using Model Order Reduction [0.0]
We show that mathematical model order reduction methods can be used for compressing and accelerating Neural ODEs.
We implement our novel compression method by developing Neural ODEs that integrate the necessary subspace-projection and interpolation operations as layers of the neural network.
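A generic subspace projection of this kind approximates the hidden state $h(t) \approx V z(t)$ with a reduced basis $V \in \mathbb{R}^{n \times k}$, $k \ll n$, giving a smaller ODE to integrate (a sketch of the standard construction, not necessarily the paper's exact layers):

```latex
\dot{z}(t) = V^{\top} f_{\theta}\big(V z(t),\, t\big), \qquad z(t) \in \mathbb{R}^{k}
```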
arXiv Detail & Related papers (2021-05-28T19:27:09Z)
- dNNsolve: an efficient NN-based PDE solver [62.997667081978825]
We introduce dNNsolve, which makes use of dual Neural Networks to solve ODEs/PDEs.
We show that dNNsolve is capable of solving a broad range of ODEs/PDEs in 1, 2 and 3 spacetime dimensions.
arXiv Detail & Related papers (2021-03-15T19:14:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.