A Two-Stage Training Method for Modeling Constrained Systems With Neural
Networks
- URL: http://arxiv.org/abs/2403.02730v1
- Date: Tue, 5 Mar 2024 07:37:47 GMT
- Title: A Two-Stage Training Method for Modeling Constrained Systems With Neural
Networks
- Authors: C. Coelho, M. Fernanda P. Costa, L.L. Ferrás
- Abstract summary: This paper describes in detail the two-stage training method for Neural ODEs.
The first stage aims at finding feasible NN parameters by minimizing a measure of constraint violation.
The second stage aims to find the optimal NN parameters by minimizing the loss function while remaining inside the feasible region.
- Score: 3.072340427031969
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Real-world systems are often formulated as constrained optimization problems.
Techniques to incorporate constraints into Neural Networks (NN), such as Neural
Ordinary Differential Equations (Neural ODEs), have been used. However, these
introduce hyperparameters that require manual tuning through trial and error,
raising doubts about the successful incorporation of constraints into the
generated model. This paper describes in detail the two-stage training method
for Neural ODEs, a simple, effective, and penalty parameter-free approach to
model constrained systems. In this approach, the constrained optimization
problem is rewritten as two unconstrained sub-problems that are solved in two
stages. The first stage aims at finding feasible NN parameters by minimizing a
measure of constraint violation. The second stage aims to find the optimal NN
parameters by minimizing the loss function while remaining inside the feasible
region. We experimentally demonstrate that our method produces models that
satisfy the constraints and also improves their predictive performance, thus
ensuring compliance with critical system properties and contributing to
reducing the amount of data required. Furthermore, we show that the proposed
method improves convergence to an optimal solution and the explainability of
Neural ODE models. Our proposed two-stage training method can be used with any
NN architecture.
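Although the paper targets Neural ODEs, the two-stage idea itself is easy to sketch. Below is a minimal PyTorch illustration under stated assumptions: the non-negativity constraint inside `violation`, the tolerance `tol`, and the rollback rule in stage 2 are hypothetical stand-ins for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def violation(model, x):
    # Illustrative constraint measure: penalize negative outputs.
    # In practice this is any differentiable measure of how far the
    # model is from satisfying the system's constraints.
    return torch.relu(-model(x)).pow(2).mean()

def two_stage_train(model, x, y, tol=1e-6, lr=1e-3, steps=(500, 500)):
    opt = torch.optim.Adam(model.parameters(), lr=lr)

    # Stage 1: find feasible parameters by minimizing constraint violation.
    for _ in range(steps[0]):
        opt.zero_grad()
        v = violation(model, x)
        if v.item() <= tol:
            break
        v.backward()
        opt.step()

    # Stage 2: minimize the data loss while staying (approximately) feasible.
    for _ in range(steps[1]):
        opt.zero_grad()
        loss = F.mse_loss(model(x), y)
        loss.backward()
        backup = [p.detach().clone() for p in model.parameters()]
        opt.step()
        # Roll back any step that leaves the feasible region.
        # (The optimizer's internal state is not rolled back in this sketch.)
        if violation(model, x).item() > tol:
            with torch.no_grad():
                for p, b in zip(model.parameters(), backup):
                    p.copy_(b)
    return model
```

Note that no penalty parameter appears anywhere: stage 1 minimizes only the violation measure and stage 2 minimizes only the loss, so no weight balancing the two terms ever needs tuning.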
Related papers
- WANCO: Weak Adversarial Networks for Constrained Optimization problems [5.257895611010853]
We first transform constrained optimization problems into minimax problems using the augmented Lagrangian method.
We then use two (or several) deep neural networks to represent the primal and dual variables respectively.
The parameters in the neural networks are then trained by an adversarial process.
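For reference, the generic augmented Lagrangian reformulation behind this construction reads as follows (notation illustrative; $E$ is the objective, $C$ the constraint residual, $\beta$ a penalty weight):

```latex
\min_{u}\,\max_{\lambda}\; \mathcal{L}_{\beta}(u,\lambda)
  = E(u) + \langle \lambda,\, C(u) \rangle + \tfrac{\beta}{2}\,\|C(u)\|^{2}
```

In WANCO, $u$ and $\lambda$ are each represented by a neural network, and the min and max are taken over the respective network parameters, which yields the adversarial training process mentioned above.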
arXiv Detail & Related papers (2024-07-04T05:37:48Z)
- Neural Parameter Regression for Explicit Representations of PDE Solution Operators [22.355460388065964]
We introduce Neural Parameter Regression (NPR), a novel framework specifically developed for learning solution operators of Partial Differential Equations (PDEs).
NPR employs Physics-Informed Neural Network (PINN, Raissi et al., 2021) techniques to regress Neural Network (NN) parameters.
The framework shows remarkable adaptability to new initial and boundary conditions, allowing for rapid fine-tuning and inference.
arXiv Detail & Related papers (2024-03-19T14:30:56Z)
- Solving Inverse Problems with Model Mismatch using Untrained Neural Networks within Model-based Architectures [14.551812310439004]
We introduce an untrained forward model residual block within the model-based architecture to match the data consistency in the measurement domain for each instance.
Our approach offers a unified solution that is less parameter-sensitive, requires no additional data, and enables simultaneous fitting of the forward model and reconstruction in a single pass.
arXiv Detail & Related papers (2024-03-07T19:02:13Z)
- Learning Constrained Optimization with Deep Augmented Lagrangian Methods [54.22290715244502]
A machine learning (ML) model is trained to emulate a constrained optimization solver.
This paper proposes an alternative approach, in which the ML model is trained to predict dual solution estimates directly.
This enables an end-to-end training scheme in which the dual objective serves as a loss function and solution estimates are driven toward primal feasibility, emulating a Dual Ascent method.
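For reference, the classical Dual Ascent iteration being emulated has the generic form below ($L$ is the Lagrangian, $g$ the constraint residual, $\alpha^{k}$ a step size; notation illustrative):

```latex
x^{k+1} = \arg\min_{x}\; L(x, \lambda^{k}), \qquad
\lambda^{k+1} = \lambda^{k} + \alpha^{k}\, g(x^{k+1})
```

Training the ML model with the dual objective as the loss plays the role of the maximization over $\lambda$.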
arXiv Detail & Related papers (2024-03-06T04:43:22Z)
- Neural Fields with Hard Constraints of Arbitrary Differential Order [61.49418682745144]
We develop a series of approaches for enforcing hard constraints on neural fields.
The constraints can be specified as a linear operator applied to the neural field and its derivatives.
Our approaches are demonstrated in a wide range of real-world applications.
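In symbols, a constraint of this kind can be written as (generic form, not the paper's notation):

```latex
\mathcal{A}[f_{\theta}](x) = b(x), \qquad x \in \Omega
```

where $f_{\theta}$ is the neural field and $\mathcal{A}$ is a linear operator that may involve derivatives of $f_{\theta}$ of arbitrary order.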
arXiv Detail & Related papers (2023-06-15T08:33:52Z)
- An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
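The fixed-point formulation is the defining feature of the Deep Equilibrium framework: rather than unrolling a fixed number of solver iterations, the reconstruction is the equilibrium of a learned update (generic form; $y$ denotes the degraded observation):

```latex
z^{\ast} = f_{\theta}(z^{\ast};\, y)
```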
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
- A Stable and Scalable Method for Solving Initial Value PDEs with Neural Networks [52.5899851000193]
We show that current methods based on this approach suffer from two key issues.
First, following the ODE produces an uncontrolled growth in the conditioning of the problem, ultimately leading to unacceptably large numerical errors.
We develop an ODE-based IVP solver which prevents the network from becoming ill-conditioned and runs in time linear in the number of parameters.
arXiv Detail & Related papers (2023-04-28T17:28:18Z)
- TAOTF: A Two-stage Approximately Orthogonal Training Framework in Deep Neural Networks [8.663152066918821]
We propose a novel two-stage approximately orthogonal training framework (TAOTF) to solve the problem in noisy data scenarios.
We evaluate the proposed model-agnostic framework on both natural image and medical image datasets, showing that our method achieves stable and superior performance compared to existing methods.
arXiv Detail & Related papers (2022-11-25T05:22:43Z)
- Characterizing possible failure modes in physics-informed neural networks [55.83255669840384]
Recent work in scientific machine learning has developed so-called physics-informed neural network (PINN) models.
We demonstrate that, while existing PINN methodologies can learn good models for relatively trivial problems, they can easily fail to learn relevant physical phenomena even for simple PDEs.
We show that these possible failure modes are not due to the lack of expressivity in the NN architecture, but that the PINN's setup makes the loss landscape very hard to optimize.
arXiv Detail & Related papers (2021-09-02T16:06:45Z)
- Accelerating Neural ODEs Using Model Order Reduction [0.0]
We show that mathematical model order reduction methods can be used for compressing and accelerating Neural ODEs.
We implement our novel compression method by developing Neural ODEs that integrate the necessary subspace-projection and interpolation operations as layers of the neural network.
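A generic subspace projection of this kind approximates the hidden state $h(t) \approx V z(t)$ with a reduced basis $V \in \mathbb{R}^{n \times k}$, $k \ll n$, giving a smaller ODE to integrate (a sketch of the standard construction, not necessarily the paper's exact layers):

```latex
\dot{z}(t) = V^{\top} f_{\theta}\big(V z(t),\, t\big), \qquad z(t) \in \mathbb{R}^{k}
```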
arXiv Detail & Related papers (2021-05-28T19:27:09Z)
- dNNsolve: an efficient NN-based PDE solver [62.997667081978825]
We introduce dNNsolve, which makes use of dual Neural Networks to solve ODEs/PDEs.
We show that dNNsolve is capable of solving a broad range of ODEs/PDEs in 1, 2 and 3 spacetime dimensions.
arXiv Detail & Related papers (2021-03-15T19:14:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.