PETScML: Second-order solvers for training regression problems in Scientific Machine Learning
- URL: http://arxiv.org/abs/2403.12188v1
- Date: Mon, 18 Mar 2024 18:59:42 GMT
- Title: PETScML: Second-order solvers for training regression problems in Scientific Machine Learning
- Authors: Stefano Zampini, Umberto Zerbinati, George Turkiyyah, David Keyes
- Abstract summary: In recent years, we have witnessed the emergence of scientific machine learning as a data-driven tool for the analysis, by means of deep-learning techniques, of data produced by computational science and engineering applications.
We introduce a software framework built on top of the Portable and Extensible Toolkit for Scientific computation to bridge the gap between deep-learning software and conventional solvers for unconstrained minimization.
- Score: 0.22499166814992438
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, we have witnessed the emergence of scientific machine learning as a data-driven tool for the analysis, by means of deep-learning techniques, of data produced by computational science and engineering applications. At the core of these methods is the supervised training algorithm to learn the neural network realization, a highly non-convex optimization problem that is usually solved using stochastic gradient methods. However, distinct from deep-learning practice, scientific machine-learning training problems feature a much larger volume of smooth data and better characterizations of the empirical risk functions, which make them suited for conventional solvers for unconstrained optimization. We introduce a lightweight software framework built on top of the Portable and Extensible Toolkit for Scientific computation to bridge the gap between deep-learning software and conventional solvers for unconstrained minimization. We empirically demonstrate the superior efficacy of a trust region method based on the Gauss-Newton approximation of the Hessian in improving the generalization errors arising from regression tasks when learning surrogate models for a wide range of scientific machine-learning techniques and test cases. All the conventional second-order solvers tested, including L-BFGS and inexact Newton with line-search, compare favorably, either in terms of cost or accuracy, with the adaptive first-order methods used to validate the surrogate models.
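To make the Gauss-Newton trust-region idea concrete, the following is a minimal, self-contained sketch of a damped (Levenberg-Marquardt-style) Gauss-Newton loop for a toy least-squares regression. It illustrates the class of second-order solver the abstract refers to; it is not the PETScML or PETSc/TAO implementation, and the toy model and parameter values are assumptions made for illustration.

```python
# Hedged sketch: Gauss-Newton with trust-region-style damping (Levenberg-Marquardt)
# for a least-squares loss 0.5*||r(w)||^2.  Not the PETScML / PETSc TAO code.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression target: y = sin(a*x) + b*x with parameters w = (a, b).
X = np.linspace(-2.0, 2.0, 200)
y = np.sin(1.3 * X) + 0.5 * X + 0.01 * rng.standard_normal(X.shape)

def residual(w):
    a, b = w
    return np.sin(a * X) + b * X - y              # r(w), shape (n,)

def jacobian(w):
    a, b = w
    return np.stack([X * np.cos(a * X), X], 1)    # dr/dw, shape (n, 2)

w = np.array([0.5, 0.0])
mu = 1e-2                                         # damping, acts like an inverse trust-region radius
for it in range(50):
    r, J = residual(w), jacobian(w)
    g = J.T @ r                                   # gradient of 0.5*||r||^2
    H = J.T @ J                                   # Gauss-Newton approximation of the Hessian
    step = np.linalg.solve(H + mu * np.eye(2), -g)
    if 0.5 * np.sum(residual(w + step) ** 2) < 0.5 * np.sum(r ** 2):
        w, mu = w + step, max(mu * 0.5, 1e-8)     # accept the step, enlarge the trusted region
    else:
        mu *= 4.0                                 # reject the step, shrink the trusted region
print("fitted parameters:", w)
```

A full-scale implementation would obtain Jacobian-vector products from automatic differentiation and solve the damped system iteratively rather than forming H explicitly; everything is written out by hand here only to keep the sketch self-contained.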
Related papers
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives.
We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
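The snippet below is a hedged sketch of one way a normalized difference between a retain-loss gradient and a forget-loss gradient can be formed so that neither objective dominates purely through gradient scale; the exact NGDiff update and its adaptive learning rate are defined in the paper, and the toy quadratic objectives here are assumptions for illustration.

```python
# Hedged sketch (not the paper's exact NGDiff update): normalize the retain and forget
# gradients before differencing, then descend on the retain loss while ascending on the
# forget loss.
import numpy as np

def normalized_gradient_difference(grad_retain, grad_forget, eps=1e-12):
    gr = grad_retain / (np.linalg.norm(grad_retain) + eps)
    gf = grad_forget / (np.linalg.norm(grad_forget) + eps)
    return gr - gf

# Toy quadratics on w in R^2: retain pulls toward (1, 0); forget is pushed away from (0, 1).
w = np.zeros(2)
lr = 0.1                                  # the paper pairs the update with an adaptive learning rate
for _ in range(100):
    g_retain = w - np.array([1.0, 0.0])   # gradient of 0.5*||w - (1, 0)||^2 (to be decreased)
    g_forget = w - np.array([0.0, 1.0])   # gradient of 0.5*||w - (0, 1)||^2 (to be increased)
    w -= lr * normalized_gradient_difference(g_retain, g_forget)
print("final parameters:", w)
```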
arXiv Detail & Related papers (2024-10-29T14:41:44Z) - Faster Machine Unlearning via Natural Gradient Descent [2.3020018305241337]
We address the challenge of efficiently deleting data from machine learning models trained with Empirical Risk Minimization (ERM), a process known as machine unlearning.
To avoid retraining from scratch, we propose a novel algorithm leveraging Natural Gradient Descent (NGD).
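As background for this entry, the sketch below shows a plain natural-gradient step, i.e. a gradient preconditioned by the empirical Fisher information, for logistic regression; it illustrates the general technique and does not reproduce the paper's unlearning-specific algorithm.

```python
# Hedged sketch of natural gradient descent for logistic regression: precondition the
# gradient with the empirical Fisher matrix X^T diag(p(1-p)) X / n.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)

w = np.zeros(3)
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-X @ w))                  # model probabilities
    grad = X.T @ (p - y) / len(y)                     # gradient of the mean log-loss
    F = (X * (p * (1 - p))[:, None]).T @ X / len(y)   # empirical Fisher information
    w -= np.linalg.solve(F + 1e-6 * np.eye(3), grad)  # natural-gradient step
print("fitted weights:", w)
```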
arXiv Detail & Related papers (2024-07-11T04:19:28Z) - Learning Controllable Adaptive Simulation for Multi-resolution Physics [86.8993558124143]
We introduce Learning controllable Adaptive simulation for Multi-resolution Physics (LAMP) as the first full deep learning-based surrogate model.
LAMP consists of a Graph Neural Network (GNN) for learning the forward evolution, and a GNN-based actor-critic for learning the policy of spatial refinement and coarsening.
We demonstrate that our LAMP outperforms state-of-the-art deep learning surrogate models, and can adaptively trade off computation to improve long-term prediction error.
arXiv Detail & Related papers (2023-05-01T23:20:27Z) - On Robust Numerical Solver for ODE via Self-Attention Mechanism [82.95493796476767]
We explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances.
We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple yet effective numerical solver, AttSolver, which introduces an additive self-attention mechanism to the numerical solution of differential equations.
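Purely as a structural illustration, the sketch below applies an additive self-attention correction on top of a base Euler trajectory; the attention weights are random and untrained, and the paper's actual solver architecture and training procedure are not reproduced.

```python
# Hedged, structural sketch: an (untrained) additive self-attention correction over the
# states of an Euler-integrated ODE trajectory.
import numpy as np

rng = np.random.default_rng(0)

def euler_trajectory(f, y0, t):
    ys = [np.atleast_1d(y0)]
    for k in range(len(t) - 1):
        ys.append(ys[-1] + (t[k + 1] - t[k]) * f(t[k], ys[-1]))
    return np.stack(ys)                               # shape (time steps, state dim)

def self_attention(H, Wq, Wk, Wv):
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)                 # softmax over time steps
    return A @ V

t = np.linspace(0.0, 2.0, 50)
base = euler_trajectory(lambda s, y: -y, 1.0, t)      # dy/dt = -y
d, h = base.shape[1], 8
Wq, Wk, Wv = (0.1 * rng.standard_normal((d, h)) for _ in range(3))
Wo = 0.1 * rng.standard_normal((h, d))
corrected = base + self_attention(base, Wq, Wk, Wv) @ Wo   # additive attention correction
print(corrected.shape)
```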
arXiv Detail & Related papers (2023-02-05T01:39:21Z) - Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z) - A Novel Plug-and-Play Approach for Adversarially Robust Generalization [38.72514422694518]
We propose a robust framework that employs adversarially robust training to safeguard the ML models against perturbed testing data.
Our contributions can be seen from both computational and statistical perspectives.
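For context, here is a generic FGSM-style adversarial training loop on a logistic model, showing the standard idea of fitting against worst-case perturbed inputs; the paper's specific plug-and-play framework and its statistical analysis are not reproduced, and the model and perturbation budget below are assumptions for illustration.

```python
# Hedged sketch: adversarial training of a logistic model against FGSM perturbations
# under an L_inf budget eps.  Illustrative only; not the paper's framework.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((400, 5))
y = (X[:, 0] - X[:, 1] > 0).astype(float)

w, eps, lr = np.zeros(5), 0.1, 0.5
for _ in range(300):
    # Craft worst-case (FGSM) perturbations of the inputs.
    p = 1.0 / (1.0 + np.exp(-X @ w))
    grad_x = np.outer(p - y, w)                  # d loss_i / d x_i for the logistic loss
    X_adv = X + eps * np.sign(grad_x)
    # Gradient step on the adversarially perturbed batch.
    p_adv = 1.0 / (1.0 + np.exp(-X_adv @ w))
    w -= lr * X_adv.T @ (p_adv - y) / len(y)
print("robustly trained weights:", w)
```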
arXiv Detail & Related papers (2022-08-19T17:02:55Z) - Physical Gradients for Deep Learning [101.36788327318669]
We find that state-of-the-art training techniques are not well-suited to many problems that involve physical processes.
We propose a novel hybrid training approach that combines higher-order optimization methods with machine learning techniques.
arXiv Detail & Related papers (2021-09-30T12:14:31Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Application of an automated machine learning-genetic algorithm
(AutoML-GA) coupled with computational fluid dynamics simulations for rapid
engine design optimization [0.0]
The present work describes and validates an automated active learning approach, AutoML-GA, for surrogate-based optimization of internal combustion engines.
A genetic algorithm is employed to locate the design optimum on the machine learning surrogate surface.
It is demonstrated that AutoML-GA leads to a better optimum with a lower number of CFD simulations.
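The loop below is a generic, hedged sketch of surrogate-assisted optimization with a genetic algorithm in the spirit described above: fit a cheap surrogate, let a GA locate its optimum, verify that design with the expensive simulator, and append the result to the training data. The expensive_sim function is a stand-in for a CFD evaluation and the quadratic surrogate replaces the AutoML-selected model; both are assumptions made for illustration.

```python
# Hedged sketch of a surrogate-assisted GA loop.  expensive_sim stands in for a CFD run.
import numpy as np

rng = np.random.default_rng(2)

def expensive_sim(x):                          # stand-in for a costly CFD objective (maximized)
    return -np.sum((x - 0.3) ** 2)

def features(X):                               # quadratic features for a cheap surrogate
    return np.hstack([np.ones((len(X), 1)), X, X ** 2])

def fit_surrogate(X, y):
    coef, *_ = np.linalg.lstsq(features(X), y, rcond=None)
    return lambda Z: features(Z) @ coef

def ga_maximize(f, dim, pop=40, gens=60, lo=-1.0, hi=1.0):
    P = rng.uniform(lo, hi, (pop, dim))
    for _ in range(gens):
        parents = P[np.argsort(f(P))[-pop // 2:]]                          # keep the fitter half
        kids = (parents[rng.integers(0, len(parents), pop)] +
                parents[rng.integers(0, len(parents), pop)]) / 2.0         # crossover by averaging
        P = np.clip(kids + 0.05 * rng.standard_normal(kids.shape), lo, hi) # mutation
    return P[np.argmax(f(P))]

# Active-learning loop: surrogate fit -> GA optimum -> expensive verification -> augment data.
X = rng.uniform(-1.0, 1.0, (10, 2))
y = np.array([expensive_sim(x) for x in X])
for _ in range(5):
    best = ga_maximize(fit_surrogate(X, y), dim=2)
    X, y = np.vstack([X, best]), np.append(y, expensive_sim(best))
print("best design found:", X[np.argmax(y)], "objective:", y.max())
```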
arXiv Detail & Related papers (2021-01-07T17:50:52Z) - Optimization for Supervised Machine Learning: Randomized Algorithms for
Data and Parameters [10.279748604797911]
Key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms.
With the increase of the volume of data and the size and complexity of the statistical models used to formulate these often ill-conditioned optimization tasks, there is a need for new efficient algorithms able to cope with these challenges.
In this thesis, we deal with each of these sources of difficulty in a different way. To efficiently address the big data issue, we develop new methods which in each iteration examine a small random subset of the training data only.
To handle the big model issue, we develop methods which in each iteration update only a small random subset of the model parameters.
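The toy below illustrates both randomization ideas for least-squares regression: each iteration touches only a random subset of the data (stochastic-gradient style) and updates only a random subset of the parameters (block coordinate style). It is an illustrative sketch, not the algorithms developed in the thesis.

```python
# Hedged sketch: stochastic, block-coordinate gradient steps for least squares.
import numpy as np

rng = np.random.default_rng(3)
n, d = 1000, 20
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)

x, lr = np.zeros(d), 0.05
for _ in range(2000):
    rows = rng.choice(n, size=32, replace=False)    # random subset of the data
    cols = rng.choice(d, size=5, replace=False)     # random subset of the parameters
    grad_block = A[rows][:, cols].T @ (A[rows] @ x - b[rows]) / len(rows)
    x[cols] -= lr * grad_block                      # update only the sampled coordinates
print("residual norm:", np.linalg.norm(A @ x - b))
```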
arXiv Detail & Related papers (2020-08-26T21:15:18Z) - Designing Accurate Emulators for Scientific Processes using
Calibration-Driven Deep Models [33.935755695805724]
Learn-by-Calibrating (LbC) is a novel deep learning approach for designing emulators in scientific applications.
We show that LbC provides significant improvements in generalization error over widely-adopted loss function choices.
LbC achieves high-quality emulators even in small data regimes and more importantly, recovers the inherent noise structure without any explicit priors.
arXiv Detail & Related papers (2020-05-05T16:54:11Z)