Training Deep Surrogate Models with Large Scale Online Learning
- URL: http://arxiv.org/abs/2306.16133v1
- Date: Wed, 28 Jun 2023 12:02:27 GMT
- Title: Training Deep Surrogate Models with Large Scale Online Learning
- Authors: Lucas Meyer (EDF R&D, SINCLAIR AI Lab, DATAMOVE), Marc Schouler
(DATAMOVE), Robert Alexander Caulk (DATAMOVE), Alejandro Ribés (SINCLAIR
AI Lab, EDF R&D), Bruno Raffin (DATAMOVE)
- Abstract summary: Deep learning algorithms have emerged as a viable alternative for obtaining fast solutions for PDEs.
Models are usually trained on synthetic data generated by solvers, stored on disk and read back for training.
The paper proposes an open-source online training framework for deep surrogate models.
- Score: 48.7576911714538
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The spatiotemporal resolution of Partial Differential Equations (PDEs) plays
an important role in the mathematical description of the world's physical
phenomena. In general, scientists and engineers solve PDEs numerically by the
use of computationally demanding solvers. Recently, deep learning algorithms
have emerged as a viable alternative for obtaining fast solutions for PDEs.
Models are usually trained on synthetic data generated by solvers, stored on
disk and read back for training. This paper advocates that relying on a
traditional static dataset to train these models does not allow the full
benefit of the solver to be used as a data generator. It proposes an open
source online training framework for deep surrogate models. The framework
implements several levels of parallelism focused on simultaneously generating
numerical simulations and training deep neural networks. This approach
suppresses the I/O and storage bottleneck associated with disk-loaded datasets,
and opens the way to training on significantly larger datasets. Experiments
compare the offline and online training of four surrogate models, including
state-of-the-art architectures. Results indicate that exposing deep surrogate
models to more dataset diversity, up to hundreds of GB, can increase model
generalization capabilities. The prediction accuracy of fully connected neural
networks, the Fourier Neural Operator (FNO), and the Message Passing PDE Solver
improves by 68%, 16%, and 7%, respectively.
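To make the core idea concrete, here is a minimal sketch of online training; it is not the authors' released framework, and the worker layout, queue size, and toy heat-equation solver are all illustrative assumptions. Solver processes stream freshly generated (state, next_state) pairs through a bounded in-memory queue, and the trainer consumes them directly, so no simulation data ever touches the disk.

```python
# Minimal online-training sketch (illustrative, not the paper's framework):
# solver processes stream training pairs through an in-memory queue and the
# trainer consumes them as they arrive, bypassing disk I/O entirely.
import torch
import torch.multiprocessing as mp

def laplacian(u):
    """5-point stencil Laplacian with zero boundary (toy solver kernel)."""
    out = torch.zeros_like(u)
    out[1:-1, 1:-1] = (u[:-2, 1:-1] + u[2:, 1:-1]
                       + u[1:-1, :-2] + u[1:-1, 2:] - 4.0 * u[1:-1, 1:-1])
    return out

def solver_worker(queue, n_steps):
    """Stand-in for an expensive PDE solver generating training pairs."""
    u = torch.randn(32, 32)
    for _ in range(n_steps):
        u_next = u + 0.1 * laplacian(u)          # one explicit solver step
        queue.put((u.flatten(), u_next.flatten()))
        u = u_next
    queue.put(None)                              # sentinel: worker is done

if __name__ == "__main__":
    queue = mp.Queue(maxsize=128)                # bounded: throttles producers
    workers = [mp.Process(target=solver_worker, args=(queue, 500))
               for _ in range(4)]                # parallel data generation
    for w in workers:
        w.start()
    model = torch.nn.Linear(32 * 32, 32 * 32)    # toy surrogate model
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    finished = 0
    while finished < len(workers):               # train as data streams in
        item = queue.get()
        if item is None:
            finished += 1
            continue
        x, y = item
        loss = torch.nn.functional.mse_loss(model(x), y)
        opt.zero_grad(); loss.backward(); opt.step()
    for w in workers:
        w.join()
```

Because each sample is consumed once and discarded, the effective dataset size is bounded by available compute rather than by storage, which is the property the abstract emphasizes.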
Related papers
- Neuroexplicit Diffusion Models for Inpainting of Optical Flow Fields [8.282495481952784]
We show how to bring model- and data-driven approaches together by combining explicit PDE-based approaches with convolutional neural networks.
Our model outperforms both fully explicit and fully data-driven baselines in terms of reconstruction quality, robustness and amount of required training data.
arXiv Detail & Related papers (2024-05-23T14:14:27Z)
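As a rough illustration of the explicit PDE component described in the entry above (the convolutional network and the authors' actual model are omitted), the sketch below fills in missing optical-flow values by homogeneous diffusion inpainting: the field evolves under the heat equation while observed pixels stay clamped. The step size, iteration count, and mask density are arbitrary assumptions.

```python
import torch

def diffusion_inpaint(flow, known_mask, n_iters=2000, dt=0.2):
    """Homogeneous diffusion inpainting: evolve u_t = Laplacian(u) at the
    unknown pixels while clamping observed pixels to their values."""
    u = flow.clone()
    for _ in range(n_iters):
        lap = torch.zeros_like(u)
        lap[..., 1:-1, 1:-1] = (u[..., :-2, 1:-1] + u[..., 2:, 1:-1]
                                + u[..., 1:-1, :-2] + u[..., 1:-1, 2:]
                                - 4.0 * u[..., 1:-1, 1:-1])
        u = u + dt * lap                      # explicit diffusion step
        u[known_mask] = flow[known_mask]      # re-impose the known values
    return u

# Toy usage: densify a 2-channel flow field observed on ~10% of the pixels.
flow = torch.randn(2, 64, 64)
mask = torch.rand(2, 64, 64) < 0.1
dense = diffusion_inpaint(flow * mask, mask)
```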
- Partitioned Neural Network Training via Synthetic Intermediate Labels [0.0]
GPU memory constraints have become a notable bottleneck in training large models.
This study advocates partitioning the model across GPUs and generating synthetic intermediate labels to train individual segments.
This approach results in a more efficient training process that minimizes data communication while maintaining model accuracy.
arXiv Detail & Related papers (2024-03-17T13:06:29Z)
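A minimal sketch of the idea as summarized in the entry above, with every detail assumed rather than taken from the paper: the model is cut into two segments, fixed random vectors serve as synthetic intermediate labels, and each segment trains against them independently, so the segments never exchange activations or gradients and could sit on different GPUs.

```python
import torch
import torch.nn as nn

# Two model segments that never communicate during training.
seg1 = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 64))
seg2 = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))

x = torch.randn(512, 784)            # toy inputs
y = torch.randint(0, 10, (512,))     # toy class labels
z_synth = torch.randn(512, 64)       # fixed synthetic intermediate labels

opt1 = torch.optim.Adam(seg1.parameters(), lr=1e-3)
opt2 = torch.optim.Adam(seg2.parameters(), lr=1e-3)
for _ in range(100):
    # Segment 1: regress inputs onto the synthetic intermediate labels.
    loss1 = nn.functional.mse_loss(seg1(x), z_synth)
    opt1.zero_grad(); loss1.backward(); opt1.step()
    # Segment 2: classify from the synthetic labels; no dependence on seg1,
    # so both segments can train in parallel on separate devices.
    loss2 = nn.functional.cross_entropy(seg2(z_synth), y)
    opt2.zero_grad(); loss2.backward(); opt2.step()

# At inference the segments are chained: logits = seg2(seg1(x)).
```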
- DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training [87.90342423839876]
We present a new auto-regressive denoising pre-training strategy, which allows for more stable and efficient pre-training on PDE data.
We train our PDE foundation model with up to 0.5B parameters on 10+ PDE datasets with more than 100k trajectories.
arXiv Detail & Related papers (2024-03-06T08:38:34Z)
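The following sketch shows the general shape of an auto-regressive denoising objective; DPOT's transformer architecture and training details are not reproduced, and the MLP, noise scale, and data below are placeholders. The model sees a noise-corrupted frame and predicts the next frame, so autoregressive rollouts become robust to their own prediction errors.

```python
import torch
import torch.nn as nn

# Placeholder operator (DPOT uses a transformer; an MLP stands in here).
model = nn.Sequential(nn.Linear(1024, 1024), nn.GELU(), nn.Linear(1024, 1024))
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

traj = torch.randn(16, 20, 1024)   # (batch, time, dof) toy PDE trajectories
sigma = 0.05                       # corruption scale (assumed)

for step in range(100):
    t = torch.randint(0, traj.shape[1] - 1, (1,)).item()
    u_t, u_next = traj[:, t], traj[:, t + 1]
    u_noisy = u_t + sigma * torch.randn_like(u_t)   # denoising corruption
    loss = nn.functional.mse_loss(model(u_noisy), u_next)
    opt.zero_grad(); loss.backward(); opt.step()

# At inference, predictions are fed back in autoregressively; training on
# corrupted inputs slows the drift that plain next-step training suffers.
```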
- Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning [45.78096783448304]
In this work, seeking data efficiency, we design unsupervised pretraining for PDE operator learning.
We mine unlabeled PDE data without simulated solutions, and we pretrain neural operators with physics-inspired reconstruction-based proxy tasks.
Our method is highly data-efficient, more generalizable, and even outperforms conventional vision-pretrained models.
arXiv Detail & Related papers (2024-02-24T06:27:33Z)
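One common reconstruction-based proxy task is masked reconstruction; the sketch below is a guess at the spirit of the entry above, not its actual tasks. Random entries of unlabeled PDE snapshots are hidden and the network is trained to restore them, requiring no simulated solutions at all.

```python
import torch
import torch.nn as nn

# Placeholder network; the paper pretrains neural operators instead.
encoder = nn.Sequential(nn.Linear(1024, 256), nn.GELU(), nn.Linear(256, 1024))
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

snapshots = torch.randn(256, 1024)   # unlabeled PDE fields (no solutions)
for _ in range(200):
    idx = torch.randint(0, snapshots.shape[0], (32,))
    u = snapshots[idx]
    mask = (torch.rand_like(u) < 0.3).float()     # hide 30% of each field
    recon = encoder(u * (1.0 - mask))             # reconstruct from the rest
    loss = ((recon - u) ** 2 * mask).mean()       # score only masked entries
    opt.zero_grad(); loss.backward(); opt.step()

# The pretrained weights then initialize an operator that is fine-tuned on
# far fewer labeled (input, solution) pairs.
```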
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling this data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
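For orientation, here is the generic FedAvg-style round that such methods build on. FedPTR's projected trajectory regularization itself is not reproduced; the proximal pull toward the global weights below is only a placeholder for it, and all sizes and data are invented.

```python
import copy
import torch
import torch.nn as nn

def federated_round(global_model, client_data, mu=0.01, lr=0.05):
    """One FedAvg-style round; the proximal term is a placeholder, not
    FedPTR's projected-trajectory regularizer."""
    global_state = {k: v.detach().clone()
                    for k, v in global_model.state_dict().items()}
    client_states = []
    for x, y in client_data:                      # local training per client
        model = copy.deepcopy(global_model)
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _ in range(5):
            loss = nn.functional.mse_loss(model(x), y)
            for name, p in model.named_parameters():
                loss = loss + (mu / 2) * ((p - global_state[name]) ** 2).sum()
            opt.zero_grad(); loss.backward(); opt.step()
        client_states.append(model.state_dict())
    avg = {k: torch.stack([s[k] for s in client_states]).mean(0)
           for k in global_state}                 # server-side averaging
    global_model.load_state_dict(avg)

# Toy usage: four clients holding private (never shared) regression data.
clients = [(torch.randn(64, 8), torch.randn(64, 1)) for _ in range(4)]
net = nn.Linear(8, 1)
for _ in range(10):
    federated_round(net, clients)
```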
- Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST).
IST is a recently proposed and highly effective technique for reducing the communication and memory costs of distributed training.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z)
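To fix ideas, here is a toy version of independent subnetwork training for a one-hidden-layer network; the partitioning scheme and training details are assumptions, not taken from the paper. Hidden neurons are split into disjoint blocks, each block trains on the data with no communication, and the trained slices are written back into the full model.

```python
import torch
import torch.nn as nn

d_in, d_h, d_out, n_parts = 16, 64, 1, 4
W1 = torch.randn(d_h, d_in) * 0.1            # first-layer weights (64 x 16)
W2 = torch.randn(d_out, d_h) * 0.1           # second-layer weights (1 x 64)
x, y = torch.randn(256, d_in), torch.randn(256, d_out)

for part in torch.randperm(d_h).chunk(n_parts):   # disjoint neuron blocks
    w1 = W1[part].clone().requires_grad_(True)    # slice of layer 1
    w2 = W2[:, part].clone().requires_grad_(True) # matching slice of layer 2
    opt = torch.optim.SGD([w1, w2], lr=0.01)
    for _ in range(50):                       # train the subnetwork alone
        pred = torch.relu(x @ w1.T) @ w2.T    # forward through the slice only
        loss = nn.functional.mse_loss(pred, y)
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                     # write the trained slice back
        W1[part] = w1
        W2[:, part] = w2
```

In a distributed setting each block would live on a different worker, so no activations or gradients cross workers during local training.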
We present a "physics-enhanced deep-surrogate" ("PEDS") approach towards developing fast surrogate models for complex physical systems.
Specifically, a combination of a low-fidelity, explainable physics simulator and a neural network generator is proposed, which is trained end-to-end to globally match the output of an expensive high-fidelity numerical solver.
arXiv Detail & Related papers (2021-11-10T18:43:18Z)
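A schematic of the PEDS composition under stated assumptions: the coarse "solver" below is just a differentiable smoothing stand-in, and all sizes and data are invented. A neural generator proposes the input field of a cheap low-fidelity simulator, and generator plus simulator are trained end-to-end against high-fidelity targets.

```python
import torch
import torch.nn as nn

def coarse_solver(field):
    """Low-fidelity 'solver': a few differentiable smoothing steps and a
    spatial mean, a cheap stand-in for a real coarse physics model."""
    u = field
    for _ in range(10):
        u = 0.25 * (torch.roll(u, 1, -1) + torch.roll(u, -1, -1)
                    + torch.roll(u, 1, -2) + torch.roll(u, -1, -2))
    return u.mean(dim=(-2, -1))

generator = nn.Sequential(nn.Linear(4, 128), nn.GELU(), nn.Linear(128, 16 * 16))
opt = torch.optim.Adam(generator.parameters(), lr=1e-3)

params = torch.randn(512, 4)   # design parameters of the physical system
targets = torch.randn(512)     # matching high-fidelity outputs (toy data)

for _ in range(200):
    coarse_in = generator(params).reshape(-1, 16, 16)  # NN proposes a field
    pred = coarse_solver(coarse_in)                    # differentiable physics
    loss = nn.functional.mse_loss(pred, targets)       # match high fidelity
    opt.zero_grad(); loss.backward(); opt.step()
```

The design choice is that gradients flow through the low-fidelity solver, so the network only has to learn the correction between fidelities rather than the full physics.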
- Large-scale Neural Solvers for Partial Differential Equations [48.7576911714538]
Solving partial differential equations (PDEs) is an indispensable part of many branches of science, as many processes can be modelled in terms of PDEs.
Recent numerical solvers require manual discretization of the underlying equation as well as sophisticated, tailored code for distributed computing.
We examine the applicability of continuous, mesh-free neural solvers for partial differential equations, namely physics-informed neural networks (PINNs).
We discuss the accuracy of GatedPINN with respect to analytical solutions, as well as state-of-the-art numerical solvers such as spectral solvers.
arXiv Detail & Related papers (2020-09-08T13:26:51Z)
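Since the entry above centers on physics-informed neural networks, here is the standard PINN recipe for the 1D heat equation u_t = u_xx; it is a textbook sketch, unrelated to the paper's GatedPINN architecture. The PDE residual is computed with automatic differentiation and minimized at random collocation points alongside a data term for the initial condition (boundary terms work the same way and are omitted for brevity).

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 64),
                    nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def pde_residual(x, t):
    """Residual of the 1D heat equation u_t - u_xx = 0 via autograd."""
    x.requires_grad_(True); t.requires_grad_(True)
    u = net(torch.cat([x, t], dim=1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    return u_t - u_xx

for _ in range(1000):
    x = torch.rand(256, 1); t = torch.rand(256, 1)   # collocation points
    loss_pde = pde_residual(x, t).pow(2).mean()      # physics loss
    x0 = torch.rand(64, 1); t0 = torch.zeros(64, 1)  # initial condition:
    u0 = torch.sin(torch.pi * x0)                    # u(x, 0) = sin(pi x)
    loss_ic = (net(torch.cat([x0, t0], 1)) - u0).pow(2).mean()
    loss = loss_pde + loss_ic
    opt.zero_grad(); loss.backward(); opt.step()
```

No solver data appears anywhere: the PDE itself supervises the network, which is what makes PINNs mesh-free and discretization-free.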
- Deep Generative Models that Solve PDEs: Distributed Computing for Training Large Data-Free Models [25.33147292369218]
Recent progress in scientific machine learning (SciML) has opened up the possibility of training novel neural network architectures that solve complex partial differential equations (PDEs).
Here we report on a software framework for data parallel distributed deep learning that resolves the twin challenges of training these large SciML models.
Our framework provides several out-of-the-box features, including (a) loss integrity independent of the number of processes, (b) synchronized batch normalization, and (c) distributed higher-order optimization methods.
arXiv Detail & Related papers (2020-07-24T22:42:35Z)
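The entry above describes the authors' own framework; purely as a generic analogue, the sketch below shows how features (a) and (b) map onto stock PyTorch data parallelism: gradients are all-reduced and averaged so the optimization step is consistent across process counts, and batch-norm statistics are synchronized across workers. The model and data are toys.

```python
import torch
import torch.distributed as dist
import torch.nn as nn

# Launch with: torchrun --nproc_per_node=4 this_script.py
dist.init_process_group("nccl")                  # one process per GPU
rank = dist.get_rank()
torch.cuda.set_device(rank)

model = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.BatchNorm2d(8),
                      nn.ReLU(), nn.Flatten(), nn.Linear(8 * 32 * 32, 10))
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)   # (b) synced BN stats
model = nn.parallel.DistributedDataParallel(model.cuda(rank),
                                            device_ids=[rank])

opt = torch.optim.SGD(model.parameters(), lr=0.01)
x = torch.randn(16, 1, 32, 32, device=f"cuda:{rank}")
y = torch.randint(0, 10, (16,), device=f"cuda:{rank}")
for _ in range(10):
    # DDP all-reduces and averages gradients, so the update behaves as if
    # computed over the global batch regardless of the process count (a).
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
dist.destroy_process_group()
```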
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
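In the uniform-marginal case, optimal transport alignment reduces to a one-to-one neuron assignment. The sketch below is a single-layer toy of that special case, not the paper's full algorithm: neurons of one model are matched to the other's with the Hungarian method before averaging, so two models that differ only by a neuron permutation fuse exactly, where naive weight averaging would not.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def fuse_layers(W_a, W_b):
    """Align the neurons of W_b to those of W_a with an optimal one-to-one
    assignment (the uniform-marginal special case of optimal transport),
    then average the aligned weight rows."""
    cost = -W_a @ W_b.T                       # similarity -> assignment cost
    rows, cols = linear_sum_assignment(cost)  # Hungarian algorithm
    W_b_aligned = W_b[cols]                   # permute B's neurons to match A
    return 0.5 * (W_a + W_b_aligned)

# Toy check: two 'models' that are neuron permutations of each other fuse
# exactly back to the common weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
perm = rng.permutation(8)
fused = fuse_layers(W, W[perm])
assert np.allclose(fused, W)
```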
This list is automatically generated from the titles and abstracts of the papers in this site.