In Situ Framework for Coupling Simulation and Machine Learning with
Application to CFD
- URL: http://arxiv.org/abs/2306.12900v1
- Date: Thu, 22 Jun 2023 14:07:54 GMT
- Title: In Situ Framework for Coupling Simulation and Machine Learning with
Application to CFD
- Authors: Riccardo Balin and Filippo Simini and Cooper Simpson and Andrew Shao
and Alessandro Rigazzi and Matthew Ellis and Stephen Becker and Alireza
Doostan and John A. Evans and Kenneth E. Jansen
- Abstract summary: Recent years have seen many successful applications of machine learning (ML) to facilitate fluid dynamic computations.
As simulations grow, generating new training datasets for traditional offline learning creates I/O and storage bottlenecks, and performing inference at runtime requires coupling ML framework libraries with simulation codes.
This work addresses both limitations by simplifying this coupling and enabling in situ training and inference on heterogeneous clusters.
- Score: 51.04126395480625
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent years have seen many successful applications of machine learning (ML)
to facilitate fluid dynamic computations. As simulations grow, generating new
training datasets for traditional offline learning creates I/O and storage
bottlenecks. Additionally, performing inference at runtime requires non-trivial
coupling of ML framework libraries with simulation codes. This work offers a
solution to both limitations by simplifying this coupling and enabling in situ
training and inference workflows on heterogeneous clusters. Leveraging
SmartSim, the presented framework deploys a database to store data and ML
models in memory, thus circumventing the file system. On the Polaris
supercomputer, we demonstrate perfect scaling efficiency of the data transfer
and inference costs up to the full machine size, thanks to a novel co-located
deployment of the database. Moreover, we train an autoencoder in situ from a
turbulent flow simulation, showing that the framework overhead is negligible
relative to a solver time step and training epoch.
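The core mechanism described above — staging tensors in an in-memory database rather than on the file system — can be sketched generically. The `InMemoryStore` below is a hypothetical stand-in for SmartSim's Redis-backed database; the `put_tensor`/`get_tensor` names echo the SmartRedis client, but everything here is an illustrative assumption, not the framework's actual API.

```python
# Minimal sketch of in situ data staging: the simulation writes tensors to an
# in-memory store and the ML side reads them back, bypassing the file system.
# `InMemoryStore` is a hypothetical stand-in for SmartSim's Redis database.

class InMemoryStore:
    def __init__(self):
        self._data = {}

    def put_tensor(self, key, tensor):
        self._data[key] = tensor  # stays in memory, no disk I/O

    def get_tensor(self, key):
        return self._data[key]

def solver_step(store, step):
    # The simulation produces a flow snapshot each time step.
    snapshot = [float(step + i) for i in range(4)]  # placeholder field data
    store.put_tensor(f"snapshot_{step}", snapshot)

def training_step(store, step):
    # The trainer consumes the freshest snapshot directly from memory.
    batch = store.get_tensor(f"snapshot_{step}")
    return sum(batch) / len(batch)  # placeholder "training" computation

store = InMemoryStore()
for step in range(3):
    solver_step(store, step)
    training_step(store, step)
```

A co-located deployment would place one such store on every compute node, so the solver and trainer exchange data without crossing the interconnect.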
Related papers
- Combining Machine Learning with Computational Fluid Dynamics using OpenFOAM and SmartSim [39.58317527488534]
We provide an effective and scalable solution to developing CFD+ML algorithms using OpenFOAM and SmartSim.
SmartSim provides an Orchestrator that significantly simplifies the programming of CFD+ML algorithms and a Redis database.
We show how to leverage SmartSim to effectively couple different segments of OpenFOAM with ML, including pre/post-processing applications, solvers, function objects, and mesh motion solvers.
arXiv Detail & Related papers (2024-02-25T20:39:44Z)
- Continual learning autoencoder training for a particle-in-cell simulation via streaming [52.77024349608834]
The upcoming exascale era will provide a new generation of high-resolution physics simulations.
This resolution will impact the training of machine learning models, since storing such a large amount of simulation data on disk is nearly impossible.
This work presents an approach that trains a neural network concurrently to a running simulation without data on a disk.
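A minimal sketch of the streaming idea, with a toy one-parameter model standing in for the autoencoder: each sample produced by the running simulation is used for one gradient update and then discarded, so no data ever reaches the disk. All names and values here are illustrative assumptions, not the paper's setup.

```python
# Sketch of streaming training: each incoming sample updates the model once
# and is then discarded, so no simulation data ever touches the disk.
# The one-parameter model y = w * x is a toy stand-in for an autoencoder.

def stream_samples():
    # Stand-in for a running simulation emitting (x, y) pairs with y = 2x.
    for step in range(200):
        x = (step % 10) + 1.0
        yield x, 2.0 * x

w = 0.0       # model parameter
lr = 0.002    # learning rate
for x, y in stream_samples():
    grad = 2.0 * (w * x - y) * x  # d/dw of the squared error (w*x - y)^2
    w -= lr * grad                # one update, then the sample is dropped
```

With this toy stream the parameter converges to the true slope of 2 even though no sample is ever revisited.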
arXiv Detail & Related papers (2022-11-09T09:55:14Z)
- Simulation-Based Parallel Training [55.41644538483948]
We present our ongoing work to design a training framework that alleviates those I/O and storage bottlenecks.
It generates data in parallel with the training process, which introduces a bias in the training data.
We present a strategy to mitigate this bias with a memory buffer.
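The memory-buffer strategy can be sketched as follows: samples generated in parallel with training arrive in a temporally correlated order, so a bounded buffer retains recent samples and mini-batches are drawn from it at random. This is a generic replay-buffer sketch under that assumption, not the paper's implementation.

```python
import random

# Sketch of a memory (replay) buffer: freshly generated samples are
# temporally correlated, so training draws shuffled mini-batches from a
# bounded buffer of recent samples instead of consuming them in order.

class ReplayBuffer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.samples = []

    def add(self, sample):
        self.samples.append(sample)
        if len(self.samples) > self.capacity:
            self.samples.pop(0)  # drop the oldest sample

    def sample_batch(self, batch_size):
        return random.sample(self.samples, min(batch_size, len(self.samples)))

buffer = ReplayBuffer(capacity=100)
for step in range(250):          # stand-in for data produced by a simulation
    buffer.add(step)
batch = buffer.sample_batch(8)   # a shuffled mini-batch of recent samples
```

The bounded capacity keeps memory use constant while the random draws break the temporal correlation of the incoming stream.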
arXiv Detail & Related papers (2022-11-08T09:31:25Z)
- Multi-Edge Server-Assisted Dynamic Federated Learning with an Optimized Floating Aggregation Point [51.47520726446029]
Cooperative edge learning (CE-FL) is a distributed machine learning architecture.
We model the processes taken during CE-FL and analyze the training analytically.
We show the effectiveness of our framework with data collected from a real-world testbed.
arXiv Detail & Related papers (2022-03-26T00:41:57Z)
- Asynchronous Parallel Incremental Block-Coordinate Descent for Decentralized Machine Learning [55.198301429316125]
Machine learning (ML) is a key technique for big-data-driven modelling and analysis of massive Internet of Things (IoT) based intelligent and ubiquitous computing.
As applications and data volumes increase rapidly, distributed learning is a promising emerging paradigm, since it is often impractical or inefficient to share or aggregate data.
This paper studies the problem of training an ML model over decentralized systems, where data are distributed over many user devices.
arXiv Detail & Related papers (2022-02-07T15:04:15Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Basic cross-platform tensor frameworks and script language engines alone do not supply the procedures and pipelines needed to actually deploy machine learning capabilities in real production-grade systems.
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using such basic cross-platform tensor frameworks and script language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Federated Learning for Hybrid Beamforming in mm-Wave Massive MIMO [12.487990897680422]
We introduce a federated learning (FL) based framework for hybrid beamforming, where the model training is performed at the base station.
We design a convolutional neural network, in which the input is the channel data, yielding the analog beamformers at the output.
FL is demonstrated to be more tolerant to imperfections and corruptions in the channel data, and to have less transmission overhead than CML.
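The FL training loop described above can be sketched with a toy federated-averaging round: clients compute local updates on private data and the base station averages only the parameters, never the raw channel data. The `local_update` rule and all values here are hypothetical placeholders, not the paper's CNN.

```python
# Sketch of federated averaging, the aggregation step behind FL training:
# each client computes a local model update and only the parameters (not
# the private data) are sent to the base station, which averages them.

def local_update(weights, local_data, lr=0.1):
    # Toy local step: nudge each weight toward the client's data mean.
    mean = sum(local_data) / len(local_data)
    return [w - lr * (w - mean) for w in weights]

def federated_average(client_weights):
    # Element-wise mean of the clients' parameter vectors.
    n = len(client_weights)
    return [sum(ws) / n for ws in zip(*client_weights)]

global_weights = [0.0, 0.0]
client_data = [[1.0, 3.0], [5.0, 7.0]]   # two clients, private data
for _ in range(3):                        # three communication rounds
    updates = [local_update(global_weights, d) for d in client_data]
    global_weights = federated_average(updates)
```

Only the parameter vectors cross the network in each round, which is the source of the reduced transmission overhead relative to centralized ML.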
arXiv Detail & Related papers (2020-05-20T11:21:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.