Colmena: Scalable Machine-Learning-Based Steering of Ensemble
Simulations for High Performance Computing
- URL: http://arxiv.org/abs/2110.02827v1
- Date: Wed, 6 Oct 2021 14:56:53 GMT
- Authors: Logan Ward, Ganesh Sivaraman, J. Gregory Pauloski, Yadu Babuji, Ryan
Chard, Naveen Dandu, Paul C. Redfern, Rajeev S. Assary, Kyle Chard, Larry A.
Curtiss, Rajeev Thakur, Ian Foster
- Abstract summary: We present Colmena, an open-source Python framework that allows users to steer campaigns by providing just the implementations of individual tasks plus the logic used to choose which tasks to execute when.
Colmena handles task dispatch, results collation, ML model invocation, and ML model (re)training, using Parsl to execute tasks on HPC systems.
We describe the design of Colmena and illustrate its capabilities by applying it to electrolyte design, where it both scales to 65536 CPUs and accelerates the discovery rate for high-performance molecules by a factor of 100 over unguided searches.
- Score: 3.5604179670745237
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Scientific applications that involve simulation ensembles can be accelerated
greatly by using experiment design methods to select the best simulations to
perform. Methods that use machine learning (ML) to create proxy models of
simulations show particular promise for guiding ensembles but are challenging
to deploy because of the need to coordinate dynamic mixes of simulation and
learning tasks. We present Colmena, an open-source Python framework that allows
users to steer campaigns by providing just the implementations of individual
tasks plus the logic used to choose which tasks to execute when. Colmena
handles task dispatch, results collation, ML model invocation, and ML model
(re)training, using Parsl to execute tasks on HPC systems. We describe the
design of Colmena and illustrate its capabilities by applying it to electrolyte
design, where it both scales to 65536 CPUs and accelerates the discovery rate
for high-performance molecules by a factor of 100 over unguided searches.
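
The steering pattern described in the abstract (dispatch tasks, collate results, consult a model, choose the next tasks) can be illustrated with a minimal sketch. This is not Colmena's API and does not use Parsl: it relies only on the Python standard library, and the names `simulate`, `predict`, and `steer`, the toy objective, and the nearest-neighbor surrogate are all assumptions made for illustration.

```python
import random
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def simulate(x):
    """Stand-in for an expensive simulation (hypothetical objective, peak at x = 0.7)."""
    return -(x - 0.7) ** 2


def predict(x, observed):
    """Toy surrogate model: reuse the score of the nearest already-simulated point."""
    nearest = min(observed, key=lambda known: abs(known - x))
    return observed[nearest]


def steer(candidates, budget, workers=4, seed=0):
    """Run at most `budget` simulations, using the surrogate to pick each new task."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)
    observed = {}  # collated results: candidate -> simulated score
    running = {}   # in-flight tasks: future -> candidate
    submitted = 0
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # Seed the campaign with randomly chosen candidates (no model yet).
        while pool and submitted < min(workers, budget):
            x = pool.pop()
            running[executor.submit(simulate, x)] = x
            submitted += 1
        while running:
            # Results collation: wait until at least one task finishes.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                x = running.pop(future)
                observed[x] = future.result()
            # Steering: fill free worker slots with the surrogate's top picks.
            # The surrogate is "retrained" implicitly because it reads `observed`.
            while pool and submitted < budget and len(running) < workers:
                best = max(pool, key=lambda c: predict(c, observed))
                pool.remove(best)
                running[executor.submit(simulate, best)] = best
                submitted += 1
    return observed


if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]
    results = steer(grid, budget=10)
    top = max(results, key=results.get)
    print(f"simulated {len(results)} of {len(grid)} candidates; best x = {top}")
```

The selection rule here is purely greedy; a real campaign would more likely use an acquisition function that balances exploration and exploitation, and would run simulation and training tasks on separate resources.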
Related papers
- Simulation Streams: A Programming Paradigm for Controlling Large Language Models and Building Complex Systems with Generative AI [3.3126968968429407]
Simulation Streams is a programming paradigm designed to efficiently control and leverage Large Language Models (LLMs).
Our primary goal is to create a framework that harnesses the agentic abilities of LLMs while addressing their limitations in maintaining consistency.
arXiv Detail & Related papers (2025-01-30T16:38:03Z)
- LLM Agent for Fire Dynamics Simulations [3.0031348283981987]
FoamPilot is a proof-of-concept agent designed to enhance the usability of FireFOAM.
FireFOAM is a solver for fire dynamics and fire suppression simulations built using OpenFOAM.
FoamPilot provides three core functionalities: code insight, case configuration and simulation evaluation.
arXiv Detail & Related papers (2024-12-22T20:03:35Z)
- DrEureka: Language Model Guided Sim-To-Real Transfer [64.14314476811806]
Transferring policies learned in simulation to the real world is a promising strategy for acquiring robot skills at scale.
In this paper, we investigate using Large Language Models (LLMs) to automate and accelerate sim-to-real design.
Our approach is capable of solving novel robot tasks, such as quadruped balancing and walking atop a yoga ball.
arXiv Detail & Related papers (2024-06-04T04:53:05Z)
- MLatom 3: Platform for machine learning-enhanced computational chemistry simulations and workflows [12.337972297411003]
Machine learning (ML) is increasingly becoming a common tool in computational chemistry.
MLatom 3 is a program package designed to leverage the power of ML to enhance typical computational chemistry simulations.
The users can choose from an extensive library of methods containing pre-trained ML models and quantum mechanical approximations.
arXiv Detail & Related papers (2023-10-31T03:41:39Z)
- Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [76.93956925360638]
Waymax is a new data-driven simulator for autonomous driving in multi-agent scenes.
It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training.
We benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions.
arXiv Detail & Related papers (2023-10-12T20:49:15Z)
- In Situ Framework for Coupling Simulation and Machine Learning with Application to CFD [51.04126395480625]
Recent years have seen many successful applications of machine learning (ML) to facilitate fluid dynamic computations.
As simulations grow, generating new training datasets for traditional offline learning creates I/O and storage bottlenecks.
This work offers a solution by simplifying this coupling and enabling in situ training and inference on heterogeneous clusters.
arXiv Detail & Related papers (2023-06-22T14:07:54Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Learning Multi-Objective Curricula for Deep Reinforcement Learning [55.27879754113767]
Various automatic curriculum learning (ACL) methods have been proposed to improve the sample efficiency and final performance of deep reinforcement learning (DRL).
In this paper, we propose a unified automatic curriculum learning framework to create multi-objective but coherent curricula.
In addition to existing hand-designed curricula paradigms, we further design a flexible memory mechanism to learn an abstract curriculum.
arXiv Detail & Related papers (2021-10-06T19:30:25Z)
- Achieving 100X faster simulations of complex biological phenomena by coupling ML to HPC ensembles [47.44377051031385]
We present DeepDriveMD, a tool for a range of prototypical ML-driven HPC simulation scenarios.
We use it to quantify improvements in the scientific performance of ML-driven ensemble-based applications.
arXiv Detail & Related papers (2021-04-10T15:52:39Z)
- Integrating Machine Learning with HPC-driven Simulations for Enhanced Student Learning [0.0]
We develop a web application that supports both HPC-driven simulation and the ML surrogate methods to produce simulation outputs.
The evaluation of the tool via in-classroom student feedback and surveys shows that the ML-enhanced tool provides a dynamic and responsive simulation environment.
arXiv Detail & Related papers (2020-08-24T22:48:21Z)