MOFA: Discovering Materials for Carbon Capture with a GenAI- and Simulation-Based Workflow
- URL: http://arxiv.org/abs/2501.10651v1
- Date: Sat, 18 Jan 2025 04:10:44 GMT
- Title: MOFA: Discovering Materials for Carbon Capture with a GenAI- and Simulation-Based Workflow
- Authors: Xiaoli Yan, Nathaniel Hudson, Hyun Park, Daniel Grzenda, J. Gregory Pauloski, Marcus Schwarting, Haochen Pan, Hassan Harb, Samuel Foreman, Chris Knight, Tom Gibbs, Kyle Chard, Santanu Chaudhuri, Emad Tajkhorshid, Ian Foster, Mohamad Moosavi, Logan Ward, E. A. Huerta
- Abstract summary: MOFA is an open-source generative AI (GenAI) plus simulation workflow for high-throughput generation of metal-organic frameworks (MOFs).
MOFA addresses key challenges in integrating GPU-accelerated computing for GenAI tasks, including distributed training and inference, alongside CPU- and GPU-optimized tasks for screening and filtering AI-generated MOFs.
- Score: 5.310696264367485
- Abstract: We present MOFA, an open-source generative AI (GenAI) plus simulation workflow for high-throughput generation of metal-organic frameworks (MOFs) on large-scale high-performance computing (HPC) systems. MOFA addresses key challenges in integrating GPU-accelerated computing for GPU-intensive GenAI tasks, including distributed training and inference, alongside CPU- and GPU-optimized tasks for screening and filtering AI-generated MOFs using molecular dynamics, density functional theory, and Monte Carlo simulations. These heterogeneous tasks are unified within an online learning framework that optimizes the utilization of available CPU and GPU resources across HPC systems. Performance metrics from a 450-node (14,400 AMD Zen 3 CPUs + 1800 NVIDIA A100 GPUs) supercomputer run demonstrate that MOFA achieves high-throughput generation of novel MOF structures, with CO$_2$ adsorption capacities ranking among the top 10 in the hypothetical MOF (hMOF) dataset. Furthermore, the production of high-quality MOFs exhibits a linear relationship with the number of nodes utilized. The modular architecture of MOFA will facilitate its integration into other scientific applications that dynamically combine GenAI with large-scale simulations.
Related papers
- GauSim: Registering Elastic Objects into Digital World by Gaussian Simulator [55.02281855589641]
GauSim is a novel neural network-based simulator designed to capture the dynamic behaviors of real-world elastic objects represented through Gaussian kernels.
We leverage continuum mechanics, modeling each kernel as a continuous piece of matter to account for realistic deformations without idealized assumptions.
GauSim incorporates explicit physics constraints, such as mass and momentum conservation, ensuring interpretable results and robust, physically plausible simulations.
arXiv Detail & Related papers (2024-12-23T18:58:17Z)
- BioNeMo Framework: a modular, high-performance library for AI model development in drug discovery [66.97700597098215]
We introduce the BioNeMo Framework to facilitate the training of computational biology and chemistry AI models.
On 256 NVIDIA A100s, BioNeMo Framework trains a three billion parameter BERT-based pLM on over one trillion tokens in 4.2 days.
The BioNeMo Framework is open-source and free for everyone to use.
arXiv Detail & Related papers (2024-11-15T19:46:16Z)
- EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference [49.94169109038806]
This paper introduces EPS-MoE, a novel expert pipeline scheduler for MoE that surpasses the existing parallelism schemes.
Our results demonstrate at most 52.4% improvement in prefill throughput compared to existing parallel inference methods.
arXiv Detail & Related papers (2024-10-16T05:17:49Z)
- Neural Architecture Search of Hybrid Models for NPU-CIM Heterogeneous AR/VR Devices [10.75997684204274]
We introduce H4H-NAS, a Neural Architecture Search framework to design efficient hybrid CNN/ViT models for heterogeneous edge systems.
Results from our Algo/HW co-design reveal up to 56.08% overall latency and 41.72% energy improvements.
arXiv Detail & Related papers (2024-10-10T19:30:34Z)
- FeNNol: an Efficient and Flexible Library for Building Force-field-enhanced Neural Network Potentials [0.0]
We present FeNNol, a new library for building, training and running force-field-enhanced neural network potentials.
It provides a flexible and modular system for building hybrid models.
It is demonstrated with the popular ANI-2x model reaching simulation speeds nearly on par with the AMOEBA polarizable force-field.
arXiv Detail & Related papers (2024-05-02T17:25:32Z)
- A Heterogeneous Parallel Non-von Neumann Architecture System for Accurate and Efficient Machine Learning Molecular Dynamics [9.329011150399726]
This paper proposes a special-purpose system to achieve high-accuracy and high-efficiency machine learning (ML) calculations.
The system consists of field programmable gate array (FPGA) and application specific integrated circuit (ASIC) working in heterogeneous parallelization.
arXiv Detail & Related papers (2023-03-26T05:43:49Z)
- Multi-fidelity Hierarchical Neural Processes [79.0284780825048]
Multi-fidelity surrogate modeling reduces the computational cost by fusing different simulation outputs.
We propose Multi-fidelity Hierarchical Neural Processes (MF-HNP), a unified neural latent variable model for multi-fidelity surrogate modeling.
We evaluate MF-HNP on epidemiology and climate modeling tasks, achieving competitive performance in terms of accuracy and uncertainty estimation.
arXiv Detail & Related papers (2022-06-10T04:54:13Z)
- NNP/MM: Accelerating molecular dynamics simulations with machine learning potentials and molecular mechanics [38.50309739333058]
We introduce an optimized implementation of the hybrid method (NNP/MM), which combines neural network potentials (NNP) and molecular mechanics (MM).
This approach models a portion of the system, such as a small molecule, using NNP while employing MM for the remaining system to boost efficiency.
It has enabled us to increase the simulation speed by 5 times and achieve a combined sampling of one microsecond for each complex, marking the longest simulations ever reported for this class of simulation.
arXiv Detail & Related papers (2022-01-20T10:57:20Z)
- Using Machine Learning at Scale in HPC Simulations with SmartSim: An Application to Ocean Climate Modeling [52.77024349608834]
We demonstrate the first climate-scale, numerical ocean simulations improved through distributed, online inference of Deep Neural Networks (DNN) using SmartSim.
SmartSim is a library dedicated to enabling online analysis and Machine Learning (ML) for traditional HPC simulations.
arXiv Detail & Related papers (2021-04-13T19:27:28Z)
- Achieving 100X faster simulations of complex biological phenomena by coupling ML to HPC ensembles [47.44377051031385]
We present DeepDriveMD, a tool for a range of prototypical ML-driven HPC simulation scenarios.
We use it to quantify improvements in the scientific performance of ML-driven ensemble-based applications.
arXiv Detail & Related papers (2021-04-10T15:52:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.