pyRMG: A Python Framework for High-Throughput, Large-Cell Real-Space MultiGrid DFT Calculations
- URL: http://arxiv.org/abs/2509.16775v1
- Date: Sat, 20 Sep 2025 18:46:53 GMT
- Title: pyRMG: A Python Framework for High-Throughput, Large-Cell Real-Space MultiGrid DFT Calculations
- Authors: R. J. Morelock, S. Bagchi, E. L. Briggs, W. Lu, J. Bernholc, P. Ganesh
- Abstract summary: pyRMG is a Python package designed to streamline the setup and execution of density functional theory (DFT) calculations. We show that pyRMG can initialize and converge challenging RMG-based workflows with limited user intervention.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational materials science has evolved toward materials informatics, where large datasets of complex, multispecies compounds are generated and evaluated using density functional theory (DFT). Materials genome projects mine these datasets for candidates with breakthrough properties, but existing databases remain limited to compounds with relatively small unit cells due to computational cost. Exascale computers now provide the power to simulate larger and more chemically realistic systems, but fully realizing this potential requires DFT codes that can efficiently scale to thousands of processors. Our real-space multigrid (RMG) DFT code's grid-decomposition approach scales nearly linearly with the number of GPUs, even for simulations exceeding thousands of atoms. This scalability makes RMG a compelling tool for high-throughput DFT studies of materials that would otherwise be bottlenecked in other codes (for example, by global Fast Fourier Transforms in plane-wave DFT). In this work, we present pyRMG, a Python package designed to streamline the setup and execution of RMG DFT calculations. Built on the pymatgen and ASE Python packages, pyRMG automates input generation and convergence checking, and integrates with modern job schedulers (e.g., Flux) on leadership-class platforms such as Frontier and Perlmutter. We demonstrate pyRMG for a high-throughput study of interfacial strain and twist-angle effects in lattice-matched, 2D Bi$_2$Se$_3$/NbSe$_2$ heterostructures, which form large, anisotropic supercells. Our results link strain and twist angle to material informatics properties, including stability and band gap, and show that pyRMG can initialize and converge challenging RMG-based workflows with limited user intervention.
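The abstract notes that pyRMG automates input generation and convergence checking for RMG runs. As a purely illustrative sketch (not pyRMG's actual API; the `RMS[dV]` log token and log layout are assumptions, since neither appears in the abstract), the convergence-checking step of such a driver might parse an RMG-style log for the final SCF residual and decide whether the job needs to be resubmitted:

```python
# Hypothetical sketch of the convergence check a high-throughput driver
# like pyRMG might perform. The "RMS[dV]" token and log format are
# assumed for illustration only.
import re

def scf_converged(log_text: str, tol: float = 1e-7) -> bool:
    """Return True if the last reported SCF residual is below tol."""
    # Collect every residual reported during the SCF cycle.
    residuals = [float(m) for m in
                 re.findall(r"RMS\[dV\]\s*=\s*([\d.eE+-]+)", log_text)]
    # Converged only if at least one residual exists and the last one
    # has dropped below the tolerance.
    return bool(residuals) and residuals[-1] < tol

sample_log = """
step 1  RMS[dV] = 3.2e-03
step 2  RMS[dV] = 8.1e-05
step 3  RMS[dV] = 4.5e-08
"""
print(scf_converged(sample_log))  # True
```

A workflow manager would call a check like this after each scheduler job completes, resubmitting (possibly with adjusted mixing parameters) when it returns `False`.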
Related papers
- AGAPI-Agents: An Open-Access Agentic AI Platform for Accelerated Materials Design on AtomGPT.org [0.8093011368737527]
AGAPI (AtomGPT.org API) is an open-access agentic AI platform that integrates more than eight open-source tools with over twenty materials-science API endpoints. We demonstrate AGAPI through end-to-end workflows, including heterostructure construction, powder X-ray diffraction analysis, and semiconductor defect engineering. With more than 1,000 active users, AGAPI provides a scalable and transparent foundation for reproducible, AI-accelerated materials discovery.
arXiv Detail & Related papers (2025-12-12T06:28:28Z)
- Algorithms and Scientific Software for Quasi-Monte Carlo, Fast Gaussian Process Regression, and Scientific Machine Learning [0.0]
This thesis unifies our developments in three broad domains: Quasi-Monte Carlo (QMC) methods for efficient high-dimensional integration, Gaussian process (GP) regression for high-dimensional domains with built-in uncertainty, and scientific machine learning (sciML) for modeling partial differential equations (PDEs) with mesh-free solvers.
arXiv Detail & Related papers (2025-11-26T21:11:08Z)
- Facet: highly efficient E(3)-equivariant networks for interatomic potentials [6.741915610607818]
Computational materials discovery is limited by the high cost of first-principles calculations. Machine learning potentials that predict energies from crystal structures are promising, but existing methods face computational bottlenecks. We present Facet, a GNN architecture for efficient ML potentials.
arXiv Detail & Related papers (2025-09-10T09:06:24Z)
- pyFAST: A Modular PyTorch Framework for Time Series Modeling with Multi-source and Sparse Data [10.949140998070732]
pyFAST is a research-oriented PyTorch framework for time series analysis. Its data engine is engineered for complex scenarios, supporting multi-source loading, protein sequence handling, efficient sequence- and patch-level padding, dynamic normalization, and mask-based modeling. Released under the MIT license on GitHub, pyFAST provides a compact yet powerful platform for advancing time series research and applications.
arXiv Detail & Related papers (2025-08-26T10:05:47Z)
- Transolver++: An Accurate Neural Solver for PDEs on Million-Scale Geometries [67.63077028746191]
Transolver++ is a highly parallel and efficient neural solver that can solve PDEs on million-scale geometries. Transolver++ increases the single-GPU input capacity to million-scale points for the first time. It achieves over 20% performance gain in million-scale high-fidelity industrial simulations.
arXiv Detail & Related papers (2025-02-04T15:33:50Z) - Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? [73.81908518992161]
We introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering.
Spider2-V features real-world tasks in authentic computer environments and incorporates 20 enterprise-level professional applications.
These tasks evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems.
arXiv Detail & Related papers (2024-07-15T17:54:37Z) - Generating QM1B with PySCF$_{\text{IPU}}$ [40.29005019051567]
This paper introduces the data generator PySCF$_{\text{IPU}}$, which uses Intelligence Processing Units (IPUs).
It allows us to create the dataset QM1B with one billion training examples containing 9-11 heavy atoms.
We highlight several limitations of QM1B and emphasise the low-resolution of our DFT options, which also serves as motivation for even larger, more accurate datasets.
arXiv Detail & Related papers (2023-11-02T10:31:20Z) - MatFormer: Nested Transformer for Elastic Inference [91.45687988953435]
MatFormer is a novel Transformer architecture designed to provide elastic inference across diverse deployment constraints. MatFormer achieves this by incorporating a nested Feed Forward Network (FFN) block structure within a standard Transformer model. We show that an 850M decoder-only MatFormer language model (MatLM) allows us to extract multiple smaller models spanning from 582M to 850M parameters.
arXiv Detail & Related papers (2023-10-11T17:57:14Z)
- MB-TaylorFormer: Multi-branch Efficient Transformer Expanded by Taylor Formula for Image Dehazing [88.61523825903998]
Transformer networks are beginning to replace pure convolutional neural networks (CNNs) in the field of computer vision.
We propose a new Transformer variant, which applies the Taylor expansion to approximate the softmax-attention and achieves linear computational complexity.
We introduce a multi-branch architecture with multi-scale patch embedding to the proposed Transformer, which embeds features by overlapping deformable convolution of different scales.
Our model, named Multi-branch Transformer expanded by Taylor formula (MB-TaylorFormer), can embed coarse-to-fine features more flexibly at the patch embedding stage and capture long-distance pixel interactions with limited computational cost.
arXiv Detail & Related papers (2023-08-27T08:10:23Z) - ClimSim-Online: A Large Multi-scale Dataset and Framework for Hybrid ML-physics Climate Emulation [45.201929285600606]
We present ClimSim-Online, which includes an end-to-end workflow for developing hybrid ML-physics simulators.
The dataset is global and spans ten years at a high sampling frequency.
We provide a cross-platform, containerized pipeline to integrate ML models into operational climate simulators.
arXiv Detail & Related papers (2023-06-14T21:26:31Z) - Scalable training of graph convolutional neural networks for fast and
accurate predictions of HOMO-LUMO gap in molecules [1.8947048356389908]
This work focuses on building GCNN models on HPC systems to predict material properties of millions of molecules.
We use HydraGNN, our in-house library for large-scale GCNN training, leveraging distributed data parallelism in PyTorch.
We perform parallel training on two open-source large-scale graph datasets to build a GCNN predictor for an important quantum property known as the HOMO-LUMO gap.
arXiv Detail & Related papers (2022-07-22T20:54:22Z) - Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge
Graph Completion [112.27103169303184]
Multimodal Knowledge Graphs (MKGs) organize visual-text factual knowledge.
MKGformer can obtain SOTA performance on four datasets of multimodal link prediction, multimodal RE, and multimodal NER.
arXiv Detail & Related papers (2022-05-04T23:40:04Z) - DFTpy: An efficient and object-oriented platform for orbital-free DFT
simulations [55.41644538483948]
In this work, we present DFTpy, an open source software implementing OFDFT written entirely in Python 3.
We showcase the electronic structure of a million-atom system of aluminum metal which was computed on a single CPU.
DFTpy is released under the MIT license.
arXiv Detail & Related papers (2020-02-07T19:07:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.