A distributed multi-GPU ab initio density matrix renormalization group
algorithm with applications to the P-cluster of nitrogenase
- URL: http://arxiv.org/abs/2311.02854v2
- Date: Thu, 21 Dec 2023 12:03:30 GMT
- Title: A distributed multi-GPU ab initio density matrix renormalization group
algorithm with applications to the P-cluster of nitrogenase
- Authors: Chunyang Xiang, Weile Jia, Wei-Hai Fang, Zhendong Li
- Abstract summary: We present the first distributed multi-GPU (Graphics Processing Unit) ab initio density matrix renormalization group (DMRG) algorithm.
We are able to reach an unprecedentedly large bond dimension $D=14000$ on 48 GPUs.
This is nearly three times larger than the bond dimensions reported in previous DMRG calculations for the same system using only CPUs.
- Score: 1.7444066202370399
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The presence of many degenerate $d/f$ orbitals makes polynuclear transition
metal compounds such as iron-sulfur clusters in nitrogenase challenging for
state-of-the-art quantum chemistry methods. To address this challenge, we
present the first distributed multi-GPU (Graphics Processing Unit) \emph{ab
initio} density matrix renormalization group (DMRG) algorithm, suitable for modern
high-performance computing (HPC) infrastructures. The central idea is to
parallelize the most computationally intensive part - the multiplication of
$O(K^2)$ operators with a trial wavefunction, where $K$ is the number of
spatial orbitals, by combining operator parallelism for distributing the
workload with a batched algorithm for performing contractions on GPU. With this
new implementation, we are able to reach an unprecedentedly large bond
dimension $D=14000$ on 48 GPUs (NVIDIA A100 80 GB SXM) for an active space
model (114 electrons in 73 active orbitals) of the P-cluster, which is nearly
three times larger than the bond dimensions reported in previous DMRG
calculations for the same system using only CPUs.
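The central idea in the abstract (splitting the operator terms across workers and evaluating each worker's share as one batched contraction) can be illustrated with a small NumPy sketch. This is a toy model under stated assumptions, not the authors' implementation: the Hamiltonian is assumed to act as a sum of terms $L_i \psi R_i^T$, the workers are simulated in a single process, the final sum stands in for an all-reduce, and all function names are hypothetical.

```python
import numpy as np


def apply_reference(L, R, psi):
    """Term-by-term loop: sigma = sum_i L[i] @ psi @ R[i].T."""
    out = np.zeros_like(psi)
    for Li, Ri in zip(L, R):
        out += Li @ psi @ Ri.T
    return out


def apply_terms_batched(L, R, psi):
    """Same sum, evaluated as two batched matrix products.

    L: (n, dl, dl), R: (n, dr, dr), psi: (dl, dr).
    """
    tmp = np.matmul(L, psi)                      # psi broadcasts over the batch axis
    out = np.matmul(tmp, np.swapaxes(R, 1, 2))   # multiply by each R[i].T
    return out.sum(axis=0)


def apply_operator_parallel(L, R, psi, n_workers):
    """Simulated operator parallelism: each worker owns a slice of the terms."""
    chunks = np.array_split(np.arange(L.shape[0]), n_workers)
    partials = [apply_terms_batched(L[idx], R[idx], psi) for idx in chunks]
    # In a distributed run this sum would be an all-reduce across workers.
    return sum(partials)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, dl, dr = 4, 6, 5            # toy sizes; K**2 operator terms
    L = rng.standard_normal((K * K, dl, dl))
    R = rng.standard_normal((K * K, dr, dr))
    psi = rng.standard_normal((dl, dr))
    sigma = apply_operator_parallel(L, R, psi, n_workers=3)
    print(np.allclose(sigma, apply_reference(L, R, psi)))
```

On a GPU, the two `np.matmul` calls would correspond to batched matrix-multiplication kernels; how the terms are actually grouped and balanced across devices is not specified in the abstract.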
Related papers
- Neutron-nucleus dynamics simulations for quantum computers [49.369935809497214]
We develop a novel quantum algorithm for neutron-nucleus simulations with general potentials.
It provides acceptable bound-state energies even in the presence of noise, through the noise-resilient training method.
We introduce a new commutativity scheme called distance-grouped commutativity (DGC) and compare its performance with the well-known qubit-commutativity scheme.
arXiv Detail & Related papers (2024-02-22T16:33:48Z)
- Two dimensional quantum lattice models via mode optimized hybrid CPU-GPU density matrix renormalization group method [0.0]
We present a hybrid numerical approach to simulate quantum many body problems on two spatial dimensional quantum lattice models.
We demonstrate for the two dimensional spinless fermion model and for the Hubbard model on torus geometry that several orders of magnitude in computational time can be saved.
arXiv Detail & Related papers (2023-11-23T17:07:47Z)
- A Quadratic Speedup in Finding Nash Equilibria of Quantum Zero-Sum Games [102.46640028830441]
We introduce the Optimistic Matrix Multiplicative Weights Update (OMMWU) algorithm and establish its average-iterate convergence complexity as $\mathcal{O}(d/\epsilon)$ to $\epsilon$-Nash equilibria.
This quadratic speed-up sets a new benchmark for computing $\epsilon$-Nash equilibria in quantum zero-sum games.
arXiv Detail & Related papers (2023-11-17T20:38:38Z)
- Boosting the effective performance of massively parallel tensor network
state algorithms on hybrid CPU-GPU based architectures via non-Abelian
symmetries [0.0]
Non-Abelian symmetry related tensor algebra based on the Wigner-Eckart theorem is fully detached from the conventional tensor network layer.
We have achieved an order of magnitude increase in performance with respect to results reported in arXiv:2305.05581 in terms of computational complexity.
Our solution has an estimated effective performance of 250-500 TFLOPS.
arXiv Detail & Related papers (2023-09-23T07:49:53Z)
- On sampling determinantal and Pfaffian point processes on a quantum
computer [49.1574468325115]
DPPs were introduced by Macchi as a model in quantum optics in the 1970s.
Most applications require sampling from a DPP, and given their quantum origin, it is natural to wonder whether sampling a DPP on a quantum computer is easier than on a classical one.
Vanilla sampling consists in two steps, of respective costs $\mathcal{O}(N^3)$ and $\mathcal{O}(Nr^2)$ operations on a classical computer, where $r$ is the rank of the kernel matrix.
arXiv Detail & Related papers (2023-05-25T08:43:11Z)
- Quantum Clustering with k-Means: a Hybrid Approach [117.4705494502186]
We design, implement, and evaluate three hybrid quantum k-Means algorithms.
We exploit quantum phenomena to speed up the computation of distances.
We show that our hybrid quantum k-Means algorithms can be more efficient than the classical version.
arXiv Detail & Related papers (2022-12-13T16:04:16Z)
- Density Matrix Renormalization Group with Tensor Processing Units [0.0]
Google's Tensor Processing Units (TPUs) are integrated circuits specifically built to accelerate and scale up machine learning workloads.
In this work we demonstrate the use of TPUs for accelerating and scaling up the density matrix renormalization group (DMRG), a powerful numerical approach to compute the ground state of a local quantum many-body Hamiltonian.
arXiv Detail & Related papers (2022-04-12T10:40:14Z)
- Realization of arbitrary doubly-controlled quantum phase gates [62.997667081978825]
We introduce a high-fidelity gate set inspired by a proposal for near-term quantum advantage in optimization problems.
By orchestrating coherent, multi-level control over three transmon qutrits, we synthesize a family of deterministic, continuous-angle quantum phase gates acting in the natural three-qubit computational basis.
arXiv Detail & Related papers (2021-08-03T17:49:09Z)
- VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose VersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator.
VersaGNN achieves on average 3712$\times$ speedup with 1301.25$\times$ energy reduction on CPU, and 35.4$\times$ speedup with 17.66$\times$ energy reduction on GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z)
- Quantum Spectral Clustering [5.414308305392762]
Spectral clustering is a powerful machine learning algorithm for clustering data with non-convex or nested structures.
We propose an end-to-end quantum algorithm for spectral clustering, extending a number of works in quantum machine learning.
arXiv Detail & Related papers (2020-07-01T07:11:42Z)
- Massively parallel quantum chemical density matrix renormalization group
method [0.0]
We present, to the best of our knowledge, the first attempt to exploit the supercomputer platform for quantum chemical density matrix renormalization group (QC-DMRG) calculations.
We have developed the parallel scheme based on the in-house MPI global memory library, which combines operator and symmetry sector parallelisms.
In the largest calculation, the nitrogenase FeMo cofactor cluster with an active space comprising 113 electrons in 76 orbitals and a bond dimension of 6000, our parallel approach scales up to approximately 2000 CPU cores.
arXiv Detail & Related papers (2020-01-14T16:51:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.