Density Matrix Renormalization Group with Tensor Processing Units
- URL: http://arxiv.org/abs/2204.05693v1
- Date: Tue, 12 Apr 2022 10:40:14 GMT
- Title: Density Matrix Renormalization Group with Tensor Processing Units
- Authors: Martin Ganahl, Jackson Beall, Markus Hauru, Adam G. M. Lewis, Jae
Hyeon Yoo, Yijian Zou, Guifre Vidal
- Abstract summary: Google's Tensor Processing Units (TPUs) are integrated circuits specifically built to accelerate and scale up machine learning workloads.
In this work we demonstrate the use of TPUs for accelerating and scaling up the density matrix renormalization group (DMRG), a powerful numerical approach to compute the ground state of a local quantum many-body Hamiltonian.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Google's Tensor Processing Units (TPUs) are integrated circuits specifically
built to accelerate and scale up machine learning workloads. They can perform
fast distributed matrix multiplications and therefore be repurposed for other
computationally intensive tasks. In this work we demonstrate the use of TPUs
for accelerating and scaling up the density matrix renormalization group
(DMRG), a powerful numerical approach to compute the ground state of a local
quantum many-body Hamiltonian. The cost of DMRG scales with system size $N$ as
$O(ND^3)$, where the so-called bond dimension $D$ regulates how expressive the
underlying matrix product state (MPS) variational ansatz is. We consider
lattice models in two spatial dimensions, with square lattices of size
$10\times 10$ (free fermions) and $20\times 20$ (transverse field Ising model),
for which the required MPS bond dimension is known to scale at least as
$\exp(\sqrt{N})$. Using half of a TPU v3 pod (namely $1,\!024$ TPU v3 cores) we
reached an unprecedentedly large bond dimension $D = 2^{16} = 65,\!536$, for
which optimizing a single MPS tensor took about 2 minutes.
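The workhorse behind these numbers is dense matrix multiplication, which TPUs execute natively and which can be sharded across cores. Below is a minimal illustrative JAX sketch of a row-sharded matmul of the kind that dominates the $O(ND^3)$ DMRG cost; it assumes nothing about the paper's actual implementation, and all names are hypothetical.

```python
# Minimal, hypothetical sketch (not the paper's code): shard the O(D^3)
# matrix products that dominate a DMRG update across accelerator cores.
import jax
import jax.numpy as jnp

n_dev = jax.device_count()      # e.g. 1024 TPU v3 cores in the paper
D = 4096                        # bond dimension (the paper reaches 2**16)
assert D % n_dev == 0

# A is row-sharded across devices; B is replicated on every device.
A = jnp.ones((n_dev, D // n_dev, D), dtype=jnp.float32)
B = jnp.ones((D, D), dtype=jnp.float32)

@jax.pmap
def sharded_matmul(a_block):
    # Each core multiplies its (D/n_dev, D) slice of A by the full B,
    # producing its slice of rows of A @ B.
    return a_block @ B

C = sharded_matmul(A)           # shape (n_dev, D // n_dev, D)
```

At $D = 2^{16}$ a single $D \times D$ float32 matrix already occupies 16 GiB, so at that scale the tensors themselves, not just the arithmetic, must be spread over the pod.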
Related papers
- Optimized Quantum Simulation Algorithms for Scalar Quantum Field Theories [0.3394351835510634]
We provide practical simulation methods for scalar field theories on a quantum computer that yield improved asymptotics.
We implement our approach using a series of different fault-tolerant simulation algorithms for Hamiltonians.
We find in both cases that the bounds suggest physically meaningful simulations can be performed using on the order of $4\times 10^6$ physical qubits and $10^{12}$ $T$-gates.
arXiv Detail & Related papers (2024-07-18T18:00:01Z)
- Compute Better Spent: Replacing Dense Layers with Structured Matrices [77.61728033234233]
We identify more efficient alternatives to dense matrices, as exemplified by the success of convolutional networks in the image domain.
We show that different structures often require drastically different initialization scales and learning rates, which are crucial to performance.
We propose a novel matrix family containing Monarch matrices, the Block Tensor-Train (BTT), which we show performs better than dense matrices for the same compute on multiple tasks.
arXiv Detail & Related papers (2024-06-10T13:25:43Z)
- A distributed multi-GPU ab initio density matrix renormalization group algorithm with applications to the P-cluster of nitrogenase [1.7444066202370399]
We present the first distributed multi-GPU (Graphics Processing Unit) ab initio density matrix renormalization group (DMRG) algorithm.
We are able to reach an unprecedentedly large bond dimension $D=14000$ on 48 GPUs.
This is nearly three times larger than the bond dimensions reported in previous DMRG calculations for the same system using only CPUs.
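For a sense of scale, here is a back-of-the-envelope memory estimate for a single MPS tensor; the formula is standard bookkeeping, and the physical dimension $d=2$ and 8-byte (complex64) entries are assumptions for illustration, not figures from either paper.

```python
# Rough memory footprint of one MPS tensor of shape (D, d, D).
# d=2 and complex64 (8-byte) entries are assumptions for illustration.
def mps_tensor_gib(D, d=2, bytes_per_entry=8):
    return D * d * D * bytes_per_entry / 2**30

print(f"{mps_tensor_gib(14_000):.1f} GiB")  # ~2.9 GiB (multi-GPU DMRG)
print(f"{mps_tensor_gib(2**16):.1f} GiB")   # 64.0 GiB (TPU DMRG paper)
```

At $D = 2^{16}$ a single tensor already exceeds the memory of any one accelerator core, which is why the state itself has to be distributed.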
arXiv Detail & Related papers (2023-11-06T04:01:26Z)
- Fully $1\times1$ Convolutional Network for Lightweight Image Super-Resolution [79.04007257606862]
Deep models have made significant progress on single image super-resolution (SISR) tasks, in particular large models with large kernels ($3\times3$ or more).
$1\times1$ convolutions bring substantial computational efficiency, but struggle to aggregate local spatial representations.
We propose a simple yet effective fully $1\times1$ convolutional network, named Shift-Conv-based Network (SCNet).
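The underlying trick, in generic form, is to displace channel groups spatially with a parameter-free shift so that a subsequent $1\times1$ convolution sees neighboring pixels. The sketch below is a hedged illustration of this shift-plus-pointwise pattern; SCNet's actual block design may differ in its details.

```python
# Generic shift + 1x1-convolution block (illustrative sketch only;
# SCNet's actual layer design may differ).
import numpy as np

def shift_conv(x, w):
    """x: (H, W, C) feature map; w: (C, C) pointwise (1x1 conv) weights."""
    g = x.shape[-1] // 4
    # Parameter-free spatial shifts: each channel group moves one pixel in
    # a different direction, giving the pointwise mixing spatial context.
    x = np.concatenate([
        np.roll(x[..., 0*g:1*g],  1, axis=0),   # down
        np.roll(x[..., 1*g:2*g], -1, axis=0),   # up
        np.roll(x[..., 2*g:3*g],  1, axis=1),   # right
        np.roll(x[..., 3*g:],    -1, axis=1),   # left
    ], axis=-1)
    # A 1x1 convolution is per-pixel channel mixing: a single matmul.
    return x @ w

y = shift_conv(np.ones((32, 32, 8)), np.ones((8, 8)))
```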
arXiv Detail & Related papers (2023-07-30T06:24:03Z)
- On sampling determinantal and Pfaffian point processes on a quantum computer [49.1574468325115]
DPPs were introduced by Macchi as a model in quantum optics in the 1970s.
Most applications require sampling from a DPP, and given their quantum origin, it is natural to wonder whether sampling a DPP on a quantum computer is easier than on a classical one.
Vanilla sampling consists of two steps, with respective costs of $\mathcal{O}(N^3)$ and $\mathcal{O}(Nr^2)$ operations on a classical computer, where $r$ is the rank of the kernel matrix.
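Those two phases correspond to the standard classical spectral sampler of Hough et al., sketched below from general knowledge (this is the classical baseline, not the paper's quantum algorithm). Note that the naive QR re-orthonormalization used here is costlier than the quoted $\mathcal{O}(Nr^2)$, which requires incremental updates.

```python
# Two-phase spectral DPP sampler (Hough et al.); simplified classical
# baseline written from general knowledge, not this paper's quantum method.
import numpy as np

def sample_dpp(K, rng):
    """K: (N, N) symmetric kernel with eigenvalues in [0, 1]."""
    # Phase 1, O(N^3): eigendecompose the kernel.
    lam, V = np.linalg.eigh(K)
    # Keep each eigenvector independently with probability lambda_i.
    V = V[:, rng.random(len(lam)) < lam]
    # Phase 2: draw one point per kept eigenvector, updating the basis.
    sample = []
    while V.shape[1] > 0:
        p = (V**2).sum(axis=1)
        p /= p.sum()
        i = rng.choice(len(p), p=p)
        sample.append(i)
        # Project the column span orthogonally to e_i, then re-orthonormalize.
        j = np.argmax(np.abs(V[i]))
        V = V - np.outer(V[:, j], V[i] / V[i, j])
        V = np.linalg.qr(np.delete(V, j, axis=1))[0]
    return sample

rng = np.random.default_rng(0)
print(sample_dpp(np.diag(np.full(6, 0.5)), rng))
```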
arXiv Detail & Related papers (2023-05-25T08:43:11Z)
- Average-Case Complexity of Tensor Decomposition for Low-Degree Polynomials [93.59919600451487]
"Statistical-computational gaps" occur in many statistical inference tasks.
We consider a model for random order-3 tensor decomposition where one component is slightly larger in norm than the rest.
We show that low-degree polynomials of the tensor entries can accurately estimate the largest component when $r \ll n^{3/2}$ but fail to do so when $r \gg n^{3/2}$.
arXiv Detail & Related papers (2022-11-10T00:40:37Z)
- Monarch: Expressive Structured Matrices for Efficient and Accurate Training [64.6871423399431]
Large neural networks excel in many domains, but they are expensive to train and fine-tune.
A popular approach to reduce their compute or memory requirements is to replace dense weight matrices with structured ones.
We propose a class of matrices (Monarch) that is hardware-efficient.
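One common shape for such structure is a product of block-diagonal factors interleaved with a fixed permutation, which keeps all arithmetic in small dense matmuls. The sketch below is an illustrative construction in the Monarch spirit, not the paper's exact parametrization.

```python
# Generic "block-diagonal, permute, block-diagonal" multiply in the spirit
# of Monarch-type families; illustrative only, not the paper's definition.
import numpy as np

def structured_matmul(x, B1, B2):
    """Apply a structured N x N linear map to x, with N = m * m.

    B1, B2: (m, m, m) stacks of m small (m, m) blocks. Cost is
    O(N * sqrt(N)) rather than the O(N^2) of a dense matrix.
    """
    m = B1.shape[0]
    x = x.reshape(m, m)
    x = np.einsum('bij,bj->bi', B1, x)   # first block-diagonal factor
    x = x.T                              # fixed permutation (transpose)
    x = np.einsum('bij,bj->bi', B2, x)   # second block-diagonal factor
    return x.reshape(-1)

m = 4
rng = np.random.default_rng(0)
y = structured_matmul(rng.standard_normal(m * m),
                      rng.standard_normal((m, m, m)),
                      rng.standard_normal((m, m, m)))
```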
arXiv Detail & Related papers (2022-04-01T17:37:29Z)
- Simulation of quantum physics with Tensor Processing Units: brute-force computation of ground states and time evolution [0.3232625980782302]
Tensor Processing Units (TPUs) were developed by Google exclusively to support large-scale machine learning tasks.
In this paper we repurpose TPUs for the challenging problem of simulating quantum spin systems.
Using a TPU v3 pod with 2048 cores, we simulate wavefunctions $|\Psi\rangle$ of up to $N=38$ qubits.
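Brute force here means storing all $2^N$ amplitudes and applying each gate as a small matmul on a reshaped state vector, as in this textbook sketch (a generic simulator step, not the paper's distributed implementation):

```python
# Textbook state-vector simulation step (not the paper's distributed code):
# apply a single-qubit gate to qubit k of an n-qubit wavefunction.
import numpy as np

def apply_gate(psi, gate, k, n):
    """psi: (2**n,) amplitudes; gate: (2, 2); k: target qubit index."""
    psi = psi.reshape(2**k, 2, 2**(n - k - 1))
    psi = np.einsum('ab,ibj->iaj', gate, psi)
    return psi.reshape(-1)

n = 10
psi = np.zeros(2**n); psi[0] = 1.0           # |00...0>
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
psi = apply_gate(psi, H, k=0, n=n)           # Hadamard on qubit 0
# At N = 38 the state alone holds 2**38 amplitudes (4 TiB in complex128),
# which is why a full pod's combined memory is needed.
```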
arXiv Detail & Related papers (2021-11-19T22:41:04Z)
- VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose VersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator.
VersaGNN achieves on average a $3712\times$ speedup with $1301.25\times$ energy reduction over CPU, and a $35.4\times$ speedup with $17.66\times$ energy reduction over GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z)
- A scaling hypothesis for projected entangled-pair states [0.0]
We introduce a new paradigm for scaling simulations with projected entangled-pair states (PEPS) for critical strongly-correlated systems.
We use the effective correlation length $\xi$ to induce a collapse of data points, $f(D,\chi)=f(\xi(D,\chi))$, for arbitrary values of $D$ and the environment bond dimension $\chi$.
We test our hypothesis on the critical 3-D dimer model, the 3-D classical Ising model, and the 2-D quantum Heisenberg model.
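The effective correlation length driving the collapse is conventionally extracted from the two leading eigenvalues of the transfer matrix, as in this standard recipe (generic, not specific to this paper):

```python
# Standard correlation-length estimate from the two leading transfer-matrix
# eigenvalues; a generic recipe, not specific to this paper.
import numpy as np

def effective_correlation_length(T):
    """T: square transfer matrix of the contracted environment."""
    lam = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return 1.0 / np.log(lam[0] / lam[1])

T = np.random.default_rng(0).standard_normal((64, 64))
print(effective_correlation_length(T))
```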
arXiv Detail & Related papers (2021-02-05T12:48:01Z)
- Holographic quantum algorithms for simulating correlated spin systems [0.0]
We present a suite of "holographic" quantum algorithms for efficient ground-state preparation and dynamical evolution of correlated spin systems.
The algorithms exploit the equivalence between matrix-product states (MPS) and quantum channels, along with partial measurement and qubit re-use.
As a demonstration of the potential resource savings, we implement a holoVQE simulation of the antiferromagnetic Heisenberg chain on a trapped-ion quantum computer.
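The channel picture has a simple classical counterpart: perfect sampling from a right-canonical MPS carries only the $D$-dimensional bond state from site to site, the direct analogue of the small, reused qubit register. The sketch below is that classical analogue, not the quantum circuit itself.

```python
# Classical analogue of the MPS <-> quantum-channel equivalence: sample an
# MPS site by site, storing only the D-dimensional bond state (the
# counterpart of the reused qubit register). Illustrative sketch only.
import numpy as np

def sample_mps(mps, rng):
    """Perfect sampling from a right-canonical MPS.

    mps: list of arrays of shape (d, D_left, D_right).
    """
    v = np.ones(1)                     # trivial left bond state
    outcomes = []
    for A in mps:
        branches = np.einsum('l,slr->sr', v, A)
        p = (np.abs(branches)**2).sum(axis=1)
        p /= p.sum()
        s = rng.choice(len(p), p=p)    # "measure" the physical index
        outcomes.append(s)
        v = branches[s] / np.linalg.norm(branches[s])
    return outcomes

# Two-site Bell-pair MPS: |00> and |11> travel through separate bond channels.
A1 = np.zeros((2, 1, 2)); A1[0, 0, 0] = A1[1, 0, 1] = 1 / np.sqrt(2)
A2 = np.zeros((2, 2, 1)); A2[0, 0, 0] = A2[1, 1, 0] = 1.0
print(sample_mps([A1, A2], np.random.default_rng(0)))  # [0, 0] or [1, 1]
```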
arXiv Detail & Related papers (2020-05-06T18:00:01Z)