Benchmarking Quantum Red TEA on CPUs, GPUs, and TPUs
- URL: http://arxiv.org/abs/2409.03818v1
- Date: Thu, 5 Sep 2024 18:00:01 GMT
- Title: Benchmarking Quantum Red TEA on CPUs, GPUs, and TPUs
- Authors: Daniel Jaschke, Marco Ballarin, Nora Reinić, Luka Pavešić, Simone Montangero
- Abstract summary: We compare different linear algebra backends, e.g., numpy versus the torch, jax, or tensorflow library, as well as a mixed-precision-inspired approach and optimizations for the target hardware.
We present a way to obtain speedups of a factor of 34 when tuning parameters on the CPU, and an additional factor of 2.76 on top of the best CPU setup when migrating to GPUs.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We benchmark simulations of many-body quantum systems on heterogeneous hardware platforms using CPUs, GPUs, and TPUs. We compare different linear algebra backends, e.g., numpy versus the torch, jax, or tensorflow library, as well as a mixed-precision-inspired approach and optimizations for the target hardware. Quantum red TEA out of the Quantum TEA library specifically addresses handling tensors with different libraries or hardware, where the tensors are the building block of tensor network algorithms. The benchmark problem is a variational search of a ground state in an interacting model. This is a ubiquitous problem in quantum many-body physics, which we solve using tensor network methods. This approximate state-of-the-art method compresses quantum correlations which is key to overcoming the exponential growth of the Hilbert space as a function of the number of particles. We present a way to obtain speedups of a factor of 34 when tuning parameters on the CPU, and an additional factor of 2.76 on top of the best CPU setup when migrating to GPUs.
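The backend comparison described in the abstract can be illustrated with a minimal sketch. The `contract` helper and its dispatch logic below are illustrative assumptions, not the Quantum TEA API: a tensor contraction is routed to NumPy or, if available, PyTorch, and the dtype argument mimics the mixed-precision-inspired approach (reduced precision where the hardware rewards it, double precision as the reference).

```python
import numpy as np

def contract(a, b, backend="numpy", dtype=np.complex128):
    """Contract the last axis of a with the first axis of b.

    Hypothetical helper for illustration; quantum red TEA hides this
    kind of backend dispatch inside its tensor classes.
    """
    if backend == "numpy":
        return np.tensordot(a.astype(dtype), b.astype(dtype), axes=1)
    if backend == "torch":
        import torch  # optional dependency, used only if requested
        ta = torch.from_numpy(np.ascontiguousarray(a))
        tb = torch.from_numpy(np.ascontiguousarray(b))
        return torch.tensordot(ta, tb, dims=1).numpy()
    raise ValueError(f"unknown backend {backend!r}")

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8))
b = rng.standard_normal((8, 8))
# Mixed-precision-inspired: single precision on accelerators,
# double precision as the CPU reference.
r64 = contract(a, b, dtype=np.complex128)
r32 = contract(a, b, dtype=np.complex64)
print(np.allclose(r64, r32, atol=1e-4))
```

In this spirit, tuning means picking the backend, precision, and hardware combination that minimizes wall time for the dominant contractions of the tensor network algorithm.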
Related papers
- TensorQC: Towards Scalable Distributed Quantum Computing via Tensor Networks [16.609478015737707]
A quantum processing unit (QPU) must contain a large number of high quality qubits to produce accurate results.
Most scientific and industry classical computation workloads happen in parallel on distributed systems.
This paper demonstrates running benchmarks that are otherwise intractable for a standalone QPU and prior circuit cutting techniques.
arXiv Detail & Related papers (2025-02-05T18:42:07Z)
- GPU-accelerated Effective Hamiltonian Calculator [70.12254823574538]
We present numerical techniques inspired by Nonperturbative Analytical Diagonalization (NPAD) and the Magnus expansion for the efficient calculation of effective Hamiltonians.
Our numerical techniques are available as an open-source Python package, $\rm qCH_{eff}$.
arXiv Detail & Related papers (2024-11-15T06:33:40Z)
- 3D-QAE: Fully Quantum Auto-Encoding of 3D Point Clouds [71.39129855825402]
Existing methods for learning 3D representations are deep neural networks trained and tested on classical hardware.
This paper introduces the first quantum auto-encoder for 3D point clouds.
arXiv Detail & Related papers (2023-11-09T18:58:33Z)
- Performance Evaluation and Acceleration of the QTensor Quantum Circuit Simulator on GPUs [6.141912076989479]
We implement NumPy, PyTorch, and CuPy backends and benchmark the codes to find the optimal allocation of tensor simulations to either a CPU or a GPU.
Our method achieves a $176\times$ speedup on a GPU over the NumPy baseline on a CPU for the benchmarked QAOA circuits solving the MaxCut problem.
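The CPU-versus-GPU allocation idea can be sketched as a simple size-threshold dispatcher. The threshold and helper names here are illustrative, not QTensor's actual benchmark-driven heuristic: small contractions stay on the CPU, where kernel-launch and transfer overhead dominate, while large ones go to the GPU if CuPy is available.

```python
import numpy as np

GPU_THRESHOLD = 1 << 16  # illustrative cutoff in output elements

def choose_device(shape_a, shape_b):
    """Pick a device for a matmul-like contraction by output size.

    Toy stand-in for a benchmark-driven allocation rule: below the
    threshold, launch overhead makes the GPU slower than the CPU.
    """
    out_elems = shape_a[0] * shape_b[1]
    return "gpu" if out_elems >= GPU_THRESHOLD else "cpu"

def contract(a, b):
    device = choose_device(a.shape, b.shape)
    if device == "gpu":
        try:
            import cupy as cp  # GPU path, only if CuPy is installed
            return cp.asnumpy(cp.asarray(a) @ cp.asarray(b)), "gpu"
        except ImportError:
            pass  # no GPU stack available, fall back to CPU
    return a @ b, "cpu"

a = np.ones((32, 32))
b = np.ones((32, 32))
res, dev = contract(a, b)
print(dev)  # a 32x32 product is far below the cutoff -> "cpu"
```

A production simulator would calibrate the cutoff per machine by timing representative contractions on both devices, rather than hard-coding it.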
arXiv Detail & Related papers (2022-04-12T19:03:44Z)
- RosneT: A Block Tensor Algebra Library for Out-of-Core Quantum Computing Simulation [0.18472148461613155]
We present RosneT, a library for distributed, out-of-core block tensor algebra.
We use the PyCOMPSs programming model to transform tensor operations into a collection of tasks handled by the COMPSs runtime.
We report results validating our approach showing good scalability in simulations of Quantum circuits of up to 53 qubits.
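The block tensor idea can be illustrated with a minimal in-memory sketch (plain NumPy, no PyCOMPSs): a matrix product computed tile by tile, so that each update touches only block-sized slices — exactly the access pattern an out-of-core runtime can schedule as independent tasks over blocks that live on disk or on other nodes.

```python
import numpy as np

def block_matmul(a, b, bs):
    """Blocked matrix product.

    Each (i, j) output tile only ever reads bs-sized slices of a and b;
    an out-of-core runtime turns each innermost update into a task.
    """
    n, k = a.shape
    k2, m = b.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((n, m), dtype=a.dtype)
    for i in range(0, n, bs):
        for j in range(0, m, bs):
            for p in range(0, k, bs):
                out[i:i+bs, j:j+bs] += a[i:i+bs, p:p+bs] @ b[p:p+bs, j:j+bs]
    return out

rng = np.random.default_rng(1)
a = rng.standard_normal((6, 6))
b = rng.standard_normal((6, 6))
print(np.allclose(block_matmul(a, b, bs=2), a @ b))
```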
arXiv Detail & Related papers (2022-01-17T20:35:40Z)
- TensorLy-Quantum: Quantum Machine Learning with Tensor Methods [67.29221827422164]
We create a Python library for quantum circuit simulation that adopts the PyTorch API.
TensorLy-Quantum can scale to hundreds of qubits on a single GPU and thousands of qubits on multiple GPUs.
arXiv Detail & Related papers (2021-12-19T19:26:17Z)
- Adaptive Elastic Training for Sparse Deep Learning on Heterogeneous Multi-GPU Servers [65.60007071024629]
We show experimentally that Adaptive SGD outperforms four state-of-the-art solutions in time-to-accuracy.
arXiv Detail & Related papers (2021-10-13T20:58:15Z)
- Fast quantum circuit simulation using hardware accelerated general purpose libraries [69.43216268165402]
CuPy is a general-purpose GPU linear algebra library, applied here to quantum circuit simulation.
For supremacy circuits the speedup is around 2x, and for quantum multipliers almost 22x compared to state-of-the-art C++-based simulators.
arXiv Detail & Related papers (2021-06-26T10:41:43Z)
- Hybrid Models for Learning to Branch [81.93868699246214]
We propose a new hybrid architecture for efficient branching on CPU machines.
The proposed architecture combines the expressive power of GNNs with computationally inexpensive multi-layer perceptrons (MLP) for branching.
arXiv Detail & Related papers (2020-06-26T21:03:45Z)
- Kernel Operations on the GPU, with Autodiff, without Memory Overflows [5.669790037378094]
The KeOps library provides a fast and memory-efficient GPU support for tensors whose entries are given by a mathematical formula.
KeOps alleviates the major bottleneck of tensor-centric libraries for kernel and geometric applications: memory consumption.
KeOps combines optimized C++/CUDA schemes with binders for high-level languages: Python (Numpy and PyTorch), Matlab and R.
arXiv Detail & Related papers (2020-03-27T08:54:10Z)
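The memory bottleneck KeOps targets can be seen in a plain NumPy sketch (not the KeOps API itself): a Gaussian-kernel sum over M points normally materializes the full N x M kernel matrix, whereas a chunked loop — the map-reduce scheme KeOps compiles into fused CUDA kernels — keeps the footprint at O(N x chunk).

```python
import numpy as np

def gauss_kernel_sum(x, y, b, chunk=128):
    """Compute sum_j exp(-|x_i - y_j|^2) * b_j without ever building
    the full N x M kernel matrix, processing y in chunks."""
    out = np.zeros((x.shape[0], b.shape[1]))
    for s in range(0, y.shape[0], chunk):
        ye, be = y[s:s+chunk], b[s:s+chunk]
        # Only an N x chunk distance block is resident at a time.
        d2 = ((x[:, None, :] - ye[None, :, :]) ** 2).sum(-1)
        out += np.exp(-d2) @ be
    return out

rng = np.random.default_rng(2)
x = rng.standard_normal((50, 3))
y = rng.standard_normal((300, 3))
b = rng.standard_normal((300, 1))
# Dense reference: allocates the full 50 x 300 kernel matrix at once.
dense = np.exp(-((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)) @ b
print(np.allclose(gauss_kernel_sum(x, y, b), dense))
```

KeOps goes further by never writing the distance block to memory at all: the formula is evaluated symbolically inside the reduction, which is what makes very large point clouds fit on a single GPU.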
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.