Enabling Multi-threading in Heterogeneous Quantum-Classical Programming
Models
- URL: http://arxiv.org/abs/2301.11559v1
- Date: Fri, 27 Jan 2023 06:48:37 GMT
- Title: Enabling Multi-threading in Heterogeneous Quantum-Classical Programming
Models
- Authors: Akihiro Hayashi, Austin Adams, Jeffrey Young, Alexander McCaskey,
Eugene Dumitrescu, Vivek Sarkar, Thomas M. Conte
- Abstract summary: We introduce C++-based parallel constructs to enable parallel execution of a quantum kernel.
Preliminary performance results show that running two Bell kernels with 12 threads per kernel in parallel outperforms running the kernels one after the other.
- Score: 53.937052213390736
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we address some of the key limitations to realizing a generic
heterogeneous parallel programming model for quantum-classical heterogeneous
platforms. We discuss our experience in enabling user-level multi-threading in
QCOR as well as challenges that need to be addressed for programming future
quantum-classical systems. Specifically, we discuss our design and
implementation of introducing C++-based parallel constructs to enable 1)
parallel execution of a quantum kernel with std::thread and 2) asynchronous
execution with std::async. To do so, we provide a detailed overview of the
current implementation of the QCOR programming model and runtime, and discuss
how we 1) add thread-safety to some of its user-facing API routines, and 2)
increase parallelism in QCOR by removing data races that inhibit
multi-threading so as to better utilize available computing resources. We also
present preliminary performance results with the Quantum++ back end on a
single-node Ryzen9 3900X machine that has 12 physical cores (24 hardware
threads) with 128GB of RAM. The results show that running two Bell kernels with
12 threads per kernel in parallel outperforms running the kernels one after the
other each with 24 threads (1.63x improvement). In addition, we observe the
same trend when running two Shor's algorithm kernels in parallel (1.22x faster
than executing the kernels one after the other). It is worth noting that the
trends remain the same even when we only use physical cores instead of threads.
We believe that our design, implementation, and results will open up an
opportunity not only for 1) enabling quicker prototyping of
parallel/asynchrony-aware quantum-classical algorithms on quantum circuit
simulators in the short-term, but also for 2) realizing a generic heterogeneous
parallel programming model for quantum-classical heterogeneous platforms in the
long-term.
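The two constructs named in the abstract, parallel execution with std::thread and asynchronous execution with std::async, can be illustrated with a minimal sketch. It does not use the QCOR API: `run_bell_kernel`, `BellCounts`, `run_two_kernels_with_threads`, and `launch_kernel_async` are hypothetical names, and the "kernel" is a classical stand-in that samples an ideal Bell state (outcomes "00" and "11" with equal probability). The sketch assumes each kernel invocation owns all of its state, which is the property a runtime needs before kernels can safely run on separate threads.

```cpp
#include <cassert>
#include <future>
#include <random>
#include <thread>

// Hypothetical result type: counts of the two outcomes of an ideal Bell state.
struct BellCounts {
    int zeros = 0;  // shots that measured "00"
    int ones = 0;   // shots that measured "11"
};

// Hypothetical stand-in for a quantum kernel (NOT the QCOR API).
// It samples classically and keeps all state local to the call, so
// concurrent invocations share nothing and cannot race with each other.
BellCounts run_bell_kernel(int shots, unsigned seed) {
    assert(shots >= 0);
    std::mt19937 rng(seed);                 // per-call generator, no globals
    std::bernoulli_distribution coin(0.5);  // "00" or "11" with equal odds
    BellCounts counts;
    for (int i = 0; i < shots; ++i) {
        if (coin(rng)) {
            ++counts.ones;
        } else {
            ++counts.zeros;
        }
    }
    return counts;
}

// 1) Parallel execution with std::thread: each thread writes only to its
//    own output slot, so no lock is required.
void run_two_kernels_with_threads(int shots, BellCounts& a, BellCounts& b) {
    std::thread t0([&a, shots] { a = run_bell_kernel(shots, 1u); });
    std::thread t1([&b, shots] { b = run_bell_kernel(shots, 2u); });
    t0.join();  // wait for both kernels before the caller reads a and b
    t1.join();
}

// 2) Asynchronous execution with std::async: the caller receives a future
//    and may do other classical work before calling get() on it.
std::future<BellCounts> launch_kernel_async(int shots, unsigned seed) {
    return std::async(std::launch::async, run_bell_kernel, shots, seed);
}
```

A caller would pass two `BellCounts` objects to `run_two_kernels_with_threads`, or hold the future returned by `launch_kernel_async` and call `get()` once the result is needed. On most toolchains the program must be linked with thread support (for example `-pthread` with GCC or Clang).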
Related papers
- Specx: a C++ task-based runtime system for heterogeneous distributed architectures [0.0]
Specx is a task-based runtime system written in modern C++.
arXiv Detail & Related papers (2023-08-30T11:41:30Z)
- QParallel: Explicit Parallelism for Programming Quantum Computers [62.10004571940546]
We present a language extension for parallel quantum programming.
QParallel removes ambiguities concerning parallelism in current quantum programming languages.
We introduce a tool that guides programmers in the placement of parallel regions by identifying the subroutines that profit most from parallelization.
arXiv Detail & Related papers (2022-10-07T16:35:16Z)
- How Parallel Circuit Execution Can Be Useful for NISQ Computing? [0.0]
Quantum computing is performed on Noisy Intermediate-Scale Quantum (NISQ) hardware in the short term.
Only small circuits can be executed reliably on a quantum machine due to the unavoidable noisy quantum operations on NISQ devices.
A parallel circuit execution technique has been proposed to address this problem by executing multiple programs on hardware simultaneously.
arXiv Detail & Related papers (2021-12-01T10:12:35Z)
- Fast quantum circuit simulation using hardware accelerated general purpose libraries [69.43216268165402]
CuPy is a general-purpose linear algebra library developed specifically for GPUs, applied here to quantum circuit simulation.
For supremacy circuits the speedup is around 2x, and for quantum multipliers almost 22x compared to state-of-the-art C++-based simulators.
arXiv Detail & Related papers (2021-06-26T10:41:43Z)
- Accelerating variational quantum algorithms with multiple quantum processors [78.36566711543476]
Variational quantum algorithms (VQAs) have the potential of utilizing near-term quantum machines to gain certain computational advantages.
Modern VQAs suffer from cumbersome computational overhead, hampered by the tradition of employing a solitary quantum processor to handle large data.
Here we devise an efficient distributed optimization scheme, called QUDIO, to address this issue.
arXiv Detail & Related papers (2021-06-24T08:18:42Z)
- Extending C++ for Heterogeneous Quantum-Classical Computing [56.782064931823015]
qcor is a language extension to C++ and compiler implementation that enables heterogeneous quantum-classical programming, compilation, and execution in a single-source context.
Our work provides a first-of-its-kind C++ compiler enabling high-level quantum kernel (function) expression in a quantum-language manner.
arXiv Detail & Related papers (2020-10-08T12:49:07Z)
- Multi-threaded Memory Efficient Crossover in C++ for Generational Genetic Programming [0.0]
C++ snippets from a multi-core parallel memory-efficient crossover for genetic programming are given.
They may be adapted for separate generation evolutionary algorithms where large chromosomes or small RAM require no more than M + (2 times nthreads) simultaneously active individuals.
arXiv Detail & Related papers (2020-09-22T11:32:20Z)
- Quantum Fan-out: Circuit Optimizations and Technology Modeling [3.4827330067784295]
We introduce a simultaneous fan-out primitive to optimize circuit synthesis for NISQ workloads.
We also introduce novel quantum memory architectures based on fan-out.
We demonstrate experimental proof-of-concept of fan-out with superconducting qubits.
arXiv Detail & Related papers (2020-07-08T16:38:07Z)
- Parallelising the Queries in Bucket Brigade Quantum RAM [69.43216268165402]
Quantum algorithms often use quantum RAMs (QRAM) for accessing information stored in a database-like manner.
We show a systematic method to significantly reduce the effective query time by using Clifford+T gate parallelism.
We conclude that, in theory, fault-tolerant bucket brigade quantum RAM queries can be performed approximately with the speed of classical RAM.
arXiv Detail & Related papers (2020-02-21T14:50:03Z)