Parallel time integration using Batched BLAS (Basic Linear Algebra
Subprograms) routines
- URL: http://arxiv.org/abs/2108.07126v1
- Date: Mon, 16 Aug 2021 14:49:04 GMT
- Authors: Konstantin Herb and Pol Welter
- Abstract summary: We present an approach for integrating the time evolution of quantum systems.
We leverage the computation power of graphics processing units (GPUs) to perform the integration of all time steps in parallel.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present an approach for integrating the time evolution of quantum systems.
We leverage the computation power of graphics processing units (GPUs) to
perform the integration of all time steps in parallel. The performance boost is
especially prominent for small to medium-sized quantum systems. The devised
algorithm can largely be implemented using the recently-specified batched
versions of the BLAS routines, and can therefore be easily ported to a variety
of platforms. Our PARAllelized Matrix Exponentiation for Numerical Time
evolution (PARAMENT) implementation runs on CUDA-enabled graphics processing
units.
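The core idea above, computing all per-step propagators in one batched operation and only then applying them in order, can be sketched on the CPU with NumPy's stacked eigendecomposition standing in for the batched BLAS/GPU kernels. The function name and interface below are illustrative assumptions, not PARAMENT's actual API:

```python
import numpy as np

def evolve(h_steps, dt, psi0):
    """Propagate psi0 through a stack of per-step Hamiltonians.

    h_steps: (K, n, n) Hermitian matrices, one per time step.
    All K propagators U_k = exp(-i H_k dt) are built in a single
    batched eigendecomposition (the part that maps onto batched
    BLAS / GPU kernels); the ordered product is applied afterwards.
    """
    w, v = np.linalg.eigh(h_steps)             # batched over all K steps
    phases = np.exp(-1j * w * dt)              # (K, n) eigenphases
    # U_k = V diag(phases) V^dagger, batched over k
    u = np.einsum('kij,kj,klj->kil', v, phases, v.conj())
    psi = psi0.astype(complex)
    for u_k in u:                              # ordered product (sequential)
        psi = u_k @ psi
    return psi
```

For a time-independent diagonal Hamiltonian this reduces to multiplying each component by its accumulated phase, which makes the sketch easy to check by hand.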
Related papers
- Unlocking Real-Time Fluorescence Lifetime Imaging: Multi-Pixel Parallelism for FPGA-Accelerated Processing [2.369919866595525]
We propose a method to achieve real-time FLI using an FPGA-based hardware accelerator.
We implement a GRU-based sequence-to-sequence (Seq2Seq) model on an FPGA board compatible with time-resolved cameras.
By integrating a GRU-based Seq2Seq model and its compressed version, called Seq2SeqLite, we were able to process multiple pixels in parallel, reducing latency compared to sequential processing.
arXiv Detail & Related papers (2024-10-09T18:24:23Z) - Optimised Hybrid Classical-Quantum Algorithm for Accelerated Solution of Sparse Linear Systems [0.0]
This paper introduces a hybrid classical-quantum algorithm that combines preconditioning techniques with the HHL algorithm to solve sparse linear systems more efficiently.
We show that the proposed approach not only surpasses traditional methods in speed and scalability but also mitigates some of the inherent limitations of quantum algorithms.
arXiv Detail & Related papers (2024-10-03T11:36:14Z) - Fast, Scalable, Warm-Start Semidefinite Programming with Spectral
Bundling and Sketching [53.91395791840179]
We present Unified Spectral Bundling with Sketching (USBS), a provably correct, fast and scalable algorithm for solving massive SDPs.
USBS provides a 500x speed-up over the state-of-the-art scalable SDP solver on an instance with over 2 billion decision variables.
arXiv Detail & Related papers (2023-12-19T02:27:22Z) - Two dimensional quantum lattice models via mode optimized hybrid CPU-GPU density matrix renormalization group method [0.0]
We present a hybrid numerical approach to simulate quantum many-body problems on two-dimensional quantum lattice models.
We demonstrate for the two dimensional spinless fermion model and for the Hubbard model on torus geometry that several orders of magnitude in computational time can be saved.
arXiv Detail & Related papers (2023-11-23T17:07:47Z) - Decreasing the Computing Time of Bayesian Optimization using
Generalizable Memory Pruning [56.334116591082896]
We show a wrapper of memory pruning and bounded optimization capable of being used with any surrogate model and acquisition function.
Running BO on high-dimensional or massive data sets becomes intractable due to this time complexity.
All model implementations are run on the MIT Supercloud state-of-the-art computing hardware.
arXiv Detail & Related papers (2023-09-08T14:05:56Z) - Parallel hybrid quantum-classical machine learning for kernelized
time-series classification [0.0]
We tackle this with a hybrid quantum-classical approach, deducing temporal kernels between pairwise instances using a time-series Hamiltonian kernel (TSHK) algorithm.
Because we treat the kernel weighting step as a differentiable function, our method can be regarded as an end-to-end learnable hybrid quantum-classical time-series technique.
arXiv Detail & Related papers (2023-05-10T04:01:15Z) - GPU-Accelerated Machine Learning in Non-Orthogonal Multiple Access [71.58925117604039]
Non-orthogonal multiple access (NOMA) is an interesting technology that enables massive connectivity as required in future 5G and 6G networks.
We propose a neural network architecture that combines the advantages of both linear and non-linear processing.
arXiv Detail & Related papers (2022-06-13T09:38:23Z) - Efficient GPU implementation of randomized SVD and its applications [17.71779625877989]
Matrix decompositions are ubiquitous in machine learning, with applications in dimensionality reduction, data compression, and deep learning algorithms.
Typical solutions for matrix decompositions have a complexity that grows rapidly with matrix size, significantly increasing their computational cost and time.
We leverage efficient processing operations that can be run in parallel on modern Graphical Processing Units (GPUs) to reduce the computational burden of computing matrix decompositions.
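A minimal CPU sketch of the randomized SVD idea being accelerated: every step is a dense matrix multiply or a small factorization, which is exactly what makes the method GPU-friendly. The function name and oversampling default are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def randomized_svd(a, rank, n_oversample=10, rng=None):
    """Rank-`rank` randomized SVD (Halko-style range-finder sketch)."""
    rng = np.random.default_rng(rng)
    m, n = a.shape
    # Sketch the range of `a` with a Gaussian test matrix.
    omega = rng.standard_normal((n, rank + n_oversample))
    q, _ = np.linalg.qr(a @ omega)             # orthonormal range basis
    # Project onto the small subspace and take an exact SVD there.
    b = q.T @ a
    u_small, s, vt = np.linalg.svd(b, full_matrices=False)
    return (q @ u_small)[:, :rank], s[:rank], vt[:rank]
```

On a GPU, the same code pattern applies with the dense products dispatched to device kernels (e.g. by swapping NumPy for an array library with the same interface).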
arXiv Detail & Related papers (2021-10-05T07:42:41Z) - Providing Meaningful Data Summarizations Using Examplar-based Clustering
in Industry 4.0 [67.80123919697971]
We show, that our GPU implementation provides speedups of up to 72x using single-precision and up to 452x using half-precision compared to conventional CPU algorithms.
We apply our algorithm to real-world data from injection molding manufacturing processes and discuss how found summaries help with steering this specific process to cut costs and reduce the manufacturing of bad parts.
arXiv Detail & Related papers (2021-05-25T15:55:14Z) - Photonic co-processors in HPC: using LightOn OPUs for Randomized
Numerical Linear Algebra [53.13961454500934]
We show that the randomization step for dimensionality reduction may itself become the computational bottleneck on traditional hardware.
We show that randomization can be significantly accelerated, at negligible precision loss, in a wide range of important RandNLA algorithms.
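The randomization step in question is, at its core, a dense random projection, a single large matrix product, which is why it can dominate on CPUs and why photonic or GPU hardware helps. A hedged NumPy sketch (the function name and scaling convention are illustrative):

```python
import numpy as np

def gaussian_sketch(x, d_out, rng=None):
    """Reduce the dimensionality of the rows of x with a dense
    Gaussian random projection. By the Johnson-Lindenstrauss lemma,
    norms and pairwise distances are approximately preserved when
    d_out is large enough.
    """
    rng = np.random.default_rng(rng)
    d_in = x.shape[1]
    s = rng.standard_normal((d_in, d_out)) / np.sqrt(d_out)
    return x @ s                               # the one big matrix product
```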
arXiv Detail & Related papers (2021-04-29T15:48:52Z) - Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used in large scale problems.
Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections.
Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
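One standard way such solvers avoid the full n-by-n kernel system is the Nyström approximation: solve in the span of a small set of centers instead. The sketch below is a deliberately simplified stand-in (assumed details: Gaussian kernel, a direct solve in place of the preconditioned iterative solver large-scale implementations actually use):

```python
import numpy as np

def nystrom_krr(x, y, centers, lam, gamma=1.0):
    """Nystrom-approximate kernel ridge regression with m centers:
    solves an m x m system instead of the full n x n kernel system.
    """
    def k(a, b):
        # Gaussian kernel matrix between row sets a and b.
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    knm = k(x, centers)                        # (n, m) cross-kernel
    kmm = k(centers, centers)                  # (m, m) center kernel
    # Nystrom normal equations: (K_nm^T K_nm + lam K_mm) alpha = K_nm^T y
    alpha = np.linalg.solve(knm.T @ knm + lam * kmm, knm.T @ y)
    return lambda xq: k(xq, centers) @ alpha   # predictor on new points
```

Every operation here is again dense linear algebra, which is what makes the approach amenable to GPU execution at scale.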
arXiv Detail & Related papers (2020-06-18T08:16:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.