VegasFlow: accelerating Monte Carlo simulation across multiple hardware
platforms
- URL: http://arxiv.org/abs/2002.12921v2
- Date: Wed, 20 May 2020 10:52:37 GMT
- Title: VegasFlow: accelerating Monte Carlo simulation across multiple hardware
platforms
- Authors: Stefano Carrazza and Juan M. Cruz-Martinez
- Abstract summary: We present VegasFlow, a new software for fast evaluation of high dimensional integrals based on Monte Carlo integration techniques.
This software is inspired by the Vegas algorithm, ubiquitous in the particle physics community as the driver of cross section integration, and is based on Google's powerful TensorFlow library.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present VegasFlow, a new software for fast evaluation of high dimensional
integrals based on Monte Carlo integration techniques, designed for platforms
with hardware accelerators. The growing complexity of calculations and
simulations in many areas of science has been accompanied by advances in the
computational tools that have supported their development. VegasFlow enables
developers to delegate all complicated aspects of hardware or platform
implementation to the library so they can focus on the problem at hand. This
software is inspired by the Vegas algorithm, ubiquitous in the particle physics
community as the driver of cross section integration, and is based on Google's
powerful TensorFlow library. We benchmark the performance of this library on
many different consumer and professional grade GPUs and CPUs.
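The core idea is plain Monte Carlo estimation: sample the integrand uniformly and average, with an error that shrinks as 1/sqrt(N) independently of dimension. A minimal, library-agnostic sketch (not VegasFlow's own API, whose Vegas-style importance-sampling grid refinement goes well beyond this):

```python
import numpy as np

def mc_integrate(f, dim, n_samples=100_000, seed=0):
    """Plain Monte Carlo estimate of the integral of f over the unit
    hypercube [0, 1]^dim. The standard error scales as 1/sqrt(n_samples)
    regardless of dim, which is why MC dominates in high dimensions."""
    rng = np.random.default_rng(seed)
    x = rng.random((n_samples, dim))       # uniform samples in [0, 1]^dim
    values = f(x)                          # vectorized integrand evaluation
    estimate = values.mean()
    error = values.std(ddof=1) / np.sqrt(n_samples)
    return estimate, error

# Integrand: product over dimensions of 2*x_i; its exact integral is 1.
dim = 4
estimate, error = mc_integrate(lambda x: np.prod(2.0 * x, axis=1), dim)
```

The Vegas algorithm improves on this by iteratively adapting a separable sampling grid so more points land where the integrand is large, reducing the variance of the same estimator.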
Related papers
- Introducing GPU-acceleration into the Python-based Simulations of Chemistry Framework [4.368931200886271]
We introduce the first version of GPU4PySCF, a module that provides GPU acceleration of methods in PySCF.
Benchmark calculations show a significant speedup of two orders of magnitude with respect to the multi-threaded CPU Hartree-Fock code of PySCF.
arXiv Detail & Related papers (2024-07-12T21:50:19Z) - Enhancing GPU-acceleration in the Python-based Simulations of Chemistry Framework [6.4347138500286665]
We describe our contribution as industrial stakeholders to the existing open-source GPU4PySCF project.
We have integrated GPU acceleration into other PySCF functionality, including Density Functional Theory (DFT).
GPU4PySCF delivers 30 times speedup over a 32-core CPU node, resulting in approximately 90% cost savings for most DFT tasks.
arXiv Detail & Related papers (2024-04-15T04:35:09Z) - DEAP: Design Space Exploration for DNN Accelerator Parallelism [0.0]
Large Language Models (LLMs) are becoming increasingly complex and powerful to train and serve.
This paper showcases how hardware and software co-design can come together and allow us to create customized hardware systems.
arXiv Detail & Related papers (2023-12-24T02:43:01Z) - FusionAI: Decentralized Training and Deploying LLMs with Massive
Consumer-Level GPUs [57.12856172329322]
We envision a decentralized system that unlocks the potential of vast untapped consumer-level GPUs.
This system faces critical challenges, including limited CPU and GPU memory, low network bandwidth, and peer and device heterogeneity.
arXiv Detail & Related papers (2023-09-03T13:27:56Z) - PARTIME: Scalable and Parallel Processing Over Time with Deep Neural
Networks [68.96484488899901]
We present PARTIME, a library designed to speed up neural networks whenever data is continuously streamed over time.
PARTIME starts processing each data sample at the time in which it becomes available from the stream.
Experiments are performed in order to empirically compare PARTIME with classic non-parallel neural computations in online learning.
arXiv Detail & Related papers (2022-10-17T14:49:14Z) - ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep
Reinforcement Learning [141.58588761593955]
We present a library ElegantRL-podracer for cloud-native deep reinforcement learning.
It efficiently supports millions of cores to carry out massively parallel training at multiple levels.
At a low level, each pod simulates agent-environment interactions in parallel by fully utilizing nearly 7,000 GPU cores in a single GPU.
arXiv Detail & Related papers (2021-12-11T06:31:21Z) - Accelerating GAN training using highly parallel hardware on public cloud [0.3694429692322631]
This work explores different types of cloud services to train a Generative Adversarial Network (GAN) in a parallel environment.
We parallelize the training process on multiple GPUs and Google Tensor Processing Units (TPUs).
Linear speed-up of the training process is obtained, while retaining most of the performance in terms of physics results.
arXiv Detail & Related papers (2021-11-08T16:59:15Z) - BayesSimIG: Scalable Parameter Inference for Adaptive Domain
Randomization with IsaacGym [59.53949960353792]
BayesSimIG is a library that provides an implementation of BayesSim integrated with the recently released NVIDIA IsaacGym.
BayesSimIG provides an integration with TensorBoard to easily visualize slices of high-dimensional posteriors.
arXiv Detail & Related papers (2021-07-09T16:21:31Z) - Providing Meaningful Data Summarizations Using Examplar-based Clustering
in Industry 4.0 [67.80123919697971]
We show, that our GPU implementation provides speedups of up to 72x using single-precision and up to 452x using half-precision compared to conventional CPU algorithms.
We apply our algorithm to real-world data from injection molding manufacturing processes and discuss how found summaries help with steering this specific process to cut costs and reduce the manufacturing of bad parts.
arXiv Detail & Related papers (2021-05-25T15:55:14Z) - Qibo: a framework for quantum simulation with hardware acceleration [0.0]
We present Qibo, a new open-source software for fast evaluation of quantum circuits.
We introduce a new quantum simulation framework that enables developers to delegate all complicated aspects of hardware or platform implementation to the library.
arXiv Detail & Related papers (2020-09-03T18:00:01Z) - Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far they could hardly be used in large-scale problems.
Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections.
Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z)
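One of the algorithmic ideas alluded to above, random projections, can be illustrated with random Fourier features (Rahimi and Recht), which replace the Gaussian kernel with an explicit finite-dimensional map. This is a generic sketch of the technique, not the GPU solver from the paper:

```python
import numpy as np

def random_fourier_features(X, n_features=5000, gamma=1.0, seed=0):
    """Explicit feature map Z such that Z @ Z.T approximates the RBF kernel
    exp(-gamma * ||x - y||^2). Working with Z costs O(n * n_features),
    sidestepping the O(n^2) kernel matrix for large n."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies drawn from the Fourier transform of the Gaussian kernel.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = np.random.default_rng(1).normal(size=(5, 3))
Z = random_fourier_features(X)
K_approx = Z @ Z.T  # entries approximate exp(-||x_i - x_j||^2)
```

Because the map is a dense matrix product followed by an elementwise cosine, it is exactly the kind of workload that GPUs accelerate well, which is the common thread between this entry and the rest of the list.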
This list is automatically generated from the titles and abstracts of the papers in this site.