Inference-optimized AI and high performance computing for gravitational
wave detection at scale
- URL: http://arxiv.org/abs/2201.11133v1
- Date: Wed, 26 Jan 2022 19:00:01 GMT
- Title: Inference-optimized AI and high performance computing for gravitational
wave detection at scale
- Authors: Pranshu Chaturvedi, Asad Khan, Minyang Tian, E. A. Huerta and Huihuo
Zheng
- Abstract summary: We introduce an ensemble of artificial intelligence models for gravitational wave detection that we trained in the Summit supercomputer using 32 nodes.
We deploy our inference-optimized AI ensemble in the ThetaGPU supercomputer at Argonne Leadership Computing Facility.
Our NVIDIA TensorRT-optimized AI ensemble processed an entire month of advanced LIGO data (including Hanford and Livingston data streams) within 50 seconds.
- Score: 3.6118662460334527
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce an ensemble of artificial intelligence models for gravitational
wave detection that we trained in the Summit supercomputer using 32 nodes,
equivalent to 192 NVIDIA V100 GPUs, within 2 hours. Once fully trained, we
optimized these models for accelerated inference using NVIDIA TensorRT. We
deployed our inference-optimized AI ensemble in the ThetaGPU supercomputer at
Argonne Leadership Computing Facility to conduct distributed inference. Using
the entire ThetaGPU supercomputer, consisting of 20 nodes each of which has 8
NVIDIA A100 Tensor Core GPUs and 2 AMD Rome CPUs, our NVIDIA TensorRT-optimized
AI ensemble processed an entire month of advanced LIGO data (including Hanford
and Livingston data streams) within 50 seconds. Our inference-optimized AI
ensemble retains the same sensitivity as traditional AI models, namely, it
identifies all known binary black hole mergers previously identified in this
advanced LIGO dataset and reports no misclassifications, while also providing a
3X inference speedup compared to traditional artificial intelligence models. We
used time slides to quantify the performance of our AI ensemble to process up
to 5 years' worth of advanced LIGO data. In this synthetically enhanced dataset,
our AI ensemble reports an average of one misclassification for every month of
searched advanced LIGO data. We also present the receiver operating
characteristic curve of our AI ensemble using this 5-year-long advanced LIGO
dataset. This approach provides the required tools to conduct accelerated,
AI-driven gravitational wave detection at scale.
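The time-slide idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe how the authors implemented their time slides, so everything below is an assumption for illustration: the function name `time_slide_counts`, the boolean per-sample detection flags, and the use of circular shifts are all hypothetical. The general principle is that shifting one detector's output relative to the other by more than the light travel time between the sites (about 10 ms for Hanford and Livingston) makes any remaining coincidence accidental, and each distinct shift adds one more observation period of background.

```python
import numpy as np


def time_slide_counts(h_flags, l_flags, min_shift, n_slides):
    """Count accidental coincidences for a series of time slides.

    h_flags, l_flags: boolean arrays of equal length, True where a
        (hypothetical) detector-level classifier flagged a signal.
    min_shift: shift step in samples; assumed to exceed the
        inter-detector light travel time so coincidences are accidental.
    n_slides: number of distinct shifts to apply.
    """
    counts = []
    for k in range(1, n_slides + 1):
        # Circularly shift the Livingston flags by k * min_shift samples.
        shifted = np.roll(l_flags, k * min_shift)
        # A coincidence is a sample flagged in both streams after shifting.
        counts.append(int(np.sum(h_flags & shifted)))
    return counts


# Toy example: 10 samples, two flags per detector.
h = np.zeros(10, dtype=bool)
h[[2, 7]] = True
l = np.zeros(10, dtype=bool)
l[[2, 4]] = True
print(time_slide_counts(h, l, min_shift=3, n_slides=2))
```

Dividing the total count by the number of slides times the duration of the original data gives a background rate, which is the kind of quantity behind a statement such as "one misclassification for every month of searched data".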
Related papers
- Generative Diffusion-based Contract Design for Efficient AI Twins Migration in Vehicular Embodied AI Networks [55.15079732226397]
Embodied AI is a rapidly advancing field that bridges the gap between cyberspace and physical space.
In VEANET, embodied AI twins act as in-vehicle AI assistants to perform diverse tasks supporting autonomous driving.
arXiv Detail & Related papers (2024-10-02T02:20:42Z)
- AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z)
- AI ensemble for signal detection of higher order gravitational wave modes of quasi-circular, spinning, non-precessing binary black hole mergers [0.36832029288386137]
We train AI models with 2.4 million waveforms that describe quasi-circular, spinning, non-precessing binary black hole mergers.
We then use transfer learning to create predictors that estimate the total mass of potential binary black holes.
This is the first AI ensemble designed to search for and find higher order gravitational wave mode signals.
arXiv Detail & Related papers (2023-09-29T18:00:03Z)
- Physics-inspired spatiotemporal-graph AI ensemble for the detection of higher order wave mode signals of spinning binary black hole mergers [0.2999888908665658]
We present a new class of AI models for the detection of quasi-circular, spinning, non-precessing binary black hole mergers.
These AI models combine hybrid dilated neural networks to accurately model both short- and long-range temporal information of gravitational waves.
arXiv Detail & Related papers (2023-06-27T18:09:15Z)
- Tricking AI chips into Simulating the Human Brain: A Detailed Performance Analysis [0.5354801701968198]
We evaluate multiple cutting-edge AI chips (Graphcore IPU, GroqChip, Nvidia GPU with Tensor Cores and Google TPU) for brain simulation.
Our performance analysis reveals that the simulation problem maps extremely well onto the GPU and TPU architectures.
The GroqChip outperforms both platforms for small networks but, due to implementing some floating-point operations at reduced accuracy, is found not yet usable for brain simulation.
arXiv Detail & Related papers (2023-01-31T13:51:37Z)
- ProcTHOR: Large-Scale Embodied AI Using Procedural Generation [55.485985317538194]
ProcTHOR is a framework for procedural generation of Embodied AI environments.
We demonstrate state-of-the-art results across 6 embodied AI benchmarks for navigation, rearrangement, and arm manipulation.
arXiv Detail & Related papers (2022-06-14T17:09:35Z)
- FPGA-based AI Smart NICs for Scalable Distributed AI Training Systems [62.20308752994373]
We propose a new smart network interface card (NIC) for distributed AI training systems using field-programmable gate arrays (FPGAs).
Our proposed FPGA-based AI smart NIC enhances overall training performance by 1.6x at 6 nodes, with an estimated 2.5x performance improvement at 32 nodes, compared to the baseline system using conventional NICs.
arXiv Detail & Related papers (2022-04-22T21:57:00Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning [141.58588761593955]
We present ElegantRL-podracer, a library for cloud-native deep reinforcement learning.
It efficiently supports millions of cores to carry out massively parallel training at multiple levels.
At a low level, each pod simulates agent-environment interactions in parallel by fully utilizing nearly 7,000 GPU cores in a single GPU.
arXiv Detail & Related papers (2021-12-11T06:31:21Z)
- Confluence of Artificial Intelligence and High Performance Computing for Accelerated, Scalable and Reproducible Gravitational Wave Detection [4.081122815035999]
We demonstrate how connecting DOE and NSF-sponsored cyberinfrastructure allows for new ways to publish machine learning models.
We then use this workflow to search for binary black hole gravitational wave signals in open source advanced LIGO data.
We find that using this workflow, an ensemble of four openly available deep learning models can be run on HAL and process the entire month of August 2017 of advanced LIGO data in just seven minutes.
arXiv Detail & Related papers (2020-12-15T19:00:29Z)
- CLAN: Continuous Learning using Asynchronous Neuroevolution on Commodity Edge Devices [3.812706195714961]
We build a prototype distributed system of Raspberry Pis communicating via WiFi running NeuroEvolutionary (NE) learning and inference.
We evaluate the performance of such a collaborative system and detail the compute/communication characteristics of different arrangements of the system.
arXiv Detail & Related papers (2020-08-27T01:49:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.