Related papers: Comparative Analysis of FPGA and GPU Performance for Machine Learning-Based Track Reconstruction at LHCb

Comparative Analysis of FPGA and GPU Performance for Machine Learning-Based Track Reconstruction at LHCb

URL: http://arxiv.org/abs/2502.02304v3
Date: Sun, 16 Feb 2025 20:13:26 GMT
Title: Comparative Analysis of FPGA and GPU Performance for Machine Learning-Based Track Reconstruction at LHCb
Authors: Fotis I. Giasemis, Vladimir Lončar, Bertrand Granado, Vladimir Vava Gligorov,
Abstract summary: Increasing luminosity and granularity at the Large Hadron Collider are driving the need for more efficient data processing solutions.<n>Machine Learning has emerged as a promising tool for charged particle tracks.
Score: 28.573896827794773
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In high-energy physics, the increasing luminosity and detector granularity at the Large Hadron Collider are driving the need for more efficient data processing solutions. Machine Learning has emerged as a promising tool for reconstructing charged particle tracks, due to its potentially linear computational scaling with detector hits. The recent implementation of a graph neural network-based track reconstruction pipeline in the first level trigger of the LHCb experiment on GPUs serves as a platform for comparative studies between computational architectures in the context of high-energy physics. This paper presents a novel comparison of the throughput of ML model inference between FPGAs and GPUs, focusing on the first step of the track reconstruction pipeline$\unicode{x2013}$an implementation of a multilayer perceptron. Using HLS4ML for FPGA deployment, we benchmark its performance against the GPU implementation and demonstrate the potential of FPGAs for high-throughput, low-latency inference without the need for an expertise in FPGA development and while consuming significantly less power.

Related papers

HAPM -- Hardware Aware Pruning Method for CNN hardware accelerators in resource constrained devices [44.99833362998488]
The present work proposes a generic hardware architecture ready to be implemented on FPGA devices. The inference speed of the design is evaluated over different resource constrained FPGA devices. We demonstrate that our hardware-aware pruning algorithm achieves a remarkable improvement of a 45 % in inference time compared to a network pruned using the standard algorithm.
arXiv Detail & Related papers (2024-08-26T07:27:12Z)
Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA [20.629635991749808]
This paper proposes an algorithm and hardware co-design framework that can generate field-programmable gate array (FPGA)-based accelerators for efficient BayesNNs. At the algorithm level, we propose novel multi-exit dropout-based BayesNNs with reduced computational and memory overheads. At the hardware level, this paper introduces a transformation framework that can generate FPGA-based accelerators for the proposed efficient BayesNNs.
arXiv Detail & Related papers (2024-06-20T17:08:42Z)
Comparing Classical and Quantum Ground State Preparation Heuristics [44.99833362998488]
Ground state preparation (GSP) is a crucial component in GSEE algorithms. In this study, we investigated whether in those cases quantum GSP methods could improve the overlap values compared to Hartree-Fock.
arXiv Detail & Related papers (2024-01-10T18:16:36Z)
Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference [11.614722231006695]
Large language models (LLMs) boasting billions of parameters have generated a significant demand for efficient deployment in inference workloads. This paper investigates the feasibility and potential of model-specific spatial acceleration for LLM inference on FPGAs.
arXiv Detail & Related papers (2023-12-23T04:27:06Z)
Fast Neural Network Inference on FPGAs for Triggering on Long-Lived Particles at Colliders [0.0]
We present two machine-learning algorithms for selecting events where neutral long-lived particles decay within the detector volume. The proposed new algorithms are proven efficient for the considered benchmark physics scenario and their accuracy is found to not degrade when accelerated on the FPGA cards.
arXiv Detail & Related papers (2023-07-11T10:17:57Z)
End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs [49.358119307844035]
We develop an end-to-end workflow for the training and implementation of co-designed neural networks (NNs) This makes efficient NN implementations in hardware accessible to nonexperts, in a single open-sourced workflow. We demonstrate the workflow in a particle physics application involving trigger decisions that must operate at the 40 MHz collision rate of the Large Hadron Collider (LHC) We implement an optimized mixed-precision NN for high-momentum particle jets in simulated LHC proton-proton collisions.
arXiv Detail & Related papers (2023-04-13T18:00:01Z)
HARFLOW3D: A Latency-Oriented 3D-CNN Accelerator Toolflow for HAR on FPGA Devices [71.45672882756001]
This study introduces a novel streaming architecture based toolflow for mapping 3D Convolutional Neural Networks onto FPGAs. The HARFLOW3D toolflow takes as input a 3D CNN in ONNX format and a description of the FPGA characteristics. The ability of the toolflow to support a broad range of models and devices is shown through a number of experiments on various 3D CNN and FPGA system pairs.
arXiv Detail & Related papers (2023-03-30T08:25:27Z)
A FPGA-based architecture for real-time cluster finding in the LHCb silicon pixel detector [0.8431877864777444]
This article describes a custom VHDL firmware implementation of a two-dimensional cluster-finder architecture for reconstructing hit positions in the new VELO detector. The pre-processing allows the first level of the software trigger to accept a 11% higher rate of events. It additionally allows the raw pixel data to be dropped at the readout level, thus saving approximately 14% of the DAQ bandwidth.
arXiv Detail & Related papers (2023-02-08T10:08:34Z)
A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs) We present a new ensembling training manner, named EnGCN, to address the existing issues. Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
LL-GNN: Low Latency Graph Neural Networks on FPGAs for High Energy Physics [45.666822327616046]
This work presents a novel reconfigurable architecture for Low Graph Neural Network (LL-GNN) designs for particle detectors. The LL-GNN design advances the next generation of trigger systems by enabling sophisticated algorithms to process experimental data efficiently.
arXiv Detail & Related papers (2022-09-28T12:55:35Z)
GPU-Accelerated Machine Learning in Non-Orthogonal Multiple Access [71.58925117604039]
Non-orthogonal multiple access (NOMA) is an interesting technology that enables massive connectivity as required in future 5G and 6G networks. We propose a neural network architecture that combines the advantages of both linear and non-linear processing.
arXiv Detail & Related papers (2022-06-13T09:38:23Z)
Accelerated Charged Particle Tracking with Graph Neural Networks on FPGAs [0.0]
We develop and study FPGA implementations of algorithms for charged particle tracking based on graph neural networks. We find a considerable speedup over CPU-based execution is possible, potentially enabling such algorithms to be used effectively in future computing.
arXiv Detail & Related papers (2020-11-30T18:17:43Z)
Distance-Weighted Graph Neural Networks on FPGAs for Real-Time Particle Reconstruction in High Energy Physics [11.125632758828266]
We discuss how to design distance-weighted graph networks that can be executed with a latency of less than 1$mumathrms$ on an FPGA. We consider a representative task associated to particle reconstruction and identification in a next-generation calorimeter operating at a particle collider. We convert the compressed models into firmware to be implemented on an FPGA.
arXiv Detail & Related papers (2020-08-08T21:26:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.