Related papers: TrackCore-F: Deploying Transformer-Based Subatomic Particle Tracking on FPGAs

URL: http://arxiv.org/abs/2509.26335v1
Date: Tue, 30 Sep 2025 14:44:43 GMT
Title: TrackCore-F: Deploying Transformer-Based Subatomic Particle Tracking on FPGAs
Authors: Arjan Blankestijn, Uraz Odyurt, Amirreza Yousefzadeh
Abstract summary: We aim to develop tools for monolithic or partitioned Transformer synthesis, specifically targeting inference. Our primary use-case involves two machine learning model designs for tracking, derived from the TrackFormers project.
Score: 0.2851702684899107
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The Transformer Machine Learning (ML) architecture has been gaining considerable momentum in recent years. In particular, computational High-Energy Physics tasks such as jet tagging and particle track reconstruction (tracking) have either achieved proper solutions or reached considerable milestones using Transformers. On the other hand, the use of specialised hardware accelerators, especially FPGAs, is an effective method to achieve online or pseudo-online latencies. The development and integration of Transformer-based ML on FPGAs is still ongoing, and support from current tools ranges from very limited to non-existent. Additionally, FPGA resources present a significant constraint. Considering the model size alone, while smaller models can be deployed directly, larger models must be partitioned in a meaningful and, ideally, automated way. We aim to develop methodologies and tools for monolithic or partitioned Transformer synthesis, specifically targeting inference. Our primary use-case involves two machine learning model designs for tracking, derived from the TrackFormers project. We elaborate on our development approach, present preliminary results, and provide comparisons.

Related papers

Plain Transformers are Surprisingly Powerful Link Predictors [57.01966734467712]
Link prediction is a core challenge in graph machine learning, demanding models that capture rich and complex topological dependencies. While Graph Neural Networks (GNNs) are the standard solution, state-of-the-art pipelines often rely on explicit structurals or memory-intensive node embeddings. We present PENCIL, an encoder-only plain Transformer that replaces hand-crafted priors with attention over sampled local subgraphs.
arXiv Detail & Related papers (2026-02-02T02:45:52Z)
SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices [72.0937240883345]
Recent advances in diffusion transformers (DiTs) have set new standards in image generation, yet remain impractical for on-device deployment. We present an efficient DiT framework tailored for mobile and edge devices that achieves transformer-level generation quality under strict resource constraints.
arXiv Detail & Related papers (2026-01-13T07:46:46Z)
Sub-microsecond Transformers for Jet Tagging on FPGAs [36.414144954711865]
We present the first sub-microsecond transformer implementation on an FPGA achieving competitive performance for state-of-the-art high-energy physics benchmarks. Transformers have shown exceptional performance on multiple tasks in modern machine learning applications, including jet tagging at the CERN Large Hadron Collider (LHC). This work advances the next-generation trigger systems for the High Luminosity LHC, enabling the use of transformers for real-time applications in high-energy physics and beyond.
arXiv Detail & Related papers (2025-10-26T23:13:00Z)
Quantized Visual Geometry Grounded Transformer [67.15451442018258]
This paper proposes the first Quantization framework for VGGTs, namely QuantVGGT. We introduce Dual-Smoothed Fine-Grained Quantization, which integrates pre-global Hadamard rotation and post-local channel smoothing. We also design Noise-Filtered Diverse Sampling, which filters outliers via deep-layer statistics.
arXiv Detail & Related papers (2025-09-25T15:17:11Z)
On-device AI: Quantization-aware Training of Transformers in Time-Series [0.0]
The Transformer model is by far the most compelling of these AI models. My research focuses on optimizing the Transformer model for time-series forecasting tasks. The optimized model will be deployed as hardware accelerators on embedded Field Programmable Gate Arrays (FPGAs).
arXiv Detail & Related papers (2024-08-29T12:49:22Z)
Investigating Resource-efficient Neutron/Gamma Classification ML Models Targeting eFPGAs [0.0]
Open-source embedded FPGA (eFPGA) frameworks provide an alternate, more flexible pathway for implementing machine learning models in hardware. We explore the parameter space for eFPGA implementations of fully-connected neural network (fcNN) and boosted decision tree (BDT) models. The results of the study will be used to aid the specification of an eFPGA fabric, which will be integrated as part of a test chip.
arXiv Detail & Related papers (2024-04-19T20:03:30Z)
Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors [5.432613942292548]
Transformer models have achieved remarkable success in various machine learning tasks but suffer from high computational complexity and resource requirements. Specialized AI hardware accelerators, such as the Habana GAUDI architecture, offer a promising solution to tackle these issues. This paper explores the untapped potential of using GAUDI processors to accelerate Transformer-based models, addressing key challenges in the process.
arXiv Detail & Related papers (2023-09-29T04:49:35Z)
Full Stack Optimization of Transformer Inference: a Survey [58.55475772110702]
Transformer models achieve superior accuracy across a wide range of applications. The amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate. There has been an increased focus on making Transformer models more efficient.
arXiv Detail & Related papers (2023-02-27T18:18:13Z)
LL-GNN: Low Latency Graph Neural Networks on FPGAs for High Energy Physics [45.666822327616046]
This work presents a novel reconfigurable architecture for Low Latency Graph Neural Network (LL-GNN) designs for particle detectors. The LL-GNN design advances the next generation of trigger systems by enabling sophisticated algorithms to process experimental data efficiently.
arXiv Detail & Related papers (2022-09-28T12:55:35Z)
Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects. The proposed method is evaluated on multiple public benchmarks including MOT16, MOT17, and MOT20, and it achieves a clear performance improvement in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z)
Efficient pre-training objectives for Transformers [84.64393460397471]
We study several efficient pre-training objectives for Transformer-based models. We prove that eliminating the MASK token and considering the whole output during the loss computation are essential choices to improve performance.
arXiv Detail & Related papers (2021-04-20T00:09:37Z)
FTRANS: Energy-Efficient Acceleration of Transformers using FPGA [11.032972017827248]
We propose an efficient acceleration framework, Ftrans, for transformer-based large-scale language representations. Our framework significantly reduces the model size of NLP models by up to 16 times. Our FPGA design achieves 27.07x and 81x improvement in performance and energy efficiency compared to CPU, and up to 8.80x improvement in energy efficiency compared to GPU.
arXiv Detail & Related papers (2020-07-16T18:58:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.