Interpreting Transformers for Jet Tagging
- URL: http://arxiv.org/abs/2412.03673v2
- Date: Mon, 09 Dec 2024 03:47:39 GMT
- Title: Interpreting Transformers for Jet Tagging
- Authors: Aaron Wang, Abhijith Gandrakota, Jennifer Ngadiuba, Vivekanand Sahu, Priyansh Bhatnagar, Elham E Khoda, Javier Duarte
- Abstract summary: This study focuses on interpreting ParT by analyzing attention heat maps and particle-pair correlations on the $\eta$-$\phi$ plane.
At the same time, we observe that ParT shows varying focus on important particles and subjets depending on the decay mode, indicating that the model learns traditional jet substructure observables.
- Score: 2.512200562089791
- License:
- Abstract: Machine learning (ML) algorithms, particularly attention-based transformer models, have become indispensable for analyzing the vast data generated by particle physics experiments like ATLAS and CMS at the CERN LHC. Particle Transformer (ParT), a state-of-the-art model, leverages particle-level attention to improve jet-tagging tasks, which are critical for identifying particles resulting from proton collisions. This study focuses on interpreting ParT by analyzing attention heat maps and particle-pair correlations on the $\eta$-$\phi$ plane, revealing a binary attention pattern where each particle attends to at most one other particle. At the same time, we observe that ParT shows varying focus on important particles and subjets depending on decay, indicating that the model learns traditional jet substructure observables. These insights enhance our understanding of the model's internal workings and learning process, offering potential avenues for improving the efficiency of transformer architectures in future high-energy physics applications.
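The abstract's analysis rests on two ingredients: particle-pair correlations on the $\eta$-$\phi$ plane and a check for the "binary" attention pattern, where each particle places nearly all of its attention weight on at most one partner. The sketch below illustrates both on toy data; it is not ParT's actual code, and the function names and the 0.9 threshold are illustrative assumptions.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of attention logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def delta_phi(phi1, phi2):
    # Azimuthal difference wrapped into (-pi, pi], as required on the eta-phi cylinder.
    d = phi1 - phi2
    while d > math.pi:
        d -= 2.0 * math.pi
    while d <= -math.pi:
        d += 2.0 * math.pi
    return d

def pair_correlations(particles, attn_row, i):
    """For particle i, pair its attention weight toward every other particle
    with the (delta_eta, delta_R) separation of that pair.
    particles: list of (eta, phi); attn_row: softmaxed attention row for i."""
    eta_i, phi_i = particles[i]
    out = []
    for j, (eta_j, phi_j) in enumerate(particles):
        if j == i:
            continue
        deta = eta_i - eta_j
        dphi = delta_phi(phi_i, phi_j)
        out.append((attn_row[j], deta, math.hypot(deta, dphi)))
    return out

def is_binary_pattern(attn_row, threshold=0.9):
    # "Binary" attention: nearly all weight concentrated on a single partner.
    # The threshold is an illustrative choice, not a value from the paper.
    return max(attn_row) >= threshold
```

A sharply peaked logit row such as `[10, 0, 0]` softmaxes to a near-one-hot distribution and is flagged as binary, while a flat row is not; scatter-plotting attention weight against Δ*R* over many jets is the kind of particle-pair correlation the study describes.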
Related papers
- Mixture-of-Experts Graph Transformers for Interpretable Particle Collision Detection [36.56642608984189]
We propose a novel approach that combines a Graph Transformer model with Mixture-of-Expert layers to achieve high predictive performance.
We evaluate the model on simulated events from the ATLAS experiment, focusing on distinguishing rare Supersymmetric signal events.
This approach underscores the importance of explainability in machine learning methods applied to high energy physics.
arXiv Detail & Related papers (2025-01-06T23:28:19Z) - Particle Multi-Axis Transformer for Jet Tagging [0.774199694856838]
In this article, we propose a new architecture, the Particle Multi-Axis Transformer (ParMAT).
ParMAT contains local and global spatial interactions within a single unit which improves its ability to handle various input lengths.
We trained our model on JETCLASS, a publicly available large dataset that contains 100M jets of 10 different classes of particles.
arXiv Detail & Related papers (2024-06-09T10:34:16Z) - Variational Pseudo Marginal Methods for Jet Reconstruction in Particle Physics [2.223804777595989]
We introduce a Combinatorial Sequential Monte Carlo approach for inferring jet latent structures.
As a second contribution, we leverage the resulting estimator to develop a variational inference algorithm for parameter learning.
We illustrate our method's effectiveness through experiments using data generated with a collider physics generative model.
arXiv Detail & Related papers (2024-06-05T13:18:55Z) - Image and Point-cloud Classification for Jet Analysis in High-Energy Physics: A survey [3.1070277982608605]
This review paper provides a thorough illustration of applications using different machine learning (ML) and deep learning (DL) approaches.
The presented techniques can be applied to future hadron-hadron colliders (HHC), such as the high-luminosity LHC (HL-LHC) and the hadron-hadron Future Circular Collider (FCC-hh).
arXiv Detail & Related papers (2024-03-18T16:33:29Z) - Interpretable Joint Event-Particle Reconstruction for Neutrino Physics at NOvA with Sparse CNNs and Transformers [124.29621071934693]
We present a novel neural network architecture that combines the spatial learning enabled by convolutions with the contextual learning enabled by attention.
TransformerCVN simultaneously classifies each event and reconstructs every individual particle's identity.
This architecture enables us to perform several interpretability studies which provide insights into the network's predictions.
arXiv Detail & Related papers (2023-03-10T20:36:23Z) - Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving [62.053071723903834]
Multi-object state estimation is a fundamental problem for robotic applications.
We consider learning maximum-likelihood parameters using particle methods.
We apply our method to real data collected from autonomous vehicles.
arXiv Detail & Related papers (2022-12-14T01:21:05Z) - Transformer with Implicit Edges for Particle-based Physics Simulation [135.77656965678196]
Transformer with Implicit Edges (TIE) captures the rich semantics of particle interactions in an edge-free manner.
We evaluate our model on diverse domains of varying complexity and materials.
arXiv Detail & Related papers (2022-07-22T03:45:29Z) - NeuroFluid: Fluid Dynamics Grounding with Particle-Driven Neural Radiance Fields [65.07940731309856]
Deep learning has shown great potential for modeling the physical dynamics of complex particle systems such as fluids.
In this paper, we consider a partially observable scenario known as fluid dynamics grounding.
We propose a differentiable two-stage network named NeuroFluid.
It is shown to reasonably estimate the underlying physics of fluids with different initial shapes, viscosities, and densities.
arXiv Detail & Related papers (2022-03-03T15:13:29Z) - Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling [86.9726984929758]
We focus on the integration of incomplete physics models into deep generative models.
We propose a VAE architecture in which a part of the latent space is grounded by physics.
We demonstrate generative performance improvements over a set of synthetic and real-world datasets.
arXiv Detail & Related papers (2021-02-25T20:28:52Z) - SparseBERT: Rethinking the Importance Analysis in Self-attention [107.68072039537311]
Transformer-based models are popular for natural language processing (NLP) tasks due to their powerful capacity.
Attention map visualization of a pre-trained model is one direct method for understanding the self-attention mechanism.
We propose a Differentiable Attention Mask (DAM) algorithm, which can also be applied to guide the design of SparseBERT.
arXiv Detail & Related papers (2021-02-25T14:13:44Z)
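The core idea behind a differentiable attention mask, as in the SparseBERT entry above, is to gate attention logits with learnable values in (0, 1) so the mask can be trained by gradient descent and later thresholded into a sparse pattern. This is a generic minimal sketch of that soft-masking idea, not the paper's actual DAM algorithm; all names are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def masked_attention(scores, mask_logits):
    """Apply a soft, differentiable mask to raw attention scores, then
    renormalize with softmax. Each gate sigmoid(m) lies in (0, 1), so
    gradients flow through the mask during training; adding log(gate)
    to the logit is equivalent to multiplying the unnormalized
    attention weight by the gate. Hard-thresholding the trained gates
    afterwards yields a sparse attention pattern."""
    gates = [sigmoid(m) for m in mask_logits]
    gated = [s + math.log(g + 1e-12) for s, g in zip(scores, gates)]
    mx = max(gated)
    exps = [math.exp(v - mx) for v in gated]
    z = sum(exps)
    return [e / z for e in exps]
```

With uniform scores and one strongly negative mask logit, the masked position receives near-zero weight while the distribution still sums to one, which is the behavior a learned sparsity mask relies on.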
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided (including all listed content) and is not responsible for any consequences of its use.