Training Deep 3D Convolutional Neural Networks to Extract BSM Physics Parameters Directly from HEP Data: a Proof-of-Concept Study Using Monte Carlo Simulations
- URL: http://arxiv.org/abs/2311.13060v3
- Date: Fri, 15 Nov 2024 16:55:47 GMT
- Title: Training Deep 3D Convolutional Neural Networks to Extract BSM Physics Parameters Directly from HEP Data: a Proof-of-Concept Study Using Monte Carlo Simulations
- Authors: S. Dubey, T. E. Browder, S. Kohani, R. Mandal, A. Sibidanov, R. Sinha
- Abstract summary: We propose a simple but novel data representation that transforms the angular and kinematic distributions into "quasi-images".
As a proof-of-concept, we train a 34-layer Residual Neural Network to regress on these images and determine information about the Wilson Coefficient $C_{9}$ in Monte Carlo simulations of $B^0 \rightarrow K^{*0}\mu^{+}\mu^{-}$ decays.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We report on a novel application of computer vision techniques to extract beyond the Standard Model parameters directly from high energy physics flavor data. We propose a simple but novel data representation that transforms the angular and kinematic distributions into "quasi-images", which are used to train a convolutional neural network to perform regression tasks, similar to fitting. As a proof-of-concept, we train a 34-layer Residual Neural Network to regress on these images and determine information about the Wilson Coefficient $C_{9}$ in Monte Carlo simulations of $B^0 \rightarrow K^{*0}\mu^{+}\mu^{-}$ decays. The method described here can be generalized and may find applicability across a variety of experiments.
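To make the "quasi-image" idea concrete, below is a minimal sketch (an illustration under stated assumptions, not the authors' code): it bins the three decay angles of simulated $B^0 \rightarrow K^{*0}\mu^{+}\mu^{-}$ candidates into a normalized 3D histogram and regresses a single scalar standing in for information about $C_{9}$ with a toy 3D CNN. The $16^3$ binning, network shape, and all identifiers are illustrative assumptions; the paper itself uses a 34-layer Residual Neural Network.

```python
# Minimal sketch (not the authors' code): bin the three decay angles of
# B0 -> K*0 mu+ mu- candidates into a 3D histogram ("quasi-image"), then
# regress a single scalar (e.g. a C9 shift) with a toy 3D CNN.
# Binning, architecture, and names are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn

def make_quasi_image(cos_theta_k, cos_theta_l, phi, bins=16):
    """Bin per-event angles into a normalized 3D histogram ("quasi-image")."""
    hist, _ = np.histogramdd(
        np.stack([cos_theta_k, cos_theta_l, phi], axis=1),
        bins=bins,
        range=[(-1, 1), (-1, 1), (-np.pi, np.pi)],
    )
    hist /= hist.sum() + 1e-12                   # normalize to unit integral
    return torch.from_numpy(hist).float()[None]  # add channel dim: (1, B, B, B)

class Small3DRegressor(nn.Module):
    """Toy 3D CNN regressing one scalar from a quasi-image."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

# Usage with uniform pseudo-data standing in for Monte Carlo events:
rng = np.random.default_rng(0)
img = make_quasi_image(rng.uniform(-1, 1, 10_000),
                       rng.uniform(-1, 1, 10_000),
                       rng.uniform(-np.pi, np.pi, 10_000))
model = Small3DRegressor()
c9_pred = model(img[None])                       # batch dim -> shape (1, 1)
```

In the training loop implied by the abstract, many such quasi-images would be generated at different Wilson-coefficient values, with the generating value serving as the regression label.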
Related papers
- Hidden Activations Are Not Enough: A General Approach to Neural Network Predictions [0.0]
We introduce a novel mathematical framework for analyzing neural networks using tools from quiver representation theory.
By leveraging the induced quiver representation of a data sample, we capture more information than traditional hidden layer outputs.
Results are architecture-agnostic and task-agnostic, making them broadly applicable.
arXiv Detail & Related papers (2024-09-20T02:35:13Z) - Bayesian Inverse Graphics for Few-Shot Concept Learning [3.475273727432576]
We present a Bayesian model of perception that learns using only minimal data.
We show how this representation can be used for downstream tasks such as few-shot classification and estimation.
arXiv Detail & Related papers (2024-09-12T18:30:41Z) - Geometry-Informed Neural Operator for Large-Scale 3D PDEs [76.06115572844882]
We propose the geometry-informed neural operator (GINO) to learn the solution operator of large-scale partial differential equations.
We successfully trained GINO to predict the pressure on car surfaces using only five hundred data points.
arXiv Detail & Related papers (2023-09-01T16:59:21Z) - Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with quadratic loss function, fully connected feedforward architecture, ReLU activations, Gaussian data instances, adversarial labels (a toy version of this setup is sketched after the related-papers list).
These results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
arXiv Detail & Related papers (2022-12-05T14:47:52Z) - Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z) - Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive Evidence Based Lower Bounds for ME-NODE, and develop (efficient) training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z) - Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics [85.31710759801705]
Current practice incurs expensive computational costs, since performance prediction requires model training.
We propose a novel framework for neural network selection by analyzing the governing dynamics over synaptic connections (edges) during training.
Our framework is built on the fact that back-propagation during neural network training is equivalent to the dynamical evolution of synaptic connections.
arXiv Detail & Related papers (2022-01-11T20:53:15Z) - Physics Validation of Novel Convolutional 2D Architectures for Speeding Up High Energy Physics Simulations [0.0]
We apply Generative Adversarial Networks (GANs), a deep learning technique, to replace the calorimeter detector simulations.
We develop new two-dimensional convolutional networks to solve the same 3D image generation problem faster.
Our results demonstrate a high physics accuracy and further consolidate the use of GANs for fast detector simulations.
arXiv Detail & Related papers (2021-05-19T07:24:23Z) - Modeling the Nonsmoothness of Modern Neural Networks [35.93486244163653]
We quantify the nonsmoothness using a feature named the sum of the magnitude of peaks (SMP).
We envision that the nonsmoothness feature can potentially be used as a forensic tool for regression-based applications of neural networks.
arXiv Detail & Related papers (2021-03-26T20:55:19Z) - The use of Convolutional Neural Networks for signal-background classification in Particle Physics experiments [0.4301924025274017]
We present an extensive convolutional neural architecture search, achieving high accuracy for signal/background discrimination for a HEP classification use-case.
We demonstrate, among other things, that we can achieve the same accuracy as complex ResNet architectures with CNNs that have fewer parameters.
arXiv Detail & Related papers (2020-02-13T19:54:46Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
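As referenced in the shallow-networks entry above, here is a toy version (a sketch under stated assumptions, not the paper's code) of the model class studied in "Improved Convergence Guarantees for Shallow Neural Networks": a depth-2 fully connected ReLU network trained by full-batch gradient descent on a quadratic loss over Gaussian inputs. Width, step size, iteration count, and the random stand-in for adversarial labels are arbitrary choices.

```python
# Toy illustration of the paper's model class: depth-2 fully connected ReLU
# network, quadratic loss, Gaussian data, plain gradient descent. Random
# labels stand in for the paper's adversarial labels.
import torch
import torch.nn as nn

torch.manual_seed(0)
n, d, width = 200, 10, 512
X = torch.randn(n, d)                 # Gaussian data instances
y = torch.randn(n)                    # placeholder (arbitrary) labels

model = nn.Sequential(nn.Linear(d, width), nn.ReLU(), nn.Linear(width, 1))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)  # full-batch GD

for step in range(2000):
    loss = ((model(X).squeeze(-1) - y) ** 2).mean()  # quadratic loss
    opt.zero_grad()
    loss.backward()
    opt.step()
# With sufficient width, gradient descent drives the empirical quadratic
# loss toward zero, i.e. a global minimum -- the phenomenon the paper proves.
```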
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences of its use.