Machine-Learning Compression for Particle Physics Discoveries
- URL: http://arxiv.org/abs/2210.11489v1
- Date: Thu, 20 Oct 2022 18:00:04 GMT
- Title: Machine-Learning Compression for Particle Physics Discoveries
- Authors: Jack H. Collins, Yifeng Huang, Simon Knapen, Benjamin Nachman, Daniel Whiteson
- Abstract summary: In collider-based particle and nuclear physics experiments, data are produced at such extreme rates that only a subset can be recorded for later analysis.
We propose a strategy that bridges these paradigms by compressing entire events for generic offline analysis but at a lower fidelity.
- Score: 4.432585853295899
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In collider-based particle and nuclear physics experiments, data are produced
at such extreme rates that only a subset can be recorded for later analysis.
Typically, algorithms select individual collision events for preservation and
store the complete experimental response. A relatively new alternative strategy
is to additionally save a partial record for a larger subset of events,
allowing for later specific analysis of a larger fraction of events. We propose
a strategy that bridges these paradigms by compressing entire events for
generic offline analysis but at a lower fidelity. An optimal-transport-based
$\beta$ Variational Autoencoder (VAE) is used to automate the compression and
the hyperparameter $\beta$ controls the compression fidelity. We introduce a
new approach for multi-objective learning functions by simultaneously learning
a VAE appropriate for all values of $\beta$ through parameterization. We
present an example use case, a di-muon resonance search at the Large Hadron
Collider (LHC), where we show that simulated data compressed by our $\beta$-VAE
has enough fidelity to distinguish distinct signal morphologies.
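Below is a minimal sketch of the parameterized training idea, under illustrative assumptions: $\beta$ is sampled per batch and fed to both encoder and decoder as a conditioning input, so a single network covers the whole fidelity range. The architecture, dimensions, and the MSE reconstruction term (a stand-in for the paper's optimal-transport loss) are assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): a beta-VAE whose encoder and
# decoder are conditioned on beta, so one training run serves all fidelities.
import torch
import torch.nn as nn

class BetaConditionedVAE(nn.Module):
    def __init__(self, x_dim=16, z_dim=4, hidden=64):
        super().__init__()
        # +1 input feature carries log(beta) as the conditioning variable
        self.enc = nn.Sequential(nn.Linear(x_dim + 1, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim + 1, hidden), nn.ReLU(),
                                 nn.Linear(hidden, x_dim))

    def forward(self, x, log_beta):
        b = log_beta.expand(x.shape[0], 1)
        mu, log_var = self.enc(torch.cat([x, b], dim=1)).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterization trick
        return self.dec(torch.cat([z, b], dim=1)), mu, log_var

model = BetaConditionedVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(1000):
    x = torch.randn(128, 16)                       # stand-in for event features
    log_beta = torch.empty(1).uniform_(-6.0, 0.0)  # sample a fidelity per batch
    x_hat, mu, log_var = model(x, log_beta)
    recon = (x_hat - x).pow(2).sum(dim=1)          # MSE stand-in for the OT-based loss
    kl = 0.5 * (mu.pow(2) + log_var.exp() - log_var - 1).sum(dim=1)
    loss = (recon + log_beta.exp() * kl).mean()    # beta-VAE objective
    opt.zero_grad(); loss.backward(); opt.step()
```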
Related papers
- Theoretical Convergence Guarantees for Variational Autoencoders [2.8167997311962942]
Variational Autoencoders (VAE) are popular generative models used to sample from complex data distributions.
This paper aims to bridge that gap by providing non-asymptotic convergence guarantees for VAE trained using both Gradient Descent and Adam algorithms.
Our theoretical analysis applies to both Linear VAE and Deep Gaussian VAE, as well as several VAE variants, including $\beta$-VAE and IWAE.
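For reference, the objective whose optimization these guarantees concern is the evidence lower bound; in the $\beta$-VAE variant a weight $\beta$ scales the KL term (a standard formulation, stated here for context rather than quoted from the paper):

```latex
\mathcal{L}_{\beta}(\theta,\phi;x)
  = \mathbb{E}_{q_{\phi}(z \mid x)}\!\left[\log p_{\theta}(x \mid z)\right]
  - \beta\, D_{\mathrm{KL}}\!\left(q_{\phi}(z \mid x)\,\|\,p(z)\right)
```

Setting $\beta = 1$ recovers the standard ELBO; gradient descent or Adam is applied to the negative of this objective averaged over the training data.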
arXiv Detail & Related papers (2024-10-22T07:12:38Z)
- FABind: Fast and Accurate Protein-Ligand Binding [127.7790493202716]
$\mathbf{FABind}$ is an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding.
Our proposed model demonstrates strong advantages in terms of effectiveness and efficiency compared to existing methods.
arXiv Detail & Related papers (2023-10-10T16:39:47Z)
- On Kinetic Optimal Probability Paths for Generative Models [42.12806492124782]
Recent successful generative models are trained by fitting a neural network to an a-priori defined tractable probability density path taking noise to training examples.
In this paper we investigate the space of Gaussian probability paths, which includes diffusion paths as an instance, and look for an optimal member in some useful sense.
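In the standard formulation (not specific to this paper), a Gaussian probability path conditioned on a training example $x_1$ is defined by a scheduler pair $(\alpha_t, \sigma_t)$ interpolating from pure noise to the data:

```latex
p_t(x \mid x_1) = \mathcal{N}\!\bigl(x;\; \alpha_t x_1,\; \sigma_t^2 I\bigr),
\qquad (\alpha_0, \sigma_0) = (0, 1), \quad (\alpha_1, \sigma_1) = (1, 0)
```

Diffusion paths correspond to particular scheduler choices, and the paper's question is which member of this family is optimal in a kinetic sense.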
arXiv Detail & Related papers (2023-06-11T08:54:12Z)
- One-Dimensional Deep Image Prior for Curve Fitting of S-Parameters from Electromagnetic Solvers [57.441926088870325]
Deep Image Prior (DIP) is a technique that optimizes the weights of a randomly-initialized convolutional neural network to fit a signal from noisy or under-determined measurements.
Relative to publicly available implementations of Vector Fitting (VF), our method shows superior performance on nearly all test examples.
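A minimal sketch of the deep-image-prior idea for one-dimensional curve fitting, under illustrative assumptions (the architecture, signal, and iteration count are placeholders, not the paper's setup): the fixed random input and early stopping provide the regularization, and the network weights are the only optimization variables.

```python
# Illustrative 1-D deep-image-prior sketch (not the paper's code): fit noisy
# samples by optimizing a randomly initialized network on a fixed random input.
import torch
import torch.nn as nn

n = 256
t = torch.linspace(0, 1, n)
clean = torch.sin(12 * t) * torch.exp(-2 * t)  # stand-in for an S-parameter curve
noisy = clean + 0.1 * torch.randn(n)

z = torch.randn(1, 8, n)                       # fixed random input code
net = nn.Sequential(                           # small randomly initialized conv net
    nn.Conv1d(8, 32, 5, padding=2), nn.ReLU(),
    nn.Conv1d(32, 32, 5, padding=2), nn.ReLU(),
    nn.Conv1d(32, 1, 5, padding=2),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):                       # stop early so noise is not fitted
    fit = net(z).squeeze()
    loss = (fit - noisy).pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```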
arXiv Detail & Related papers (2023-06-06T20:28:37Z)
- Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
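The two-stage pattern described here can be sketched as follows; the toy simulator, network shapes, and parameter ranges are hypothetical stand-ins for the model-Hamiltonian simulation, not the authors' code.

```python
# Illustrative sketch: train a differentiable surrogate of a simulator once,
# then recover unknown parameters from data by autodiff through the surrogate.
import torch
import torch.nn as nn

def simulator(theta):                 # toy stand-in for a Hamiltonian simulation
    q = torch.linspace(0.1, 3.0, 64)
    return theta[0] / ((q - theta[1]) ** 2 + 0.1)

# Stage 1: fit the surrogate to simulated (parameters -> spectrum) pairs.
surrogate = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),
                          nn.Linear(64, 64), nn.Tanh(),
                          nn.Linear(64, 64))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
for step in range(5000):
    theta = 0.5 + 2.0 * torch.rand(2)          # random parameters in [0.5, 2.5]
    loss = (surrogate(theta) - simulator(theta)).pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: freeze the surrogate and fit parameters to "experimental" data.
for p in surrogate.parameters():
    p.requires_grad_(False)
data = simulator(torch.tensor([1.3, 1.1])) + 0.01 * torch.randn(64)
theta_hat = torch.tensor([1.0, 1.0], requires_grad=True)
opt2 = torch.optim.Adam([theta_hat], lr=1e-2)
for step in range(2000):
    loss = (surrogate(theta_hat) - data).pow(2).mean()
    opt2.zero_grad(); loss.backward(); opt2.step()
```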
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
- Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve [29.86440019821837]
Variational autoencoders (VAEs) are powerful tools for learning latent representations of data used in a wide range of applications.
In this paper, we introduce Multi-Rate VAE, a computationally efficient framework for learning optimal parameters corresponding to various $\beta$ in a single training run.
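This goal mirrors the train-once-over-all-$\beta$ strategy of the main paper; as a hypothetical usage example, a $\beta$-conditioned model such as the BetaConditionedVAE sketch above can be swept over $\beta$ after a single training run to trace a rate-distortion curve.

```python
# Hypothetical usage, reusing the BetaConditionedVAE sketch defined earlier:
# evaluate one trained model across beta to trace its rate-distortion curve.
import torch

with torch.no_grad():
    for log_beta in torch.linspace(-6.0, 0.0, 13):
        x = torch.randn(512, 16)                       # stand-in evaluation batch
        x_hat, mu, log_var = model(x, log_beta.reshape(1))
        distortion = (x_hat - x).pow(2).sum(dim=1).mean()
        rate = 0.5 * (mu.pow(2) + log_var.exp() - log_var - 1).sum(dim=1).mean()
        print(f"beta={log_beta.exp().item():.4f}  "
              f"rate={rate.item():.3f}  distortion={distortion.item():.3f}")
```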
arXiv Detail & Related papers (2022-12-07T19:02:34Z)
- Easy Differentially Private Linear Regression [16.325734286930764]
We study an algorithm which uses the exponential mechanism to select a model with high Tukey depth from a collection of non-private regression models.
We find that this algorithm obtains strong empirical performance in the data-rich setting.
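A minimal sketch of the exponential-mechanism selection step, under stated assumptions: the utility scores below stand in for Tukey depths of candidate regression models, and the sensitivity is taken to be 1.

```python
# Illustrative exponential mechanism (a sketch, not the paper's algorithm):
# sample a candidate with probability proportional to exp(eps * u / (2 * du)).
import numpy as np

def exponential_mechanism(utilities, eps, sensitivity=1.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    u = np.asarray(utilities, dtype=float)
    logits = eps * u / (2.0 * sensitivity)
    logits -= logits.max()                      # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return rng.choice(len(u), p=p)

depths = [3.0, 7.0, 5.0, 6.0]  # hypothetical Tukey depths of candidate models
chosen = exponential_mechanism(depths, eps=1.0)
```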
arXiv Detail & Related papers (2022-08-15T17:42:27Z)
- Towards Sample-Optimal Compressive Phase Retrieval with Sparse and Generative Priors [59.33977545294148]
We show that $O(k \log L)$ samples suffice to guarantee that the signal is close to any vector that minimizes an amplitude-based empirical loss function.
We adapt this result to sparse phase retrieval, and show that $O(s \log n)$ samples are sufficient for a similar guarantee when the underlying signal is $s$-sparse and $n$-dimensional.
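The amplitude-based empirical loss referenced here typically takes the following standard form (stated for context), where $a_i$ are measurement vectors, $y_i$ are observed magnitudes, and the minimization runs over the prior class $\mathcal{X}$ (the range of a generative model, or the set of $s$-sparse vectors):

```latex
\min_{x \in \mathcal{X}} \; \frac{1}{m} \sum_{i=1}^{m}
\bigl( \lvert \langle a_i, x \rangle \rvert - y_i \bigr)^{2}
```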
arXiv Detail & Related papers (2021-06-29T12:49:54Z)
- A Provably Efficient Sample Collection Strategy for Reinforcement Learning [123.69175280309226]
One of the challenges in online reinforcement learning (RL) is that the agent needs to trade off the exploration of the environment and the exploitation of the samples to optimize its behavior.
We propose to tackle the exploration-exploitation problem with a decoupled approach composed of: 1) an "objective-specific" algorithm that prescribes how many samples to collect at which states, as if it had access to a generative model (i.e., a sparse simulator of the environment); 2) an "objective-agnostic" sample-collection procedure responsible for generating the prescribed samples as fast as possible.
arXiv Detail & Related papers (2020-07-13T15:17:35Z)
- On Compression Principle and Bayesian Optimization for Neural Networks [0.0]
We propose a compression principle which states that an optimal predictive model is the one that minimizes the total compressed message length of all data and the model definition while guaranteeing decodability.
We show that dropout can be used for continuous dimensionality reduction, allowing one to find the optimal network dimensions required by the compression principle.
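Read as two-part minimum-description-length coding, a standard interpretation of such a principle (stated here for context, not quoted from the paper), the selected model minimizes

```latex
L_{\mathrm{total}} = L(\mathrm{model}) + L(\mathrm{data} \mid \mathrm{model})
```

subject to the constraint that a receiver can decode the data from the two parts.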
arXiv Detail & Related papers (2020-06-23T03:23:47Z)
- Optimizing Vessel Trajectory Compression [71.42030830910227]
In previous work we introduced a trajectory detection module that can provide summarized representations of vessel trajectories by consuming AIS positional messages online.
This methodology can provide reliable trajectory synopses with only small deviations from the original course while discarding at least 70% of the raw data as redundant.
However, such trajectory compression is very sensitive to parametrization.
We take into account the type of each vessel in order to provide a suitable configuration that can yield improved trajectory synopses.
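The sensitivity to parametrization is visible even in the simplest batch trajectory simplifier, where a single tolerance parameter directly controls the compression ratio; the generic Douglas-Peucker sketch below is a stand-in for illustration, not the paper's online synopsis method.

```python
# Generic Douglas-Peucker simplification (a stand-in, not the paper's method):
# the tolerance parameter trades compression ratio against course deviation.
import numpy as np

def douglas_peucker(points, tol):
    pts = np.asarray(points, dtype=float)
    if len(pts) <= 2:
        return pts
    start, end = pts[0], pts[-1]
    seg = end - start
    rel = pts - start
    # perpendicular distance of each point to the start-end chord
    d = np.abs(seg[0] * rel[:, 1] - seg[1] * rel[:, 0]) / (np.linalg.norm(seg) + 1e-12)
    i = int(np.argmax(d))
    if d[i] <= tol:                       # chord is close enough: keep endpoints only
        return np.vstack([start, end])
    left = douglas_peucker(pts[: i + 1], tol)
    right = douglas_peucker(pts[i:], tol)
    return np.vstack([left[:-1], right])  # drop the duplicated split point

track = np.cumsum(np.random.default_rng(0).normal(size=(500, 2)), axis=0)
for tol in (0.5, 2.0, 8.0):               # looser tolerance discards more points
    kept = len(douglas_peucker(track, tol))
    print(f"tol={tol}: kept {kept}/{len(track)} points")
```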
arXiv Detail & Related papers (2020-05-11T20:38:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.