Simulation-Assisted Decorrelation for Resonant Anomaly Detection
- URL: http://arxiv.org/abs/2009.02205v1
- Date: Fri, 4 Sep 2020 14:02:15 GMT
- Title: Simulation-Assisted Decorrelation for Resonant Anomaly Detection
- Authors: Kees Benkendorfer, Luc Le Pottier, and Benjamin Nachman
- Abstract summary: A growing number of weak- and unsupervised machine learning approaches to anomaly detection are being proposed.
A prototypical example is the search for resonant new physics, where a bump hunt can be performed in an invariant mass spectrum.
Methods that rely entirely on data can sculpt artificial bumps when the classifier depends on the invariant mass; we explore two solutions to this challenge by minimally incorporating simulation into the learning.
- Score: 1.5675763601034223
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A growing number of weak- and unsupervised machine learning approaches to
anomaly detection are being proposed to significantly extend the search program
at the Large Hadron Collider and elsewhere. One of the prototypical examples
for these methods is the search for resonant new physics, where a bump hunt can
be performed in an invariant mass spectrum. A significant challenge to methods
that rely entirely on data is that they are susceptible to sculpting artificial
bumps from the dependence of the machine learning classifier on the invariant
mass. We explore two solutions to this challenge by minimally incorporating
simulation into the learning. In particular, we study the robustness of
Simulation Assisted Likelihood-free Anomaly Detection (SALAD) to correlations
between the classifier and the invariant mass. Next, we propose a new approach
that only uses the simulation for decorrelation but the Classification without
Labels (CWoLa) approach for achieving signal sensitivity. Both methods are
compared using a full background fit analysis on simulated data from the LHC
Olympics and are robust to correlations in the data.
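
The CWoLa idea mentioned in the abstract can be illustrated with a minimal sketch: events in a signal-region mass window are labelled 1, events in the neighbouring sidebands are labelled 0, and a classifier is trained on these noisy labels without any truth information. Everything below is an assumption made for illustration and is not taken from the paper: the toy distributions, the window edges, the helper names `cwola_labels` and `fit_logistic`, and the use of a plain logistic regression in place of a neural network. The single feature `x` is generated independently of the mass for background, so this toy does not exhibit the sculpting problem the paper addresses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy events; all distributions are illustrative assumptions.
n_bkg, n_sig = 20000, 1000
mass = np.concatenate([rng.uniform(2.0, 5.0, n_bkg),    # smooth background
                       rng.normal(3.5, 0.1, n_sig)])    # narrow resonance
x = np.concatenate([rng.normal(0.0, 1.0, n_bkg),        # background feature
                    rng.normal(2.5, 0.5, n_sig)])       # signal feature
is_signal = np.arange(n_bkg + n_sig) >= n_bkg           # truth, only for checks


def cwola_labels(mass, sr=(3.3, 3.7), sb=(3.1, 3.9)):
    """Return (mask of events used for training, mask of signal-region events)."""
    in_sr = (mass >= sr[0]) & (mass < sr[1])
    in_sb = (mass >= sb[0]) & (mass < sb[1]) & ~in_sr
    return in_sr | in_sb, in_sr


def fit_logistic(X, y, steps=500, lr=0.5):
    """Full-batch gradient descent on the logistic (cross-entropy) loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b


mask, in_sr = cwola_labels(mass)
labels = in_sr[mask].astype(float)      # 1 = signal region, 0 = sideband
X = x[:, None]                          # the mass itself is NOT a feature
w, b = fit_logistic(X[mask], labels)

# Score every event; a true signal should score higher on average.
scores = 1.0 / (1.0 + np.exp(-(X @ w + b)))
print("mean score, signal:    ", scores[is_signal].mean())
print("mean score, background:", scores[~is_signal].mean())
```

In a real analysis the classifier would use many features, and any residual dependence of the score on the invariant mass would need to be checked before fitting the background, which is the issue the two methods in the abstract are designed to handle.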
Related papers
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z) - Deep Generative Models for Detector Signature Simulation: A Taxonomic Review [0.0]
Signatures from particle physics detectors are low-level objects (such as energy depositions or tracks) encoding the physics of collisions.
The complete simulation of them in a detector is a computational and storage-intensive task.
We conduct a comprehensive and exhaustive taxonomic review of the existing literature on the simulation of detector signatures.
arXiv Detail & Related papers (2023-12-15T08:27:39Z) - Generalized Oversampling for Learning from Imbalanced datasets and
Associated Theory [0.0]
In supervised learning, it is quite frequent to be confronted with real imbalanced datasets.
We propose a data augmentation procedure, the GOLIATH algorithm, based on kernel density estimates.
We evaluate the performance of the GOLIATH algorithm in imbalanced regression situations.
arXiv Detail & Related papers (2023-08-05T23:08:08Z) - Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
arXiv Detail & Related papers (2023-04-08T07:55:36Z) - MAPS: A Noise-Robust Progressive Learning Approach for Source-Free
Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation.
This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z) - Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - Learning Mixtures of Low-Rank Models [89.39877968115833]
We study the problem of learning mixtures of low-rank models.
We develop an algorithm that is guaranteed to recover the unknown matrices with near-optimal sample and computational complexities.
In addition, the proposed algorithm is provably stable against random noise.
arXiv Detail & Related papers (2020-09-23T17:53:48Z) - Categorical anomaly detection in heterogeneous data using minimum
description length clustering [3.871148938060281]
We propose a meta-algorithm for enhancing any MDL-based anomaly detection model to deal with heterogeneous data.
Our experimental results show that using a discrete mixture model provides competitive performance relative to two previous anomaly detection algorithms.
arXiv Detail & Related papers (2020-06-14T14:48:37Z) - SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier
Detection [63.253850875265115]
Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples.
We propose a modular acceleration system, called SUOD, to speed up large-scale unsupervised heterogeneous outlier detection.
arXiv Detail & Related papers (2020-03-11T00:22:50Z) - Correlation-aware Deep Generative Model for Unsupervised Anomaly
Detection [9.578395294627057]
Unsupervised anomaly detection aims to identify anomalous samples from highly complex and unstructured data.
We propose a method of Correlation aware unsupervised Anomaly detection via Deep Gaussian Mixture Model (CADGMM).
Experiments on real-world datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2020-02-18T03:32:06Z) - Simulation Assisted Likelihood-free Anomaly Detection [3.479254848034425]
This paper introduces a hybrid method that makes the best of both approaches to model-independent searches: those that rely heavily on simulation and those based entirely on unlabeled data.
For potential signals that are resonant in one known feature, this new method first learns a parameterized reweighting function to morph a given simulation to match the data in sidebands.
The background estimation from the reweighted simulation allows for non-trivial correlations between features used for classification and the resonant feature.
arXiv Detail & Related papers (2020-01-14T19:00:09Z)
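
The reweighting step described in the last entry above (Simulation Assisted Likelihood-free Anomaly Detection) can be sketched with classifier-based likelihood-ratio reweighting: a classifier is trained to separate data from simulation, and its output p is converted into a per-event weight p / (1 - p) applied to the simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the Gaussian toy samples, the helper `fit_logistic`, and the logistic-regression classifier are all hypothetical, and the parameterization of the weights in the resonant feature is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "simulation" and "sideband data" with a small mismodelling shift
# (both distributions are illustrative assumptions).
x_sim = rng.normal(0.0, 1.0, 20000)
x_data = rng.normal(0.3, 1.0, 20000)


def fit_logistic(X, y, steps=500, lr=0.5):
    """Full-batch gradient descent on the logistic (cross-entropy) loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b


# Train data (label 1) against simulation (label 0).
X = np.concatenate([x_sim, x_data])[:, None]
y = np.concatenate([np.zeros(len(x_sim)), np.ones(len(x_data))])
w, b = fit_logistic(X, y)

# For a logistic model, p / (1 - p) = exp(logit); with equal sample sizes
# this approximates the density ratio data / simulation.
sim_weights = np.exp(x_sim[:, None] @ w + b)

reweighted_mean = np.average(x_sim, weights=sim_weights)
print("simulation mean:           ", x_sim.mean())
print("reweighted simulation mean:", reweighted_mean)
print("data mean:                 ", x_data.mean())
```

The equal sample sizes matter here: with unequal sizes the ratio p / (1 - p) would carry an extra constant factor equal to the ratio of the sample sizes.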
This list is automatically generated from the titles and abstracts of the papers on this site.