Related papers: Data Quality Monitoring for the Hadron Calorimeters Using Transfer Learning for Anomaly Detection

URL: http://arxiv.org/abs/2408.16612v3
Date: Wed, 05 Nov 2025 12:48:35 GMT
Title: Data Quality Monitoring for the Hadron Calorimeters Using Transfer Learning for Anomaly Detection
Authors: Mulugeta Weldezgina Asres, Christian Walter Omlin, Long Wang, Pavel Parygin, David Yu, Jay Dittmann, The CMS-HCAL Collaboration
Abstract summary: Transfer learning (TL) mechanisms promise to mitigate data sparsity and model complexity by utilizing pre-trained models for a new task. We present the potential of TL within the context of high-dimensional ST AD with a hybrid autoencoder architecture, incorporating convolutional, graph, and recurrent neural networks. This research investigates the transferability of models trained on different sections of the Hadron Calorimeter of the Compact Muon Solenoid experiment at CERN.
Score: 0.7767589715518638
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The proliferation of sensors brings an immense volume of spatio-temporal (ST) data in many domains, including monitoring, diagnostics, and prognostics applications. Data curation is a time-consuming process for a large volume of data, making it challenging and expensive to deploy data analytics platforms in new environments. Transfer learning (TL) mechanisms promise to mitigate data sparsity and model complexity by utilizing pre-trained models for a new task. Despite the triumph of TL in fields like computer vision and natural language processing, efforts on complex ST models for anomaly detection (AD) applications are limited. In this study, we present the potential of TL within the context of high-dimensional ST AD with a hybrid autoencoder architecture, incorporating convolutional, graph, and recurrent neural networks. Motivated by the need for improved model accuracy and robustness, particularly in scenarios with limited training data on systems with thousands of sensors, this research investigates the transferability of models trained on different sections of the Hadron Calorimeter of the Compact Muon Solenoid experiment at CERN. The key contributions of the study include exploring TL's potential and limitations within the context of encoder and decoder networks, revealing insights into model initialization and training configurations that enhance performance while substantially reducing trainable parameters and mitigating data contamination effects. Code: https://github.com/muleina/CMS_HCAL_ML_OnlineDQM

Related papers

MEDIC: a network for monitoring data quality in collider experiments [0.0]
Data Quality Monitoring (DQM) is a crucial component of particle physics experiments. In this work, a simulation-driven approach to DQM is proposed, enabling the study and development of data-quality methodologies. We introduce MEDIC, a neural network designed to learn detector behavior and perform DQM tasks.
arXiv Detail & Related papers (2025-11-22T19:53:24Z)
DeepFeatIoT: Unifying Deep Learned, Randomized, and LLM Features for Enhanced IoT Time Series Sensor Data Classification in Smart Industries [2.2120045208641184]
Internet of Things (IoT) sensors are ubiquitous technologies deployed across smart cities, industrial sites, and healthcare systems. We propose a novel deep learning model, DeepFeatIoT, which integrates learned local and global features with non-learned randomized convolutional kernel-based features. Our model's effectiveness is demonstrated through its consistent and generalized performance across multiple real-world IoT sensor datasets.
arXiv Detail & Related papers (2025-08-13T03:47:33Z)
DeepSeq: High-Throughput Single-Cell RNA Sequencing Data Labeling via Web Search-Augmented Agentic Generative AI Foundation Models [0.0]
Generative AI foundation models offer transformative potential for processing structured biological data. We propose the use of agentic foundation models with real-time web search to automate the labeling of experimental data, achieving up to 82.5% accuracy.
arXiv Detail & Related papers (2025-06-14T23:30:22Z)
LSM-2: Learning from Incomplete Wearable Sensor Data [65.58595667477505]
This paper introduces the second generation of Large Sensor Model (LSM-2) with Adaptive and Inherited Masking (AIM). AIM learns robust representations directly from incomplete data without requiring explicit imputation. Our LSM-2 with AIM achieves the best performance across a diverse range of tasks, including classification, regression and generative modeling.
arXiv Detail & Related papers (2025-06-05T17:57:11Z)
Enhanced Anomaly Detection in IoMT Networks using Ensemble AI Models on the CICIoMT2024 Dataset [0.7753092380426906]
The rapid proliferation of Internet of Medical Things (IoMT) devices in healthcare has introduced unique cybersecurity challenges. This research aims to develop an advanced, real-time anomaly detection framework tailored for IoMT network traffic.
arXiv Detail & Related papers (2025-02-17T14:46:58Z)
Neural Network Modeling of Microstructure Complexity Using Digital Libraries [1.03590082373586]
We evaluate the performance of artificial and spiking neural networks in learning and predicting fatigue crack growth and Turing pattern development. Our assessment suggests that the leaky integrate-and-fire neuron model offers superior predictive accuracy with fewer parameters and less memory usage.
arXiv Detail & Related papers (2025-01-30T07:44:21Z)
Data-driven tool wear prediction in milling, based on a process-integrated single-sensor approach [1.6574413179773764]
This study explores data-driven methods, in particular deep learning, for tool wear prediction. It investigates the transferability of predictive models using minimal training data, validated across two processes. The ConvNeXt model has an exceptional performance, achieving 99.1% accuracy in identifying tool wear.
arXiv Detail & Related papers (2024-12-27T23:10:32Z)
Scaling Wearable Foundation Models [54.93979158708164]
We investigate the scaling properties of sensor foundation models across compute, data, and model size. Using a dataset of up to 40 million hours of in-situ heart rate, heart rate variability, electrodermal activity, accelerometer, skin temperature, and altimeter per-minute data from over 165,000 people, we create LSM. Our results establish the scaling laws of LSM for tasks such as imputation, extrapolation, both across time and sensor modalities.
arXiv Detail & Related papers (2024-10-17T15:08:21Z)
Data-Augmented Predictive Deep Neural Network: Enhancing the extrapolation capabilities of non-intrusive surrogate models [0.5735035463793009]
We propose a new deep learning framework, where kernel dynamic mode decomposition (KDMD) is employed to evolve the dynamics of the latent space generated by the encoder part of a convolutional autoencoder (CAE). After adding the KDMD-decoder-extrapolated data into the original data set, we train the CAE along with a feed-forward deep neural network using the augmented data. The trained network can predict future states outside the training time interval at any out-of-training parameter samples.
arXiv Detail & Related papers (2024-10-17T09:26:14Z)
Multi-Scale Convolutional LSTM with Transfer Learning for Anomaly Detection in Cellular Networks [1.1432909951914676]
This study introduces a novel approach Multi-Scale Convolutional LSTM with Transfer Learning (TL) to detect anomalies in cellular networks. The model is initially trained from scratch using a publicly available dataset to learn typical network behavior. We compare the performance of the model trained from scratch with that of the fine-tuned model using TL.
arXiv Detail & Related papers (2024-09-30T17:51:54Z)
Large-Scale Targeted Cause Discovery with Data-Driven Learning [66.86881771339145]
We propose a novel machine learning approach for inferring causal variables of a target variable from observations. By employing a local-inference strategy, our approach scales with linear complexity in the number of variables, efficiently scaling up to thousands of variables. Empirical results demonstrate superior performance in identifying causal relationships within large-scale gene regulatory networks.
arXiv Detail & Related papers (2024-08-29T02:21:11Z)
Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning. Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation. Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
DiffusionEngine: Diffusion Model is Scalable Data Engine for Object Detection [41.436817746749384]
Diffusion Model is a scalable data engine for object detection. DiffusionEngine (DE) provides high-quality detection-oriented training pairs in a single stage.
arXiv Detail & Related papers (2023-09-07T17:55:01Z)
Convolutional Monge Mapping Normalization for learning on sleep data [63.22081662149488]
We propose a new method called Convolutional Monge Mapping Normalization (CMMN). CMMN consists in filtering the signals in order to adapt their power spectrum density (PSD) to a Wasserstein barycenter estimated on training data. Numerical experiments on sleep EEG data show that CMMN leads to significant and consistent performance gains independent from the neural network architecture.
arXiv Detail & Related papers (2023-05-30T08:24:01Z)
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs. We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting. We also examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z)
Improving self-supervised pretraining models for epileptic seizure detection from EEG data [0.23624125155742057]
This paper presents various self-supervision strategies to enhance the performance of a time-series based Diffusion convolution neural network (DCRNN) model. The learned weights in the self-supervision pretraining phase can be transferred to the supervised training phase to boost the model's prediction capability.
arXiv Detail & Related papers (2022-06-28T17:15:49Z)
A data filling methodology for time series based on CNN and (Bi)LSTM neural networks [0.0]
We develop two Deep Learning models aimed at filling data gaps in time series obtained from monitored apartments in Bolzano, Italy. Our approach manages to capture the fluctuating nature of the data and shows good accuracy in reconstructing the target time series.
arXiv Detail & Related papers (2022-04-21T09:40:30Z)
Convolutional generative adversarial imputation networks for spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAN-based techniques have attracted attention as unsupervised machine learning methods. We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
DAE : Discriminatory Auto-Encoder for multivariate time-series anomaly detection in air transportation [68.8204255655161]
We propose a novel anomaly detection model called Discriminatory Auto-Encoder (DAE). It uses the baseline of a regular LSTM-based auto-encoder but with several decoders, each getting data of a specific flight phase. Results show that the DAE achieves better results in both accuracy and speed of detection.
arXiv Detail & Related papers (2021-09-08T14:07:55Z)
Transformer-Based Behavioral Representation Learning Enables Transfer Learning for Mobile Sensing in Small Datasets [4.276883061502341]
We provide a neural architecture framework for mobile sensing data that can learn generalizable feature representations from time series. This architecture combines benefits from CNN and Transformer architectures to enable better prediction performance.
arXiv Detail & Related papers (2021-07-09T22:26:50Z)
Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a neural deep network with human-powered abstraction on the level of data. Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
Large-scale Neural Solvers for Partial Differential Equations [48.7576911714538]
Solving partial differential equations (PDE) is an indispensable part of many branches of science as many processes can be modelled in terms of PDEs. Recent numerical solvers require manual discretization of the underlying equation as well as sophisticated, tailored code for distributed computing. We examine the applicability of continuous, mesh-free neural solvers for partial differential equations, physics-informed neural networks (PINNs). We discuss the accuracy of GatedPINN with respect to analytical solutions -- as well as state-of-the-art numerical solvers, such as spectral solvers.
arXiv Detail & Related papers (2020-09-08T13:26:51Z)
Contextual-Bandit Anomaly Detection for IoT Data in Distributed Hierarchical Edge Computing [65.78881372074983]
IoT devices can hardly afford complex deep neural network (DNN) models, and offloading anomaly detection tasks to the cloud incurs long delays. We propose and build a demo for an adaptive anomaly detection approach for distributed hierarchical edge computing (HEC) systems. We show that our proposed approach significantly reduces detection delay without sacrificing accuracy, as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-04-15T06:13:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.