Collection and harmonization of system logs and prototypal Analytics
services with the Elastic (ELK) suite at the INFN-CNAF computing centre
- URL: http://arxiv.org/abs/2106.02612v1
- Date: Thu, 13 May 2021 10:21:55 GMT
- Title: Collection and harmonization of system logs and prototypal Analytics
services with the Elastic (ELK) suite at the INFN-CNAF computing centre
- Authors: Tommaso Diotalevi, Antonio Falabella, Barbara Martelli, Diego
Michelotto, Lucia Morganti, Daniele Bonacorsi, Luca Giommi, Simone Rossi
Tisbeni
- Abstract summary: The distributed Grid infrastructure for High Energy Physics experiments at the Large Hadron Collider (LHC) in Geneva comprises a set of computing centres, spread all over the world.
In Italy, the Tier-1 functionalities are served by the INFN-CNAF data center, which also provides computing and storage resources to more than twenty non-LHC experiments.
A working implementation of a system that collects, parses and displays the log information from CNAF data sources is presented.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The distributed Grid infrastructure for High Energy Physics experiments at
the Large Hadron Collider (LHC) in Geneva comprises a set of computing centres,
spread all over the world, as part of the Worldwide LHC Computing Grid (WLCG).
In Italy, the Tier-1 functionalities are served by the INFN-CNAF data center,
which also provides computing and storage resources to more than twenty non-LHC
experiments. For this reason, a large volume of logs is collected each day from
various sources, which are highly heterogeneous and difficult to harmonize. This
contribution presents a working implementation of a system that collects, parses
and displays the log information from CNAF data sources, together with the
investigation of a Machine Learning based predictive maintenance system.
Related papers
- Training Report of TeleChat3-MoE [77.94641922160359]
This technical report mainly presents the underlying training infrastructure that enables reliable and efficient scaling to frontier model sizes.
We detail systematic methodologies for operator-level and end-to-end numerical verification accuracy, ensuring consistency across hardware platforms.
A systematic parallelization framework, leveraging analytical estimation and integer linear programming, is also proposed to optimize multi-dimensional parallelism configurations.
arXiv Detail & Related papers (2025-12-30T11:42:14Z)
- Efficiency Boost in Decentralized Optimization: Reimagining Neighborhood Aggregation with Minimal Overhead [3.485627109660862]
We introduce DYNAWEIGHT, a novel framework for information aggregation in multi-agent networks.
DYNAWEIGHT dynamically allocates weights to neighboring servers based on their relative losses on local datasets.
arXiv Detail & Related papers (2025-09-26T10:34:06Z)
- The AI_INFN Platform: Artificial Intelligence Development in the Cloud [0.0]
The INFN initiative AI_INFN (Artificial Intelligence at INFN) seeks to promote the use of ML methods across various INFN research scenarios.
We will present preliminary benchmarks, functional tests, and case studies, demonstrating both performance and integration outcomes.
arXiv Detail & Related papers (2025-09-26T09:40:51Z)
- Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers [103.4410890572479]
We introduce the Loong Project: an open-source framework for scalable synthetic data generation and verification.
LoongBench is a curated seed dataset containing 8,729 human-vetted examples across 12 domains.
LoongEnv is a modular synthetic data generation environment that supports multiple prompting strategies to produce new question-answer-code triples.
arXiv Detail & Related papers (2025-09-03T06:42:40Z)
- Tackling Data Heterogeneity in Federated Time Series Forecasting [61.021413959988216]
Time series forecasting plays a critical role in various real-world applications, including energy consumption prediction, disease transmission monitoring, and weather forecasting.
Most existing methods rely on a centralized training paradigm, where large amounts of data are collected from distributed devices to a central cloud server.
We propose a novel framework, Fed-TREND, to address data heterogeneity by generating informative synthetic data as auxiliary knowledge carriers.
arXiv Detail & Related papers (2024-11-24T04:56:45Z)
- A study of the impact of generative AI-based data augmentation on software metadata classification [1.1356542363919058]
We train a machine learning-based model using the neural contextual representations of the comments and their corresponding codes to predict the usefulness of code-comments pair.
In the official assessment, our system achieves a 4% increase in F1-score from baseline and the quality of generated data.
arXiv Detail & Related papers (2023-10-14T10:47:10Z)
- Linking the Dynamic PicoProbe Analytical Electron-Optical Beam Line / Microscope to Supercomputers [39.52789559084336]
Dynamic PicoProbe at Argonne National Laboratory is undergoing upgrades that will enable it to produce up to 100s of GB of data per day.
While this data is highly important for both fundamental science and industrial applications, there is currently limited on-site infrastructure to handle these high-volume data streams.
We address this problem by providing a software architecture capable of supporting large-scale data transfers to neighboring supercomputers at the Argonne Leadership Computing Facility.
This infrastructure supports expected workloads and also provides domain scientists the ability to reinterrogate data from past experiments to yield additional scientific value and derive new insights.
arXiv Detail & Related papers (2023-08-25T23:07:58Z)
- Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computational heterogeneous data.
The proposed aggregation algorithms are extensively analyzed from both a theoretical and an experimental perspective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z)
- An IoT Cloud and Big Data Architecture for the Maintenance of Home Appliances [0.0722732388409495]
This work introduces a distributed and scalable platform architecture that can be deployed for efficient big data collection and analytics.
The proposed system was tested with a case study for Predictive Maintenance of Home Appliances.
The experimental results demonstrated that the presented system could be advantageous for tackling real-world IoT scenarios in a cost-effective and local approach.
arXiv Detail & Related papers (2022-10-25T13:25:00Z)
- FedHiSyn: A Hierarchical Synchronous Federated Learning Framework for Resource and Data Heterogeneity [56.82825745165945]
Federated Learning (FL) enables training a global model without sharing the decentralized raw data stored on multiple devices to protect data privacy.
We propose a hierarchical synchronous FL framework, i.e., FedHiSyn, to tackle the problems of straggler effects and outdated models.
We evaluate the proposed framework based on MNIST, EMNIST, CIFAR10 and CIFAR100 datasets and diverse heterogeneous settings of devices.
arXiv Detail & Related papers (2022-06-21T17:23:06Z)
- Weighted Ensembles for Active Learning with Adaptivity [60.84896785303314]
This paper presents an ensemble of GP models with weights adapted to the labeled data collected incrementally.
Building on this novel EGP model, a suite of acquisition functions emerges based on the uncertainty and disagreement rules.
An adaptively weighted ensemble of EGP-based acquisition functions is also introduced to further robustify performance.
arXiv Detail & Related papers (2022-06-10T11:48:49Z)
- Multi-Edge Server-Assisted Dynamic Federated Learning with an Optimized Floating Aggregation Point [51.47520726446029]
Cooperative edge learning (CE-FL) is a distributed machine learning architecture.
We model the processes taken during CE-FL, and conduct analytical training.
We show the effectiveness of our framework with the data collected from a real-world testbed.
arXiv Detail & Related papers (2022-03-26T00:41:57Z)
- Federated Stochastic Gradient Descent Begets Self-Induced Momentum [151.4322255230084]
Federated learning (FL) is an emerging machine learning method that can be applied in mobile edge systems.
We show that running stochastic gradient descent (SGD) in such a setting can be viewed as adding a momentum-like term to the global aggregation process.
arXiv Detail & Related papers (2022-02-17T02:01:37Z)
- Real-Time Anomaly Detection in Data Centers for Log-based Predictive Maintenance using an Evolving Fuzzy-Rule-Based Approach [0.0]
We focus on the Tier-1 data center of the Italian Institute for Nuclear Physics (INFN), which supports the high-energy physics experiments at the Large Hadron Collider (LHC) in Geneva.
We propose a real-time approach to monitor and classify log records based on sliding time windows, and a time-varying evolving fuzzy-rule-based classification model.
arXiv Detail & Related papers (2020-04-25T21:19:44Z)