Related papers: KPIs-Based Clustering and Visualization of HPC jobs: a Feature Reduction Approach

URL: http://arxiv.org/abs/2312.06534v1
Date: Mon, 11 Dec 2023 17:13:54 GMT
Title: KPIs-Based Clustering and Visualization of HPC jobs: a Feature Reduction Approach
Authors: Mohamed Soliman Halawa and Rebeca P. Díaz-Redondo and Ana Fernández-Vilas
Abstract summary: HPC systems need to be constantly monitored to ensure their stability. The monitoring systems collect a tremendous amount of data about different parameters or Key Performance Indicators (KPIs), such as resource usage, IO waiting time, etc. A proper analysis of this data, usually stored as time series, can provide insight in choosing the right management strategies as well as the early detection of issues.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: High-Performance Computing (HPC) systems need to be constantly monitored to ensure their stability. The monitoring systems collect a tremendous amount of data about different parameters or Key Performance Indicators (KPIs), such as resource usage, IO waiting time, etc. A proper analysis of this data, usually stored as time series, can provide insight in choosing the right management strategies as well as the early detection of issues. In this paper, we introduce a methodology to cluster HPC jobs according to their KPI indicators. Our approach reduces the inherent high dimensionality of the collected data by applying two techniques to the time series: literature-based and variance-based feature extraction. We also define a procedure to visualize the obtained clusters by combining the two previous approaches and the Principal Component Analysis (PCA). Finally, we have validated our contributions on a real data set to conclude that those KPIs related to CPU usage provide the best cohesion and separation for clustering analysis and the good results of our visualization methodology.

Related papers

Sliding Window Informative Canonical Correlation Analysis [0.0]
Canonical correlation analysis (CCA) is a technique for finding correlated sets of features between two datasets. We propose a novel extension of CCA to the online, streaming data setting: Sliding Window Informative Canonical Correlation Analysis (SWICCA).
arXiv Detail & Related papers (2025-07-23T20:35:15Z)
Efficient Conformance Checking of Rich Data-Aware Declare Specifications (Extended) [49.46686813437884]
We show that it is possible to compute data-aware optimal alignments in a rich setting with general data types and data conditions. This is achieved by carefully combining the two best-known approaches to deal with control flow and data dependencies.
arXiv Detail & Related papers (2025-06-30T10:16:21Z)
Unsupervised KPIs-Based Clustering of Jobs in HPC Data Centers [0.0]
Key Performance Indicators (KPIs) generate a huge number of monitoring tasks that give data about CPU usage, memory usage, network traffic, or other sensors that monitor hardware. The main contribution in this paper is to identify which metric/s (KPIs) is/are the most appropriate to identify/classify different types of jobs according to their behavior in the HPC system. We have concluded that (i) those metrics (KPIs) related to the Network (interface) traffic monitoring provide the best cohesion and separation to cluster HPC jobs, and (ii) hierarchical clustering algorithms are the most suitable for this task.
arXiv Detail & Related papers (2023-12-11T17:31:46Z)
Embedding in Recommender Systems: A Survey [54.55152033023537]
This survey presents a comprehensive analysis of advances in recommender system embedding techniques. In matrix-based scenarios, collaborative filtering generates embeddings that effectively model user-item preferences. We introduce emerging approaches, including AutoML, hashing techniques, and quantization methods, to enhance performance.
arXiv Detail & Related papers (2023-10-28T06:31:06Z)
QBSD: Quartile-Based Seasonality Decomposition for Cost-Effective RAN KPI Forecasting [0.18416014644193066]
We introduce QBSD, a live single-step forecasting approach tailored to optimize the trade-off between accuracy and computational complexity. QBSD has shown significant success with our real network RAN datasets of over several thousand cells. Results demonstrate that the proposed method excels in runtime efficiency compared to the leading algorithms available.
arXiv Detail & Related papers (2023-06-09T15:59:27Z)
Towards High-Performance Exploratory Data Analysis (EDA) Via Stable Equilibrium Point [5.825190876052149]
We introduce a stable equilibrium point (SEP) - based framework for improving the efficiency and solution quality of EDA. A very unique property of the proposed method is that the SEPs will directly encode the clustering properties of data sets.
arXiv Detail & Related papers (2023-06-07T13:31:57Z)
Federated Stochastic Gradient Descent Begets Self-Induced Momentum [151.4322255230084]
Federated learning (FL) is an emerging machine learning method that can be applied in mobile edge systems. We show that running stochastic gradient descent (SGD) in such a setting can be viewed as adding a momentum-like term to the global aggregation process.
arXiv Detail & Related papers (2022-02-17T02:01:37Z)
Reinforcement Learning with Heterogeneous Data: Estimation and Inference [84.72174994749305]
We introduce the K-Heterogeneous Markov Decision Process (K-Hetero MDP) to address sequential decision problems with population heterogeneity. We propose the Auto-Clustered Policy Evaluation (ACPE) for estimating the value of a given policy, and the Auto-Clustered Policy Iteration (ACPI) for estimating the optimal policy in a given policy class. We present simulations to support our theoretical findings, and we conduct an empirical study on the standard MIMIC-III dataset.
arXiv Detail & Related papers (2022-01-31T20:58:47Z)
Spatial-Spectral Clustering with Anchor Graph for Hyperspectral Image [88.60285937702304]
This paper proposes a novel unsupervised approach called spatial-spectral clustering with anchor graph (SSCAG) for HSI data clustering. The proposed SSCAG is competitive against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-04-24T08:09:27Z)
Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models. Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings. We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
Cross-Gradient Aggregation for Decentralized Learning from Non-IID data [34.23789472226752]
Decentralized learning enables a group of collaborative agents to learn models using a distributed dataset without the need for a central parameter server. We propose Cross-Gradient Aggregation (CGA), a novel decentralized learning algorithm. We show superior learning performance of CGA over existing state-of-the-art decentralized learning algorithms.
arXiv Detail & Related papers (2021-03-02T21:58:12Z)
Correlation-wise Smoothing: Lightweight Knowledge Extraction for HPC Monitoring Data [1.802439717192088]
We propose a novel method, called Correlation-wise Smoothing (CS), to extract descriptive signatures from time-series monitoring data. Our method exploits correlations between data dimensions to form groups and produces image-like signatures that can be easily manipulated, visualized and compared. We evaluate the CS method on HPC-ODA, a collection of datasets that we release with this work, and show that it leads to the same performance as most state-of-the-art methods.
arXiv Detail & Related papers (2020-10-13T05:22:47Z)
Topology-based Clusterwise Regression for User Segmentation and Demand Forecasting [63.78344280962136]
Using a public and a novel proprietary data set of commercial data, this research shows that the proposed system enables analysts to both cluster their user base and plan demand at a granular level. This work seeks to introduce TDA-based clustering of time series and clusterwise regression with matrix factorization methods as viable tools for the practitioner.
arXiv Detail & Related papers (2020-09-08T12:10:10Z)
Superiority of Simplicity: A Lightweight Model for Network Device Workload Prediction [58.98112070128482]
We propose a lightweight solution for series prediction based on historic observations. It consists of a heterogeneous ensemble method composed of two models - a neural network and a mean predictor. It achieves an overall $R^2$ score of 0.10 on the available FedCSIS 2020 challenge dataset.
arXiv Detail & Related papers (2020-07-07T15:44:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.