Enhancing the Analysis of Software Failures in Cloud Computing Systems
with Deep Learning
- URL: http://arxiv.org/abs/2106.15182v1
- Date: Tue, 29 Jun 2021 09:00:41 GMT
- Title: Enhancing the Analysis of Software Failures in Cloud Computing Systems
with Deep Learning
- Authors: Domenico Cotroneo, Luigi De Simone, Pietro Liguori, Roberto Natella
- Abstract summary: This paper presents a novel approach for analyzing failure data from cloud systems, in order to relieve human analysts from manually fine-tuning the data for feature engineering.
The approach leverages Deep Embedded Clustering (DEC), a family of unsupervised clustering algorithms based on deep learning.
The results show that the performance of the proposed approach, in terms of purity of clusters, is comparable to, or in some cases even better than manually fine-tuned clustering.
- Score: 0.11470070927586014
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Identifying the failure modes of cloud computing systems is a difficult and
time-consuming task, due to the growing complexity of such systems, and the
large volume and noisiness of failure data. This paper presents a novel
approach for analyzing failure data from cloud systems, in order to relieve
human analysts from manually fine-tuning the data for feature engineering. The
approach leverages Deep Embedded Clustering (DEC), a family of unsupervised
clustering algorithms based on deep learning, which uses an autoencoder to
optimize data dimensionality and inter-cluster variance. We applied the
approach in the context of the OpenStack cloud computing platform, both on the
raw failure data and in combination with an anomaly detection pre-processing
algorithm. The results show that the performance of the proposed approach, in
terms of purity of clusters, is comparable to, or in some cases even better
than manually fine-tuned clustering, thus avoiding the need for deep domain
knowledge and reducing the effort to perform the analysis. In all cases, the
proposed approach provides better performance than unsupervised clustering when
no feature engineering is applied to the data. Moreover, the distribution of
failure modes from the proposed approach is closer to the actual frequency of
the failure modes.
Related papers
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n2) to O(kn) with k n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z) - Hard Regularization to Prevent Deep Online Clustering Collapse without
Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation, and that, differently from existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z) - A Deep Dive into Deep Cluster [0.2578242050187029]
DeepCluster is a simple and scalable unsupervised pretraining of visual representations.
We show that DeepCluster convergence and performance depend on the interplay between the quality of the randomly filters of the convolutional layer and the selected number of clusters.
arXiv Detail & Related papers (2022-07-24T22:55:09Z) - KnAC: an approach for enhancing cluster analysis with background
knowledge and explanations [0.20999222360659603]
We present Knowledge Augmented Clustering (KnAC), which main goal is to confront expert-based labelling with automated clustering.
KnAC can serve as an augmentation of an arbitrary clustering algorithm, making the approach robust and model-agnostic.
arXiv Detail & Related papers (2021-12-16T10:13:47Z) - Sample-based and Feature-based Federated Learning via Mini-batch SSCA [18.11773963976481]
This paper investigates sample-based and feature-based federated optimization.
We show that the proposed algorithms can preserve data privacy through the model aggregation mechanism.
We also show that the proposed algorithms converge to Karush-Kuhn-Tucker points of the respective federated optimization problems.
arXiv Detail & Related papers (2021-04-13T08:23:46Z) - Quasi-Global Momentum: Accelerating Decentralized Deep Learning on
Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z) - Data-Driven Robust Optimization using Unsupervised Deep Learning [0.0]
We show that a trained neural network can be integrated into a robust optimization model by formulating the adversarial problem as a convex mixed-integer program.
We find that this approach outperforms a similar approach using kernel-based support vector sets.
arXiv Detail & Related papers (2020-11-19T11:06:54Z) - Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge
Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z) - A black-box adversarial attack for poisoning clustering [78.19784577498031]
We propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms.
We show that our attacks are transferable even against supervised algorithms such as SVMs, random forests, and neural networks.
arXiv Detail & Related papers (2020-09-09T18:19:31Z) - Learning while Respecting Privacy and Robustness to Distributional
Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z) - EXPLAIN-IT: Towards Explainable AI for Unsupervised Network Traffic
Analysis [7.447122949368314]
We introduce EXPLAIN-IT, a methodology which deals with unlabeled data, creates meaningful clusters, and suggests an explanation to the clustering results for the end-user.
We apply EXPLAIN-IT to the problem of YouTube video quality classification under encrypted traffic scenarios, showing promising results.
arXiv Detail & Related papers (2020-03-03T17:54:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.