Towards Learned Predictability of Storage Systems
- URL: http://arxiv.org/abs/2307.16288v1
- Date: Sun, 30 Jul 2023 17:53:08 GMT
- Title: Towards Learned Predictability of Storage Systems
- Authors: Chenyuan Wu
- Abstract summary: Storage systems have become a fundamental building block of datacenters.
Despite the growing popularity of and interest in storage, designing and implementing reliable storage systems remains challenging.
To move towards predictability of storage systems, various mechanisms and field studies have been proposed in the past few years.
Based on three representative research works, we discuss where and how machine learning should be applied in this field.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid development of cloud computing and big data technologies,
storage systems have become a fundamental building block of datacenters,
incorporating hardware innovations such as flash solid state drives and
non-volatile memories, as well as software infrastructures such as RAID and
distributed file systems. Despite the growing popularity of and interest in
storage, designing and implementing reliable storage systems remains
challenging, due to their performance instability and prevailing hardware
failures.
Proactive prediction greatly strengthens the reliability of storage systems.
There are two dimensions of prediction: performance and failure. Ideally,
by detecting slow I/O requests in advance and predicting device
failures before they happen, we can build storage systems with
especially low tail latency and high availability. Although its importance is well
recognized, such proactive prediction in storage systems is
particularly difficult. To move towards predictability of storage systems,
various mechanisms and field studies have been proposed in the past few years.
In this report, we present a survey of these mechanisms and field studies,
focusing on machine learning based black-box approaches. Based on three
representative research works, we discuss where and how machine learning should
be applied in this field. The strengths and limitations of each research work
are also evaluated in detail.
Related papers
- Reproduction Research of FSA-Benchmark [0.0]
Failure-slow disks experience a gradual decline in performance before ultimately failing.
Unlike outright disk failures, fail-slow conditions can go undetected for prolonged periods, leading to considerable impacts on system performance and user experience.
arXiv Detail & Related papers (2024-12-12T01:31:11Z)
- Towards Weaknesses and Attack Patterns Prediction for IoT Devices [7.661561516558234]
This paper presents a cost-efficient platform to facilitate the pre-deployment security checks of IoT devices.
The platform employs a Bidirectional Long Short-Term Memory (Bi-LSTM) network to analyse device-related textual data and predict weaknesses.
At the same time, a Gradient Boosting Machine (GBM) model predicts likely attack patterns that could exploit these weaknesses.
arXiv Detail & Related papers (2024-08-23T15:43:51Z)
- Design and Implementation of an Automated Disaster-recovery System for a Kubernetes Cluster Using LSTM [0.0]
This study introduces a system structure that integrates management platforms with backup and restoration tools.
The experimental results show that this system executes the restoration process within 15 s without human intervention, enabling rapid recovery.
arXiv Detail & Related papers (2024-02-05T12:00:31Z)
- Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
- Multi Agent System for Machine Learning Under Uncertainty in Cyber Physical Manufacturing System [78.60415450507706]
Recent advancements in predictive machine learning have led to its application in various use cases in manufacturing.
Most research focused on maximising predictive accuracy without addressing the uncertainty associated with it.
In this paper, we determine the sources of uncertainty in machine learning and establish the success criteria of a machine learning system to function well under uncertainty.
arXiv Detail & Related papers (2021-07-28T10:28:05Z)
- Federated Learning for Intrusion Detection System: Concepts, Challenges and Future Directions [0.20236506875465865]
Intrusion detection systems play a significant role in ensuring security and privacy of smart devices.
The present paper aims to present an extensive and exhaustive review of the use of FL in intrusion detection systems.
arXiv Detail & Related papers (2021-06-16T13:13:04Z)
- Large-scale memory failure prediction using mcelog-based Data Mining and Machine Learning [0.0]
In the data center, unexpected downtime caused by memory failures can lead to a decline in the stability of the server.
This paper compares and summarizes some commonly used skills and the improvement they can bring.
The single model we proposed won the top 15th in the 2nd Alibaba Cloud AIOps Competition.
arXiv Detail & Related papers (2021-04-24T11:38:05Z)
- Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z)
- Online detection of failures generated by storage simulator [2.3859858429583665]
We create a Go-based (golang) package for simulating the behavior of modern storage infrastructure.
The package's flexible structure allows us to create a model of a real-world storage system with a number of components.
To discover failures in the time series distribution generated by the simulator, we modified a change point detection algorithm that works in online mode.
arXiv Detail & Related papers (2021-01-18T14:56:53Z)
- Dos and Don'ts of Machine Learning in Computer Security [74.1816306998445]
Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance.
We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems.
We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
arXiv Detail & Related papers (2020-10-19T13:09:31Z)
- Predictive Maintenance for Edge-Based Sensor Networks: A Deep Reinforcement Learning Approach [68.40429597811071]
The risk of unplanned equipment downtime can be minimized through Predictive Maintenance of revenue generating assets.
A model-free Deep Reinforcement Learning algorithm is proposed for predictive equipment maintenance from an equipment-based sensor network context.
Unlike traditional black-box regression models, the proposed algorithm self-learns an optimal maintenance policy and provides actionable recommendations for each piece of equipment.
arXiv Detail & Related papers (2020-07-07T10:00:32Z)
- Data Mining with Big Data in Intrusion Detection Systems: A Systematic Literature Review [68.15472610671748]
Cloud computing has become a powerful and indispensable technology for complex, high performance and scalable computation.
The rapid rate and volume of data creation have begun to pose significant challenges for data management and security.
The design and deployment of intrusion detection systems (IDS) in the big data setting has, therefore, become a topic of importance.
arXiv Detail & Related papers (2020-05-23T20:57:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.