Intelligent Proactive Fault Tolerance at the Edge through Resource Usage
Prediction
- URL: http://arxiv.org/abs/2302.05336v1
- Date: Thu, 9 Feb 2023 00:42:34 GMT
- Title: Intelligent Proactive Fault Tolerance at the Edge through Resource Usage
Prediction
- Authors: Theodoros Theodoropoulos, John Violos, Stylianos Tsanakas, Aris
Leivadeas, Konstantinos Tserpes, Theodora Varvarigou
- Abstract summary: We propose an Intelligent Proactive Fault Tolerance (IPFT) method that leverages edge resource usage predictions through Recurrent Neural Networks (RNN).
In this paper, we focus on process-faults, which are related to the inability of the infrastructure to provide Quality of Service (QoS) in acceptable ranges due to a lack of processing power.
- Score: 0.7046417074932255
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The proliferation of demanding applications and edge computing establishes
the need for efficient management of the underlying computing
infrastructures, urging providers to rethink their operational methods. In
this paper, we propose an Intelligent Proactive Fault Tolerance (IPFT) method
that leverages the edge resource usage predictions through Recurrent Neural
Networks (RNN). More specifically, we focus on process-faults, which are
related to the inability of the infrastructure to provide Quality of Service
(QoS) in acceptable ranges due to a lack of processing power. In order to
tackle this challenge, we propose a composite deep learning architecture that
predicts the resource usage metrics of the edge nodes and triggers proactive
node replications and task migration. Since the
edge computing infrastructure is also highly dynamic and heterogeneous, we
propose an innovative Hybrid Bayesian Evolution Strategy (HBES) algorithm for
automated adaptation of the resource usage models. The proposed resource usage
prediction mechanism has been experimentally evaluated and compared with other
state-of-the-art methods, showing significant improvements in terms of Root Mean
Squared Error (RMSE) and Mean Absolute Error (MAE). Additionally, the IPFT
mechanism that leverages the resource usage predictions has been evaluated in
an extensive simulation in CloudSim Plus and the results show significant
improvement compared to the reactive fault tolerance method in terms of
reliability and maintainability.
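As a minimal sketch of the ideas named in the abstract, and not the authors' implementation, the two error metrics (RMSE and MAE) and a threshold-style proactive trigger could look as follows. The function name needs_proactive_action and the 0.8 CPU-utilisation threshold are assumptions made for illustration; the abstract does not state how the trigger is defined, and the forecast values would come from whatever prediction model is in use.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Squared Error between two equal-length series."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error between two equal-length series."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def needs_proactive_action(forecast_cpu: Sequence[float],
                           threshold: float = 0.8) -> bool:
    """Hypothetical trigger (assumption, not from the paper): flag a node
    for replication or task migration when any forecast CPU utilisation
    in the horizon exceeds the threshold."""
    return any(u > threshold for u in forecast_cpu)


if __name__ == "__main__":
    observed = [0.50, 0.60, 0.70]
    forecast = [0.55, 0.65, 0.90]
    print("RMSE:", rmse(observed, forecast))
    print("MAE:", mae(observed, forecast))
    print("Act proactively:", needs_proactive_action(forecast))
```

RMSE penalises large individual errors more heavily than MAE, which is one reason the two are commonly reported together when comparing prediction models.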
Related papers
- DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z)
- Enhancing Reliability of Neural Networks at the Edge: Inverted Normalization with Stochastic Affine Transformations [0.22499166814992438]
We propose a method to inherently enhance the robustness and inference accuracy of BayNNs deployed in in-memory computing architectures.
Empirical results show a graceful degradation in inference accuracy, with an improvement of up to 58.11%.
arXiv Detail & Related papers (2024-01-23T00:27:31Z) - Hybrid Reinforcement Learning for Optimizing Pump Sustainability in
Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z) - Compressed Regression over Adaptive Networks [58.79251288443156]
We derive the performance achievable by a network of distributed agents that solve, adaptively and in the presence of communication constraints, a regression problem.
We devise an optimized allocation strategy where the parameters necessary for the optimization can be learned online by the agents.
arXiv Detail & Related papers (2023-04-07T13:41:08Z) - Evolutionary Optimization for Proactive and Dynamic Computing Resource
Allocation in Open Radio Access Network [4.9711284100869815]
Intelligent techniques are needed to achieve automatic allocation of computing resources in the Open Radio Access Network (O-RAN).
The existing formulation of this resource allocation problem is unsuitable because it defines the capacity utility of a resource inappropriately.
A new formulation that better describes the problem is proposed.
arXiv Detail & Related papers (2022-01-12T08:52:04Z) - CLARA: A Constrained Reinforcement Learning Based Resource Allocation
Framework for Network Slicing [19.990451009223573]
Network slicing is proposed as a promising solution for resource utilization in 5G and future networks.
We formulate the problem as a Constrained Markov Decision Process (CMDP) without knowing models and hidden structures.
We propose to solve the problem using CLARA, a Constrained reinforcement LeArning based Resource Allocation algorithm.
arXiv Detail & Related papers (2021-11-16T11:54:09Z) - On Efficient Uncertainty Estimation for Resource-Constrained Mobile
Applications [0.0]
Predictive uncertainty supplements model predictions and enables improved functionality of downstream tasks.
We tackle this problem by building upon Monte Carlo Dropout (MCDO) models using the Axolotl framework.
We conduct experiments on (1) a multi-class classification task using the CIFAR10 dataset, and (2) a more complex human body segmentation task.
arXiv Detail & Related papers (2021-11-11T22:24:15Z) - Robust lEarned Shrinkage-Thresholding (REST): Robust unrolling for
sparse recover [87.28082715343896]
We consider deep neural networks for solving inverse problems that are robust to forward model mis-specifications.
We design a new robust deep neural network architecture by applying algorithm unfolding techniques to a robust version of the underlying recovery problem.
The proposed REST network is shown to outperform state-of-the-art model-based and data-driven algorithms in both compressive sensing and radar imaging problems.
arXiv Detail & Related papers (2021-10-20T06:15:45Z) - Reconfigurable Intelligent Surface Assisted Mobile Edge Computing with
Heterogeneous Learning Tasks [53.1636151439562]
Mobile edge computing (MEC) provides a natural platform for AI applications.
We present an infrastructure to perform machine learning tasks at an MEC with the assistance of a reconfigurable intelligent surface (RIS).
Specifically, we minimize the learning error of all participating users by jointly optimizing transmit power of mobile users, beamforming vectors of the base station, and the phase-shift matrix of the RIS.
arXiv Detail & Related papers (2020-12-25T07:08:50Z) - Resource Allocation via Model-Free Deep Learning in Free Space Optical
Communications [119.81868223344173]
The paper investigates the general problem of resource allocation for mitigating channel fading effects in Free Space Optical (FSO) communications.
Under this framework, we propose two algorithms that solve FSO resource allocation problems.
arXiv Detail & Related papers (2020-07-27T17:38:51Z) - Deep Learning-based Resource Allocation for Infrastructure Resilience [0.5249805590164901]
Decision-makers can use our trained models to allocate resources more efficiently after contingencies.
We showcase our methodology using the real-world interdependent infrastructure of Shelby County, TN.
arXiv Detail & Related papers (2020-07-12T00:48:15Z)