CAROL: Confidence-Aware Resilience Model for Edge Federations
- URL: http://arxiv.org/abs/2203.07140v1
- Date: Mon, 14 Mar 2022 14:37:31 GMT
- Title: CAROL: Confidence-Aware Resilience Model for Edge Federations
- Authors: Shreshth Tuli, Giuliano Casale and Nicholas R. Jennings
- Abstract summary: We present a confidence-aware resilience model, CAROL, that utilizes a memory-efficient generative neural network to predict the Quality of Service (QoS) for a future state and a confidence score for each prediction.
CAROL outperforms state-of-the-art resilience schemes by reducing the energy consumption, deadline violation rates and resilience overheads by up to 16, 17 and 36 percent, respectively.
- Score: 13.864161788250856
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, the deployment of large-scale Internet of Things (IoT)
applications has given rise to edge federations that seamlessly interconnect
and leverage resources from multiple edge service providers. The requirement of
supporting both latency-sensitive and compute-intensive IoT tasks necessitates
service resilience, especially for the broker nodes in typical broker-worker
deployment designs. Existing fault-tolerance or resilience schemes often lack
robustness and generalization capability in non-stationary workload settings.
This is typically due to the expensive periodic fine-tuning of models required
to adapt them in dynamic scenarios. To address this, we present a
confidence-aware resilience model, CAROL, that utilizes a memory-efficient generative
neural network to predict the Quality of Service (QoS) for a future state and a
confidence score for each prediction. Thus, whenever a broker fails, we quickly
recover the system by executing a local search over the broker-worker topology
space to optimize future QoS. The confidence score enables us to keep track of
the prediction performance and run parsimonious neural network fine-tuning to
avoid excessive overheads, further improving the QoS of the system. Experiments
on a Raspberry Pi-based edge testbed with IoT benchmark applications show that
CAROL outperforms state-of-the-art resilience schemes by reducing the energy
consumption, deadline violation rates and resilience overheads by up to 16, 17
and 36 percent, respectively.
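The recovery loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The topology encoding (a worker-to-broker mapping), the names `neighbours`, `local_search`, `recover` and `toy_predict`, the greedy hill-climbing strategy, and the 0.5 confidence threshold are all assumptions made for illustration. The toy predictor stands in for CAROL's generative neural network and simply penalises unbalanced broker load.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical encoding: worker name -> name of the broker that manages it.
Topology = Dict[str, str]
# Hypothetical predictor: topology -> (predicted QoS cost, confidence in [0, 1]).
Predictor = Callable[[Topology], Tuple[float, float]]


def neighbours(topology: Topology, brokers: List[str]) -> List[Topology]:
    """Return every topology obtained by moving one worker to another broker."""
    result = []
    for worker, current in topology.items():
        for broker in brokers:
            if broker != current:
                candidate = dict(topology)
                candidate[worker] = broker
                result.append(candidate)
    return result


def local_search(start: Topology, brokers: List[str], predict: Predictor,
                 max_steps: int = 100) -> Tuple[Topology, float, float]:
    """Greedy hill climbing: keep moving to the best neighbour while cost drops."""
    best = start
    best_cost, best_conf = predict(best)
    for _ in range(max_steps):
        current = best
        improved = False
        for candidate in neighbours(current, brokers):
            cost, conf = predict(candidate)
            if cost < best_cost:  # lower predicted cost means better QoS
                best, best_cost, best_conf = candidate, cost, conf
                improved = True
        if not improved:
            break  # local optimum reached
    return best, best_cost, best_conf


def recover(topology: Topology, failed: str, brokers: List[str],
            predict: Predictor,
            confidence_threshold: float = 0.5) -> Tuple[Topology, float, bool]:
    """Rebuild the topology after a broker failure and flag low confidence."""
    survivors = [b for b in brokers if b != failed]
    if not survivors:
        raise ValueError("no surviving broker to take over the workers")
    # Starting point: attach orphaned workers to the first surviving broker.
    start = {w: (survivors[0] if b == failed else b) for w, b in topology.items()}
    best, cost, conf = local_search(start, survivors, predict)
    # Fine-tune the predictor only when it is unsure about its own output.
    needs_fine_tuning = conf < confidence_threshold
    return best, cost, needs_fine_tuning


def toy_predict(topology: Topology) -> Tuple[float, float]:
    """Stand-in predictor: cost is the sum of squared broker loads."""
    loads: Dict[str, int] = {}
    for broker in topology.values():
        loads[broker] = loads.get(broker, 0) + 1
    return float(sum(n * n for n in loads.values())), 0.9
```

Calling `recover(topology, "b1", brokers, toy_predict)` on a small federation returns a topology that no longer references the failed broker, along with a flag indicating whether the predictor's confidence fell below the threshold. Gating fine-tuning on that flag is one simple way to keep retraining overhead low, in the spirit of the parsimonious fine-tuning the abstract mentions.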
Related papers
- Towards Resource-Efficient Federated Learning in Industrial IoT for Multivariate Time Series Analysis [50.18156030818883]
Anomalies and missing data constitute a thorny problem in industrial applications.
Deep learning-enabled anomaly detection has emerged as a critical direction.
The data collected on edge devices contain private user information.
arXiv Detail & Related papers (2024-11-06T15:38:31Z)
- SafeTail: Efficient Tail Latency Optimization in Edge Service Scheduling via Computational Redundancy Management [2.707215971599082]
Emerging applications, such as augmented reality, require low-latency computing services with high reliability on user devices.
We introduce SafeTail, a framework that meets both median and tail response time targets, with tail latency defined as latency beyond the 90th percentile threshold.
arXiv Detail & Related papers (2024-08-30T10:17:37Z)
- Digital Twin-Assisted Data-Driven Optimization for Reliable Edge Caching in Wireless Networks [60.54852710216738]
We introduce a novel digital twin-assisted optimization framework, called D-REC, to ensure reliable caching in nextG wireless networks.
By incorporating reliability modules into a constrained decision process, D-REC can adaptively adjust actions, rewards, and states to comply with advantageous constraints.
arXiv Detail & Related papers (2024-06-29T02:40:28Z)
- ETuner: A Redundancy-Aware Framework for Efficient Continual Learning Application on Edge Devices [47.365775210055396]
We propose ETuner, an efficient edge continual learning framework that optimizes inference accuracy, fine-tuning execution time, and energy efficiency.
Experimental results show that, on average, ETuner reduces overall fine-tuning execution time by 64% and energy consumption by 56%, and improves average inference accuracy by 1.75% over the immediate model fine-tuning approach.
arXiv Detail & Related papers (2024-01-30T02:41:05Z)
- Adaptive ResNet Architecture for Distributed Inference in Resource-Constrained IoT Systems [7.26437825413781]
This paper presents an empirical study that identifies the connections in ResNet that can be dropped without significantly impacting the model's performance.
Our experiments demonstrate that an adaptive ResNet architecture can reduce shared data, energy consumption, and latency throughout the distribution.
arXiv Detail & Related papers (2023-07-21T11:07:21Z)
- Adaptive Federated Pruning in Hierarchical Wireless Networks [69.6417645730093]
Federated Learning (FL) is a privacy-preserving distributed learning framework where a server aggregates models updated by multiple devices without accessing their private datasets.
In this paper, we introduce model pruning for hierarchical FL (HFL) in wireless networks to reduce the neural network scale.
We show that our proposed HFL with model pruning achieves learning accuracy similar to that of HFL without model pruning and reduces communication cost by about 50 percent.
arXiv Detail & Related papers (2023-05-15T22:04:49Z)
- FIRE: A Failure-Adaptive Reinforcement Learning Framework for Edge Computing Migrations [52.85536740465277]
FIRE is a framework that adapts to rare events by training an RL policy in an edge computing digital twin environment.
We propose ImRE, an importance sampling-based Q-learning algorithm, which samples rare events proportionally to their impact on the value function.
We show that FIRE reduces costs compared to vanilla RL and the greedy baseline in the event of failures.
arXiv Detail & Related papers (2022-09-28T19:49:39Z)
- PreGAN: Preemptive Migration Prediction Network for Proactive Fault-Tolerant Edge Computing [12.215537834860699]
We propose PreGAN, a composite AI model using a Generative Adversarial Network (GAN) to predict preemptive migration decisions for proactive fault-tolerance.
PreGAN can outperform state-of-the-art baseline methods in fault-detection, diagnosis and classification, thus achieving high quality of service.
arXiv Detail & Related papers (2021-12-04T09:40:50Z)
- Appliance Level Short-term Load Forecasting via Recurrent Neural Network [6.351541960369854]
We present an STLF algorithm for efficiently predicting the power consumption of individual electrical appliances.
The proposed method builds upon a powerful recurrent neural network (RNN) architecture in deep learning.
arXiv Detail & Related papers (2021-11-23T16:56:37Z)
- Federated Learning with Unreliable Clients: Performance Analysis and Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients.
However, low-quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training.
We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z)
- The Benefit of the Doubt: Uncertainty Aware Sensing for Edge Computing Platforms [10.86298377998459]
We propose an efficient framework for predictive uncertainty estimation in NNs deployed on embedded edge systems.
The framework is built from the ground up to provide predictive uncertainty based only on one forward pass.
Our approach not only obtains robust and accurate uncertainty estimations but also outperforms state-of-the-art methods in terms of systems performance.
arXiv Detail & Related papers (2021-02-11T11:44:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.