Understanding Container-based Services under Software Aging:
Dependability and Performance Views
- URL: http://arxiv.org/abs/2308.12784v1
- Date: Thu, 24 Aug 2023 13:40:26 GMT
- Title: Understanding Container-based Services under Software Aging:
Dependability and Performance Views
- Authors: Jing Bai, Xiaolin Chang, Fumio Machida, Kishor S. Trivedi
- Abstract summary: We show the optimal container-migration trigger intervals that can maximize the dependability or minimize the performance of a container-based service.
This paper proposes a comprehensive semi-Markov-based approach to quantitatively evaluate the effect of OS rejuvenation on the dependability and the performance of a container-based service.
- Score: 5.2135218089240185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Container technology, as the key enabler behind microservice architectures,
is widely applied in Cloud and Edge Computing. A long and continuous running of
operating system (OS) hosting container-based services can encounter software
aging that leads to performance deterioration and even causes system failures.
OS rejuvenation techniques can mitigate the impact of software aging but the
rejuvenation trigger interval needs to be carefully determined to reduce the
downtime cost due to rejuvenation. This paper proposes a comprehensive
semi-Markov-based approach to quantitatively evaluate the effect of OS
rejuvenation on the dependability and the performance of a container-based
service. In contrast to the existing studies, we neither restrict the
distributions of time intervals of events to be exponential nor assume that
backup resources are always available. Through the numerical study, we show
the optimal container-migration trigger intervals that can maximize the
dependability or minimize the performance of a container-based service.
Related papers
- SafeTail: Efficient Tail Latency Optimization in Edge Service Scheduling via Computational Redundancy Management [2.707215971599082]
Emerging applications, such as augmented reality, require low-latency computing services with high reliability on user devices.
We introduce SafeTail, a framework that meets both median and tail response time targets, with tail latency defined as latency beyond the 90th percentile threshold.
arXiv Detail & Related papers (2024-08-30T10:17:37Z)
- The Fusion of Deep Reinforcement Learning and Edge Computing for Real-time Monitoring and Control Optimization in IoT Environments [2.0380092516669235]
This paper proposes an optimization control system based on deep reinforcement learning and edge computing.
Results demonstrate that this approach reduces cloud-edge communication latency, accelerates response to abnormal situations, reduces system failure rates, extends average equipment operating time, and saves costs for manual maintenance and replacement.
arXiv Detail & Related papers (2024-02-28T12:01:06Z)
- Client Orchestration and Cost-Efficient Joint Optimization for NOMA-Enabled Hierarchical Federated Learning [55.49099125128281]
We propose a non-orthogonal multiple access (NOMA) enabled HFL system under semi-synchronous cloud model aggregation.
We show that the proposed scheme outperforms the considered benchmarks regarding HFL performance improvement and total cost reduction.
arXiv Detail & Related papers (2023-11-03T13:34:44Z)
- TranDRL: A Transformer-Driven Deep Reinforcement Learning Enabled Prescriptive Maintenance Framework [58.474610046294856]
Industrial systems demand reliable predictive maintenance strategies to enhance operational efficiency and reduce downtime.
This paper introduces an integrated framework that leverages the capabilities of the Transformer model-based neural networks and deep reinforcement learning (DRL) algorithms to optimize system maintenance actions.
arXiv Detail & Related papers (2023-09-29T02:27:54Z)
- DeepScaler: Holistic Autoscaling for Microservices Based on Spatiotemporal GNN with Adaptive Graph Learning [4.128665560397244]
This paper presents DeepScaler, a deep learning-based holistic autoscaling approach.
It focuses on coping with service dependencies to optimize service-level agreements (SLA) assurance and cost efficiency.
Experimental results demonstrate that our method implements a more effective autoscaling mechanism for microservices.
arXiv Detail & Related papers (2023-09-02T08:22:21Z)
- Dynamic Scheduling for Federated Edge Learning with Streaming Data [56.91063444859008]
We consider a Federated Edge Learning (FEEL) system where training data are randomly generated over time at a set of distributed edge devices with long-term energy constraints.
Due to limited communication resources and latency requirements, only a subset of devices is scheduled for participating in the local training process in every iteration.
arXiv Detail & Related papers (2023-05-02T07:41:16Z)
- Time-sensitive Learning for Heterogeneous Federated Edge Intelligence [52.83633954857744]
We investigate real-time machine learning in a federated edge intelligence (FEI) system.
FEI systems exhibit heterogeneous communication and computational resource distribution.
We propose a time-sensitive federated learning (TS-FL) framework to minimize the overall run-time for collaboratively training a shared ML model.
arXiv Detail & Related papers (2023-01-26T08:13:22Z)
- Innovations in the field of on-board scheduling technologies [64.41511459132334]
This paper proposes an onboard scheduler that integrates into an onboard software framework for mission autonomy.
The scheduler is based on linear integer programming and relies on the use of a branch-and-cut solver.
The technology has been tested on an Earth Observation scenario, comparing its performance against the state-of-the-art scheduling technology.
arXiv Detail & Related papers (2022-05-04T12:00:49Z)
- CAROL: Confidence-Aware Resilience Model for Edge Federations [13.864161788250856]
We present a confidence aware resilience model, CAROL, that utilizes a memory-efficient generative neural network to predict the Quality of Service (QoS) for a future state and a confidence score for each prediction.
CAROL outperforms state-of-the-art resilience schemes by reducing the energy consumption, deadline violation rates and resilience overheads by up to 16, 17 and 36 percent, respectively.
arXiv Detail & Related papers (2022-03-14T14:37:31Z)
- Queue-Learning: A Reinforcement Learning Approach for Providing Quality of Service [1.8477401359673706]
Service-rate control is a common mechanism for providing guarantees in service systems.
In this paper, we introduce a reinforcement learning-based (RL-based) service-rate controller.
Our controller provides explicit probabilistic guarantees on the end-to-end delay of the system.
arXiv Detail & Related papers (2021-01-12T17:28:57Z)
- Untangling tradeoffs between recurrence and self-attention in neural networks [81.30894993852813]
We present a formal analysis of how self-attention affects gradient propagation in recurrent networks.
We prove that it mitigates the problem of vanishing gradients when trying to capture long-term dependencies.
We propose a relevancy screening mechanism that allows for a scalable use of sparse self-attention with recurrence.
arXiv Detail & Related papers (2020-06-16T19:24:25Z)