Related papers: Employing Software Diversity in Cloud Microservices to Engineer Reliable and Performant Systems

URL: http://arxiv.org/abs/2407.07287v1
Date: Wed, 10 Jul 2024 00:34:39 GMT
Title: Employing Software Diversity in Cloud Microservices to Engineer Reliable and Performant Systems
Authors: Nazanin Akhtarian, Hamzeh Khazaei, Marin Litoiu
Abstract summary: This work proposes employing software diversity to enhance system reliability and performance simultaneously. A cornerstone of our work is the derivation of a reliability metric. The goal is to maintain a higher replica count for more reliable versions while preserving the diversity of versions as much as possible.
Score: 2.412158290827225
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the ever-shifting landscape of software engineering, we recognize the need for adaptation and evolution to maintain system dependability. As each software iteration potentially introduces new challenges, from unforeseen bugs to performance anomalies, it becomes paramount to understand and address these intricacies to ensure robust system operations during the lifetime. This work proposes employing software diversity to enhance system reliability and performance simultaneously. A cornerstone of our work is the derivation of a reliability metric. This metric encapsulates the reliability and performance of each software version under adverse conditions. Using the calculated reliability score, we implemented a dynamic controller responsible for adjusting the population of each software version. The goal is to maintain a higher replica count for more reliable versions while preserving the diversity of versions as much as possible. This balance is crucial for ensuring not only the reliability but also the performance of the system against a spectrum of potential failures. In addition, we designed and implemented a diversity-aware autoscaling algorithm that maintains the reliability and performance of the system at the same time and at any scale. Our extensive experiments on realistic cloud microservice-based applications show the effectiveness of the proposed approach in this paper in promoting both reliability and performance.
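
Illustration (not from the paper): the abstract describes a controller that gives more replicas to more reliable versions while keeping every version in the pool. The abstract does not state the reliability metric or the control law, so the following is only a minimal sketch under assumptions. The function name `allocate_replicas`, the `min_per_version` floor, and the proportional split with largest-remainder rounding are all hypothetical choices made for illustration.

```python
def allocate_replicas(scores, total, min_per_version=1):
    """Split `total` replicas across versions (hypothetical sketch).

    scores: dict mapping version name -> non-negative reliability score.
    Every version keeps at least `min_per_version` replicas (diversity floor);
    the remaining replicas are shared in proportion to the scores.
    """
    if not scores:
        raise ValueError("at least one version is required")
    if any(s < 0 for s in scores.values()):
        raise ValueError("scores must be non-negative")
    floor_total = min_per_version * len(scores)
    if total < floor_total:
        raise ValueError("total is too small to keep every version alive")

    spare = total - floor_total
    weight_sum = sum(scores.values())
    if weight_sum == 0:
        # No version stands out: share the spare replicas evenly.
        shares = {v: spare / len(scores) for v in scores}
    else:
        shares = {v: spare * s / weight_sum for v, s in scores.items()}

    # Integer part of each share on top of the diversity floor.
    alloc = {v: min_per_version + int(shares[v]) for v in scores}

    # Largest-remainder rounding: hand out the replicas lost to truncation,
    # preferring larger fractional parts, then higher scores.
    leftover = total - sum(alloc.values())
    order = sorted(
        scores,
        key=lambda v: (shares[v] - int(shares[v]), scores[v]),
        reverse=True,
    )
    for v in order[:leftover]:
        alloc[v] += 1
    return alloc


if __name__ == "__main__":
    print(allocate_replicas({"v1": 3, "v2": 2, "v3": 1}, total=9))
```

Under these assumptions the least reliable version never drops to zero replicas, which is one simple way to read "preserving the diversity of versions as much as possible"; the paper's actual controller and autoscaler may work quite differently.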

Related papers

Towards a Robust Quality Assurance Framework for Cloud Computing Environments [0.0]
Current QA frameworks are poorly defined, often not automated, and lack the flexibility needed for on-demand, cloud based environments. This paper presents a detailed framework for QA in cloud computing systems and advocates for standardized, automated, and adaptable systems.
arXiv Detail & Related papers (2025-02-19T08:29:24Z)
FRAMER/Miu: Tagged Pointer-based Capability and Fundamental Cost of Memory Safety & Coherence (Position Paper) [0.0]
Researchers make trade-offs between performance, detection coverage, interoperability, precision, and detection timing. This research presents a tagged pointer-based capability system as a stand-alone software solution and a prototype for future hardware design.
arXiv Detail & Related papers (2024-08-27T17:31:26Z)
Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs). The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation. We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z)
Using rule engine in self-healing systems and MAPE model [0.0]
This study presents a failure repair method that uses a rule engine. The simulation on mRUBIS showed that the proposed method could be efficient in the operational environment. This, in turn, reduces the repercussions of failures and cultivates increased confidence in digital technologies.
arXiv Detail & Related papers (2024-02-18T13:03:11Z)
Modelling Open-Source Software Reliability Incorporating Swarm Intelligence-Based Techniques [0.0]
In the software industry, two software engineering best practices coexist: open-source and closed-source software. Applying meta-heuristic optimization algorithms for closed-source software reliability prediction has produced significant and accurate results. Results on open-source software reliability - as a quality indicator - would greatly help solve the open-source software reliability growth-modelling problem.
arXiv Detail & Related papers (2024-01-05T06:46:03Z)
A Holistic Assessment of the Reliability of Machine Learning Systems [30.638615396429536]
This paper proposes a holistic assessment methodology for the reliability of machine learning (ML) systems. Our framework evaluates five key properties: in-distribution accuracy, distribution-shift robustness, adversarial robustness, calibration, and out-of-distribution detection. To provide insights into the performance of different algorithmic approaches, we identify and categorize state-of-the-art techniques.
arXiv Detail & Related papers (2023-07-20T05:00:13Z)
Did You Mean...? Confidence-based Trade-offs in Semantic Parsing [52.28988386710333]
We show how a calibrated model can help balance common trade-offs in task-oriented parsing. We then examine how confidence scores can help optimize the trade-off between usability and safety.
arXiv Detail & Related papers (2023-03-29T17:07:26Z)
MMRNet: Improving Reliability for Multimodal Object Detection and Segmentation for Bin Picking via Multimodal Redundancy [68.7563053122698]
We propose a reliable object detection and segmentation system with MultiModal Redundancy (MMRNet). This is the first system that introduces the concept of multimodal redundancy to address sensor failure issues during deployment. We present a new label-free multi-modal consistency (MC) score that utilizes the output from all modalities to measure the overall system output reliability and uncertainty.
arXiv Detail & Related papers (2022-10-19T19:15:07Z)
PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures [65.36234499099294]
We propose a new data augmentation strategy utilizing the natural structural complexity of pictures such as fractals.
arXiv Detail & Related papers (2021-12-09T18:59:31Z)
Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design. We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z)
Efficient Empowerment Estimation for Unsupervised Stabilization [75.32013242448151]
The empowerment principle enables unsupervised stabilization of dynamical systems at upright positions. We propose an alternative solution based on a trainable representation of a dynamical system as a Gaussian channel. We show that our method has a lower sample complexity, is more stable in training, possesses the essential properties of the empowerment function, and allows estimation of empowerment from images.
arXiv Detail & Related papers (2020-07-14T21:10:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.