Benchmarking Dynamic SLO Compliance in Distributed Computing Continuum Systems
- URL: http://arxiv.org/abs/2503.03274v1
- Date: Wed, 05 Mar 2025 08:56:26 GMT
- Title: Benchmarking Dynamic SLO Compliance in Distributed Computing Continuum Systems
- Authors: Alfreds Lapkovskis, Boris Sedlak, Sindri Magnússon, Schahram Dustdar, Praveen Kumar Donta
- Abstract summary: Ensuring Service Level Objectives (SLOs) in large-scale architectures is challenging due to their heterogeneous nature and varying service requirements. We present a benchmark of Active Inference -- an emerging method from neuroscience -- against three established reinforcement learning algorithms. We find that Active Inference is a promising approach for ensuring SLO compliance in DCCS, offering lower memory usage, stable CPU utilization, and fast convergence.
- Score: 9.820223170841219
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ensuring Service Level Objectives (SLOs) in large-scale architectures, such as Distributed Computing Continuum Systems (DCCS), is challenging due to their heterogeneous nature and varying service requirements across different devices and applications. Additionally, unpredictable workloads and resource limitations lead to fluctuating performance and violated SLOs. To improve SLO compliance in DCCS, one possibility is to apply machine learning; however, the design choices are often left to the developer. To this end, we provide a benchmark of Active Inference -- an emerging method from neuroscience -- against three established reinforcement learning algorithms (Deep Q-Network, Advantage Actor-Critic, and Proximal Policy Optimization). We consider a realistic DCCS use case: an edge device running a video conferencing application alongside a WebSocket server streaming videos. Using one of the respective algorithms, we continuously monitor key performance metrics, such as latency and bandwidth usage, to dynamically adjust parameters -- including the number of streams, frame rate, and resolution -- to optimize service quality and user experience. To test the algorithms' adaptability to constant system changes, we simulate dynamically changing SLOs and both instant and gradual data-shift scenarios, such as network bandwidth limitations and fluctuating device thermal states. Although all evaluated algorithms showed advantages and limitations, our findings demonstrate that Active Inference is a promising approach for ensuring SLO compliance in DCCS, offering lower memory usage, stable CPU utilization, and fast convergence.
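The control loop described in the abstract (monitor latency and bandwidth, then adjust the number of streams, frame rate, and resolution) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the metric readings are simulated, the SLO thresholds are made-up placeholders, and a simple threshold rule stands in for the learned agents (DQN, A2C, PPO, or Active Inference) that the benchmark actually compares.

```python
import random
from dataclasses import dataclass

# Hypothetical SLO thresholds (illustrative placeholders, not values from the paper).
SLO_MAX_LATENCY_MS = 150.0
SLO_MAX_BANDWIDTH_MBPS = 40.0

RESOLUTIONS = [(640, 360), (1280, 720), (1920, 1080)]


@dataclass
class StreamConfig:
    num_streams: int = 4
    fps: int = 30
    resolution: tuple = (1280, 720)


def read_metrics(cfg: StreamConfig) -> dict:
    """Stand-in for real monitoring: returns simulated latency and bandwidth
    that grow with the configured load."""
    load = cfg.num_streams * cfg.fps * cfg.resolution[0] * cfg.resolution[1]
    return {
        "latency_ms": load / 2.5e6 + random.uniform(-10, 10),
        "bandwidth_mbps": load / 3.0e6 + random.uniform(-2, 2),
    }


def adjust(cfg: StreamConfig, metrics: dict) -> StreamConfig:
    """Threshold rule standing in for the learned policy: degrade quality on an
    SLO violation, cautiously restore frame rate otherwise."""
    violated = (metrics["latency_ms"] > SLO_MAX_LATENCY_MS
                or metrics["bandwidth_mbps"] > SLO_MAX_BANDWIDTH_MBPS)
    if violated:
        if cfg.fps > 15:
            cfg.fps -= 5
        elif RESOLUTIONS.index(cfg.resolution) > 0:
            cfg.resolution = RESOLUTIONS[RESOLUTIONS.index(cfg.resolution) - 1]
        elif cfg.num_streams > 1:
            cfg.num_streams -= 1
    elif cfg.fps < 30:
        cfg.fps += 5
    return cfg


if __name__ == "__main__":
    cfg = StreamConfig()
    for step in range(20):
        metrics = read_metrics(cfg)
        cfg = adjust(cfg, metrics)
        print(f"step={step:2d} latency={metrics['latency_ms']:6.1f} ms "
              f"bw={metrics['bandwidth_mbps']:5.1f} Mbps "
              f"streams={cfg.num_streams} fps={cfg.fps} res={cfg.resolution}")
```

In the benchmark itself, the `adjust` step is where the compared agents differ: DQN, A2C, PPO, or Active Inference would pick the next configuration from the monitored state and keep adapting as the simulated SLOs and data shifts change.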
Related papers
- Transformer-Empowered Actor-Critic Reinforcement Learning for Sequence-Aware Service Function Chain Partitioning [1.9120720496423733]
We introduce a Transformer-empowered actor-critic framework specifically designed for sequence-aware SFC partitioning.
Our approach effectively models complex inter-dependencies among VNFs, facilitating coordinated and parallelized decision-making processes.
arXiv Detail & Related papers (2025-04-26T12:18:57Z) - EdgeMLBalancer: A Self-Adaptive Approach for Dynamic Model Switching on Resource-Constrained Edge Devices [0.0]
Machine learning on edge devices has enabled real-time AI applications in resource-constrained environments. Existing solutions for managing computational resources often focus narrowly on accuracy or energy efficiency. We propose a self-adaptive approach that optimizes CPU utilization and resource management on edge devices.
arXiv Detail & Related papers (2025-02-10T14:11:29Z) - Split Learning in Computer Vision for Semantic Segmentation Delay Minimization [25.0679083637967]
We propose a novel approach to minimize the inference delay in semantic segmentation using split learning (SL). SL is tailored to the needs of real-time computer vision (CV) applications for resource-constrained devices.
arXiv Detail & Related papers (2024-12-18T19:07:25Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses these demands by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Differentiable Discrete Event Simulation for Queuing Network Control [7.965453961211742]
Queueing network control poses distinct challenges, including high stochasticity, large state and action spaces, and lack of stability.
We propose a scalable framework for policy optimization based on differentiable discrete event simulation.
Our methods can flexibly handle realistic scenarios, including systems operating in non-stationary environments.
arXiv Detail & Related papers (2024-09-05T17:53:54Z) - Offloading and Quality Control for AI Generated Content Services in 6G Mobile Edge Computing Networks [18.723955271182007]
This paper proposes a joint optimization algorithm for offloading decisions, computation time, and diffusion steps of the diffusion models in the reverse diffusion stage.
Experimental results conclusively demonstrate that the proposed algorithm achieves superior joint optimization performance compared to the baselines.
arXiv Detail & Related papers (2023-12-11T08:36:27Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural
Networks on Edge NPUs [74.83613252825754]
"smart ecosystems" are being formed where sensing happens concurrently rather than standalone.
This is shifting the on-device inference paradigm towards deploying neural processing units (NPUs) at the edge.
We propose a novel early-exit scheduling that allows preemption at run time to account for the dynamicity introduced by the arrival and exiting processes.
arXiv Detail & Related papers (2022-09-27T15:04:01Z) - Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs [64.26714148634228]
Congestion control (CC) algorithms have become extremely difficult to design.
It is currently not possible to deploy AI models on network devices due to their limited computational capabilities.
We build a computationally-light solution based on a recent reinforcement learning CC algorithm.
arXiv Detail & Related papers (2022-07-05T20:42:24Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits via soft policy iterations.
Based on a latency- and accuracy-aware reward design, such a computation scheme can adapt well to complex environments, such as dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - Learning to Continuously Optimize Wireless Resource in a Dynamic
Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z) - Learning to Continuously Optimize Wireless Resource In Episodically
Dynamic Environment [55.91291559442884]
This work develops a methodology that enables data-driven methods to continuously learn and optimize in a dynamic environment.
We propose to build the notion of continual learning into the modeling process of learning wireless systems.
Our design is based on a novel min-max formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2020-11-16T08:24:34Z) - Reinforcement Learning on Computational Resource Allocation of
Cloud-based Wireless Networks [22.06811314358283]
Wireless networks used for Internet of Things (IoT) are expected to largely involve cloud-based computing and processing.
In a cloud environment, dynamic computational resource allocation is essential to save energy while maintaining the performance of the processes.
This paper models this dynamic computational resource allocation problem into a Markov Decision Process (MDP) and designs a model-based reinforcement-learning agent to optimise the dynamic resource allocation of the CPU usage.
The results show that our agent rapidly converges to the optimal policy, performs stably across different settings, and matches or outperforms a baseline algorithm in energy savings across scenarios; a toy sketch of this MDP framing follows below.
arXiv Detail & Related papers (2020-10-10T15:16:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.