Multi-dimensional Autoscaling of Processing Services: A Comparison of Agent-based Methods
- URL: http://arxiv.org/abs/2506.10420v1
- Date: Thu, 12 Jun 2025 07:20:26 GMT
- Title: Multi-dimensional Autoscaling of Processing Services: A Comparison of Agent-based Methods
- Authors: Boris Sedlak, Alireza Furutanpey, Zihang Wang, Víctor Casamayor Pujol, Schahram Dustdar
- Abstract summary: This work introduces an agent-based autoscaling framework to maximize requirements fulfillment in constrained environments. We compare four types of scaling agents: Active Inference, Deep Q Network, Analysis of Structural Knowledge, and Deep Active Inference.
- Score: 5.201504495733271
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Edge computing breaks with traditional autoscaling due to strict resource constraints, thus motivating more flexible scaling behaviors using multiple elasticity dimensions. This work introduces an agent-based autoscaling framework that dynamically adjusts both hardware resources and internal service configurations to maximize requirements fulfillment in constrained environments. We compare four types of scaling agents: Active Inference, Deep Q Network, Analysis of Structural Knowledge, and Deep Active Inference, using two real-world processing services running in parallel: YOLOv8 for visual recognition and OpenCV for QR code detection. Results show that all agents achieve acceptable SLO performance with varying convergence patterns. While the Deep Q Network benefits from pre-training, the structural analysis converges quickly, and the deep active inference agent combines theoretical foundations with practical scalability advantages. Our findings provide evidence for the viability of multi-dimensional agent-based autoscaling for edge environments and encourage future work in this research direction.
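The idea of scaling along several elasticity dimensions at once can be illustrated with a toy example. The sketch below is not one of the four agents compared in the paper; it is a plain exhaustive search over a hypothetical configuration space. The core budget, the `throughput` performance model, the quality levels, and the SLO targets are all assumptions made up for illustration. It shows two services sharing a fixed number of cores, each with one internal quality setting, and picks the allocation that maximizes average SLO fulfillment.

```python
from itertools import product

TOTAL_CORES = 8  # hypothetical core budget of the edge device


def throughput(cores, quality):
    # Toy performance model (assumption): more cores raise throughput,
    # a higher quality setting lowers it.
    return 40.0 * cores / quality


def fulfillment(value, target):
    # Fraction of an SLO target that is met, capped at 1.0.
    return min(1.0, value / target)


def service_score(cores, quality, slo):
    # Average fulfillment over the throughput SLO and the quality SLO.
    return (fulfillment(throughput(cores, quality), slo["throughput"])
            + fulfillment(quality, slo["quality"])) / 2


def best_allocation(slos, qualities=(1, 2, 3, 4)):
    # Enumerate every split of cores between the two services and every
    # pair of quality settings; keep the first configuration with the
    # highest average score.
    best = None
    for cores_a in range(1, TOTAL_CORES):
        cores_b = TOTAL_CORES - cores_a
        for q_a, q_b in product(qualities, repeat=2):
            score = (service_score(cores_a, q_a, slos[0])
                     + service_score(cores_b, q_b, slos[1])) / 2
            if best is None or score > best[0]:
                best = (score, (cores_a, q_a), (cores_b, q_b))
    return best


# Hypothetical SLO targets for the two services.
slos = [
    {"throughput": 30.0, "quality": 3},
    {"throughput": 60.0, "quality": 2},
]
score, config_a, config_b = best_allocation(slos)
print(score, config_a, config_b)
```

Exhaustive search is only feasible here because the toy space is tiny; learning agents such as those compared in the paper become relevant when the performance model is unknown and must be discovered at runtime.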
Related papers
- Thinking vs. Doing: Agents that Reason by Scaling Test-Time Interaction [46.286440953594266]
We propose to scale test-time interaction, an untapped dimension of test-time scaling. We first show that even prompting-based interaction scaling can improve task success on web benchmarks non-trivially. We introduce TTI (Test-Time Interaction), a curriculum-based online reinforcement learning approach that trains agents by adaptively adjusting their rollout lengths.
arXiv Detail & Related papers (2025-06-09T17:50:02Z) - Benchmarking Dynamic SLO Compliance in Distributed Computing Continuum Systems [9.820223170841219]
Service Level Objectives (SLOs) in large-scale architectures are challenging due to their heterogeneous nature and varying service requirements. We present a benchmark of Active Inference -- an emerging method from neuroscience -- against three established reinforcement learning algorithms. We find that Active Inference is a promising approach for ensuring SLO compliance in DCCS, offering lower memory usage, stable CPU utilization, and fast convergence.
arXiv Detail & Related papers (2025-03-05T08:56:26Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z) - Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z) - Dynamic Perceiver for Efficient Visual Recognition [87.08210214417309]
We propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task.
A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks.
Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.
arXiv Detail & Related papers (2023-06-20T03:00:22Z) - PSNet: Parallel Symmetric Network for Video Salient Object Detection [85.94443548452729]
We propose a VSOD network with up and down parallel symmetry, named PSNet.
Two parallel branches with different dominant modalities are set to achieve complete video saliency decoding.
arXiv Detail & Related papers (2022-10-12T04:11:48Z) - Complexity-Driven CNN Compression for Resource-constrained Edge AI [1.6114012813668934]
We propose a novel and computationally efficient pruning pipeline by exploiting the inherent layer-level complexities of CNNs.
We define three modes of pruning, namely parameter-aware (PA), FLOPs-aware (FA), and memory-aware (MA), to introduce versatile compression of CNNs.
arXiv Detail & Related papers (2022-08-26T16:01:23Z) - Multi-Exit Semantic Segmentation Networks [78.44441236864057]
We propose a framework for converting state-of-the-art segmentation models to MESS networks: specially trained CNNs that employ parametrised early exits along their depth to save computation during inference on easier samples.
We co-optimise the number, placement and architecture of the attached segmentation heads, along with the exit policy, to adapt to the device capabilities and application-specific requirements.
arXiv Detail & Related papers (2021-06-07T11:37:03Z) - Pareto-Optimal Bit Allocation for Collaborative Intelligence [39.11380888887304]
Collaborative intelligence (CI) has emerged as a promising framework for deployment of Artificial Intelligence (AI)-based services on mobile/edge devices.
In this paper, we study bit allocation for feature coding in multi-stream CI systems.
arXiv Detail & Related papers (2020-09-25T20:48:33Z) - Relational-Grid-World: A Novel Relational Reasoning Environment and An Agent Model for Relational Information Extraction [0.0]
Reinforcement learning (RL) agents are often designed specifically for a particular problem and they generally have uninterpretable working processes.
Statistical methods-based RL algorithms can be improved in terms of generalizability and interpretability using symbolic Artificial Intelligence (AI) tools such as logic programming.
We present a model-free RL architecture that is supported with explicit relational representations of the environmental objects.
arXiv Detail & Related papers (2020-07-12T11:30:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.