AI-Driven Resource Allocation Framework for Microservices in Hybrid Cloud Platforms
- URL: http://arxiv.org/abs/2412.02610v1
- Date: Tue, 03 Dec 2024 17:41:08 GMT
- Title: AI-Driven Resource Allocation Framework for Microservices in Hybrid Cloud Platforms
- Authors: Biman Barua, M. Shamim Kaiser
- Abstract summary: This paper presents an AI-driven framework for resource allocation among microservices in hybrid cloud platforms.
The framework employs reinforcement learning (RL)-based resource utilization optimization to reduce costs and improve performance.
- Abstract: The increasing demand for scalable, efficient resource management in hybrid cloud environments has led to the exploration of AI-driven approaches for dynamic resource allocation. This paper presents an AI-driven framework for resource allocation among microservices in hybrid cloud platforms. The framework employs reinforcement learning (RL)-based resource utilization optimization to reduce costs and improve performance. It integrates AI models with cloud management tools to address the challenges of dynamic scaling and cost-efficient, low-latency service delivery. The reinforcement learning model continuously adjusts the resources provisioned to each microservice and predicts future consumption trends to minimize both under- and over-provisioning. Preliminary simulation results indicate that AI-driven provisioning can reduce expenditure by up to 30-40% compared to manual provisioning and threshold-based auto-scaling approaches. Resource-utilization efficiency is estimated to improve by 20-30%, with a corresponding 15-20% latency reduction during peak demand periods. This study compares the AI-driven approach with existing static and rule-based resource allocation methods, demonstrating that the new model outperforms them in flexibility and real-time responsiveness. The results indicate that reinforcement learning can further improve the optimization of hybrid cloud platforms, offering a 25-35% improvement in cost efficiency and better scaling for microservice-based applications. The proposed framework is a robust and scalable solution for managing cloud resources in dynamic, performance-critical environments.
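To make the abstract's mechanism concrete, below is a minimal, self-contained sketch of an RL-based autoscaler in the spirit the paper describes: a tabular Q-learning agent that adds or removes microservice replicas to trade provisioning cost against SLA violations. The state buckets, cost constants, and toy demand process are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the paper's core idea: an RL agent that scales
# microservice replicas to balance cost against SLA violations.
# All names, reward weights, and the toy demand model are illustrative
# assumptions, not the authors' implementation.
import random
from collections import defaultdict

ACTIONS = (-1, 0, +1)  # remove a replica, hold, add a replica

def utilization_bucket(demand, replicas):
    """Discretize per-replica load into a small state space."""
    u = demand / max(replicas, 1)
    return min(int(u // 10), 12)  # buckets of ~10% utilization, capped

def reward(demand, replicas):
    cost = 1.0 * replicas                                 # assumed per-replica cost
    sla_penalty = 5.0 if demand > 80 * replicas else 0.0  # overload penalty
    return -(cost + sla_penalty)

q = defaultdict(float)  # Q[(state, action)]
alpha, gamma, eps = 0.1, 0.9, 0.2
replicas, demand = 3, 150.0

for step in range(20_000):
    state = utilization_bucket(demand, replicas)
    if random.random() < eps:  # epsilon-greedy exploration
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: q[(state, a)])
    replicas = max(1, replicas + action)
    # Toy demand process standing in for the paper's consumption forecasts.
    demand = max(10.0, demand + random.gauss(0, 15))
    r = reward(demand, replicas)
    nxt = utilization_bucket(demand, replicas)
    best_next = max(q[(nxt, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])

print(f"learned policy holds {replicas} replicas at demand {demand:.0f}")
```

In the paper's setting, the toy demand process would be replaced by the framework's consumption forecasts and the agent would act through the cloud management tools it integrates with.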
Related papers
- Reinforcement Learning Controlled Adaptive PSO for Task Offloading in IIoT Edge Computing [0.0]
Industrial Internet of Things (IIoT) applications demand efficient task offloading to handle heavy data loads with minimal latency.
Mobile Edge Computing (MEC) brings computation closer to devices to reduce latency and server load.
We propose a novel solution combining Adaptive Particle Swarm Optimization (APSO) with Reinforcement Learning, specifically Soft Actor-Critic (SAC).
arXiv Detail & Related papers (2025-01-25T13:01:54Z)
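As a rough illustration of the APSO side of the approach above, the sketch below runs PSO with an inertia weight supplied by an external controller; in the paper that controller is a Soft Actor-Critic agent, which is stubbed out here with a simple annealing schedule. The objective function and all constants are assumptions for illustration.

```python
# Toy adaptive PSO whose inertia weight comes from an external controller
# (in the paper, a SAC agent; here a fixed annealing stub).
import random

def objective(x):  # stand-in offloading cost to minimize
    return sum(v * v for v in x)

def controller_inertia(progress):  # stub for the SAC policy output
    return 0.9 - 0.5 * progress    # simple anneal as a placeholder

dim, swarm_size, steps = 4, 20, 200
pos = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(swarm_size)]
vel = [[0.0] * dim for _ in range(swarm_size)]
pbest = [p[:] for p in pos]
gbest = min(pbest, key=objective)

for t in range(steps):
    w = controller_inertia(t / steps)  # adaptive inertia weight
    for i in range(swarm_size):
        for d in range(dim):
            r1, r2 = random.random(), random.random()
            v = (w * vel[i][d]
                 + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                 + 1.5 * r2 * (gbest[d] - pos[i][d]))
            vel[i][d] = max(-4.0, min(4.0, v))  # clamp to keep the swarm stable
            pos[i][d] += vel[i][d]
        if objective(pos[i]) < objective(pbest[i]):
            pbest[i] = pos[i][:]
    gbest = min(pbest, key=objective)

print("best offloading cost found:", round(objective(gbest), 6))
```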
- Secure Resource Allocation via Constrained Deep Reinforcement Learning [49.15061461220109]
We present SARMTO, a framework that balances resource allocation, task offloading, security, and performance.
SARMTO consistently outperforms five baseline approaches, achieving up to a 40% reduction in system costs.
These enhancements highlight SARMTO's potential to revolutionize resource management in intricate distributed computing environments.
arXiv Detail & Related papers (2025-01-20T15:52:43Z)
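A common way to realize the constrained-RL balance that SARMTO targets is Lagrangian relaxation: a multiplier prices the constraint into the reward and is updated by dual ascent on the observed violation. The sketch below shows only that dual update, with stand-in rollout statistics; it is not SARMTO's actual algorithm.

```python
# Lagrangian dual update for a constrained-RL reward, with toy rollout
# statistics. Thresholds and the cost/violation signals are assumptions.
import random

lam, lr_lam, limit = 0.0, 0.05, 0.3  # multiplier, its step size, constraint budget

for episode in range(1000):
    # Stand-ins for statistics from an actual RL rollout:
    cost = random.uniform(0.5, 1.5)          # task-offloading system cost
    violation = random.uniform(0.0, 0.6)     # e.g., a security-risk measure
    shaped_return = -cost - lam * violation  # what the policy would optimize
    # Dual ascent: grow lam while the constraint is violated on average.
    lam = max(0.0, lam + lr_lam * (violation - limit))

print(f"converged multiplier lam={lam:.2f} (prices violations above {limit})")
```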
- Online Client Scheduling and Resource Allocation for Efficient Federated Edge Learning [9.451084740123198]
Federated learning (FL) enables edge devices to collaboratively train a machine learning model without sharing their raw data.
However, deploying FL over mobile edge networks with constrained resources such as power and bandwidth suffers from high training latency and low model accuracy.
This paper investigates the optimal client scheduling and resource allocation for FL over mobile edge networks under resource constraints and uncertainty.
arXiv Detail & Related papers (2024-09-29T01:56:45Z)
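For flavor, here is a minimal greedy sketch of the client-scheduling problem above: pick clients by estimated learning utility per unit of bandwidth until the budget is exhausted. The client table and utility model are invented placeholders, not the paper's method.

```python
# Greedy client selection under a bandwidth budget (knapsack-style heuristic).
clients = [  # (client_id, estimated_utility, bandwidth_cost)
    ("c1", 0.9, 5.0), ("c2", 0.4, 1.0), ("c3", 0.7, 3.0),
    ("c4", 0.2, 0.5), ("c5", 0.8, 4.0),
]
budget = 8.0

# Sort by utility density, then admit clients while the budget allows.
selected, used = [], 0.0
for cid, util, bw in sorted(clients, key=lambda c: c[1] / c[2], reverse=True):
    if used + bw <= budget:
        selected.append(cid)
        used += bw

print("scheduled clients:", selected, f"(bandwidth used {used}/{budget})")
```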
"Bigger the better" has been the predominant trend in recent Large Language Models (LLMs) development.
This paper explores the "less is more" paradigm by addressing the challenge of designing accurate yet efficient Small Language Models (SLMs) for resource constrained devices.
arXiv Detail & Related papers (2024-02-26T18:59:03Z)
- Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z)
- A Cost-Aware Mechanism for Optimized Resource Provisioning in Cloud Computing [6.369406986434764]
We propose a novel learning-based resource provisioning approach that provides cost-reduction guarantees when serving demands.
Our method accommodates most requirements efficiently, and the resulting performance meets our design goals.
arXiv Detail & Related papers (2023-09-20T13:27:30Z)
- Adaptive Resource Allocation for Virtualized Base Stations in O-RAN with Online Learning [55.08287089554127]
Open Radio Access Network (O-RAN) systems, with their virtualized base stations (vBSs), offer operators the benefits of increased flexibility, reduced costs, vendor diversity, and interoperability.
We propose an online learning algorithm that balances the effective throughput and vBS energy consumption, even under unforeseeable and "challenging" environments.
We prove the proposed solutions achieve sub-linear regret, providing zero average optimality gap even in challenging environments.
arXiv Detail & Related papers (2023-09-04T17:30:21Z)
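The sub-linear-regret guarantee above is characteristic of adversarial bandit methods; the sketch below uses an EXP3-style update to choose among a few hypothetical vBS configurations, with an assumed reward in [0, 1] blending throughput and energy. It illustrates the online-learning flavor only, not the paper's algorithm.

```python
# EXP3-style bandit over a handful of hypothetical vBS configurations.
import math
import random

configs = ["low_power", "balanced", "high_throughput"]
weights = [1.0] * len(configs)
gamma = 0.1  # exploration rate

def toy_reward(i):  # assumed environment: rewards in [0, 1]
    base = (0.3, 0.6, 0.5)[i]
    return min(1.0, max(0.0, base + random.gauss(0, 0.1)))

for t in range(5000):
    total = sum(weights)
    probs = [(1 - gamma) * w / total + gamma / len(configs) for w in weights]
    i = random.choices(range(len(configs)), probs)[0]
    r = toy_reward(i)
    # Importance-weighted update keeps the reward estimator unbiased.
    weights[i] *= math.exp(gamma * (r / probs[i]) / len(configs))

best = max(range(len(configs)), key=lambda j: weights[j])
print("preferred config:", configs[best])
```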
- Sustainable AIGC Workload Scheduling of Geo-Distributed Data Centers: A Multi-Agent Reinforcement Learning Approach [48.18355658448509]
Recent breakthroughs in generative artificial intelligence have triggered a surge in demand for machine learning training, which poses significant cost burdens and environmental challenges due to its substantial energy consumption.
Scheduling training jobs among geographically distributed cloud data centers unveils the opportunity to optimize the usage of computing capacity powered by inexpensive and low-carbon energy.
We propose an algorithm based on multi-agent reinforcement learning and actor-critic methods to learn the optimal collaborative scheduling strategy through interacting with a cloud system built with real-life workload patterns, energy prices, and carbon intensities.
arXiv Detail & Related papers (2023-04-17T02:12:30Z)
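A deliberately simplified, single-agent version of the scheduling objective above: send each training job to the data center with the cheapest combined energy-price and carbon cost while capacity lasts. All figures are invented placeholders; the paper learns such a policy with multi-agent RL against real workload, price, and carbon traces.

```python
# Greedy carbon-aware job placement across geo-distributed data centers.
centers = {  # name: (price $/kWh, carbon gCO2/kWh, free GPU slots)
    "us-east": (0.11, 400, 4),
    "eu-north": (0.09, 50, 2),
    "ap-south": (0.07, 700, 6),
}
carbon_weight = 0.0005  # assumed $-equivalent per gCO2

def job_cost(price, carbon):
    return price + carbon_weight * carbon

for job in ["job-a", "job-b", "job-c", "job-d"]:
    open_centers = {n: v for n, v in centers.items() if v[2] > 0}
    name = min(open_centers, key=lambda n: job_cost(*open_centers[n][:2]))
    price, carbon, slots = centers[name]
    centers[name] = (price, carbon, slots - 1)  # consume one GPU slot
    print(f"{job} -> {name} (unit cost {job_cost(price, carbon):.4f})")
```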
- CILP: Co-simulation based Imitation Learner for Dynamic Resource Provisioning in Cloud Computing Environments [13.864161788250856]
A key challenge for latency-critical tasks is predicting future workload demands in order to provision proactively.
Existing AI-based solutions tend not to holistically consider crucial aspects such as provisioning overheads, heterogeneous VM costs, and the Quality of Service (QoS) of the cloud system.
We propose a novel method, called CILP, that formulates the VM provisioning problem as two sub-problems of prediction and optimization.
arXiv Detail & Related papers (2023-02-11T09:15:34Z)
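CILP's decomposition into prediction and optimization can be illustrated in a few lines: forecast the next workload, then pick the VM count minimizing provisioning cost plus a QoS shortfall penalty. The moving-average predictor and cost constants below are stand-ins for the paper's learned co-simulation components.

```python
# Predict-then-optimize sketch: naive forecast, then cheapest VM count.
history = [120, 135, 150, 160, 158, 170, 180]  # toy request rates

def predict(hist, window=3):
    return sum(hist[-window:]) / window  # naive moving-average forecaster

def provision(forecast, capacity_per_vm=40, vm_cost=1.0, qos_penalty=6.0):
    # Evaluate a small range of VM counts and pick the cheapest total.
    def total_cost(vms):
        shortfall = max(0.0, forecast - vms * capacity_per_vm)
        return vms * vm_cost + qos_penalty * (shortfall / capacity_per_vm)
    return min(range(1, 12), key=total_cost)

f = predict(history)
print(f"forecast={f:.1f} req/s -> provision {provision(f)} VMs")
```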
- ANDREAS: Artificial intelligence traiNing scheDuler foR accElerAted resource clusterS [1.798617052102518]
We propose ANDREAS, an advanced scheduling solution that maximizes performance and minimizes data centers' operational costs.
Experiments show that we can achieve a cost reduction of between 30% and 62% on average with respect to first-principle methods.
arXiv Detail & Related papers (2021-05-11T14:36:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.