Efficient Resource Scheduling for Distributed Infrastructures Using
Negotiation Capabilities
- URL: http://arxiv.org/abs/2402.06938v2
- Date: Tue, 13 Feb 2024 15:58:40 GMT
- Authors: Junjie Chu and Prashant Singh and Salman Toor
- Abstract summary: We propose an agent-based auto-negotiation system for resource scheduling based on fuzzy logic.
The proposed method can complete a one-to-one auto-negotiation process and generate optimal offers for the provider and client.
We successfully train machine learning models to replace the fuzzy negotiation system to improve processing speed.
- Score: 0.46040036610482665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the past few decades, the rapid development of information and internet
technologies has spawned massive amounts of data and information. The
information explosion drives many enterprises or individuals to seek to rent
cloud computing infrastructure to put their applications in the cloud. However,
the agreements reached between cloud computing providers and clients are often
not efficient. Many factors affect the efficiency, such as the idleness of the
providers' cloud computing infrastructure, and the additional cost to the
clients. One possible solution is to introduce a comprehensive bargaining game
(a type of negotiation) and schedule resources according to the negotiation
results. We propose an agent-based auto-negotiation system for resource
scheduling based on fuzzy logic. The proposed method can complete a one-to-one
auto-negotiation process and generate optimal offers for the provider and
client. We compare the impact of different membership functions, fuzzy rule sets,
and negotiation scenario cases on the offers to optimize the system. It can be
concluded that our proposed method can utilize resources more efficiently and
is interpretable, highly flexible, and customizable. We successfully train
machine learning models to replace the fuzzy negotiation system to improve
processing speed. The article also highlights possible future improvements to
the proposed system and machine learning models. All code and data are
available in the open-source repository.
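The abstract does not spell out the membership functions, rule sets, or negotiation protocol, so the following is only a minimal sketch of how a fuzzy, one-to-one price negotiation could look. Everything in it is an assumption made for illustration: the single input (time pressure), the triangular membership functions, the three rules with crisp concession rates combined by a weighted average (a zero-order Sugeno-style scheme), and the names `triangular`, `concession_rate`, and `negotiate`. It is not the authors' system.

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 at a and c, peak of 1 at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical rule set: (fuzzy set over time pressure, crisp concession rate).
# Each rate is the fraction of the current price gap an agent gives up per round.
RULES = [
    ((-0.5, 0.0, 0.5), 0.05),  # IF pressure is low    THEN concede little
    ((0.0, 0.5, 1.0), 0.15),   # IF pressure is medium THEN concede moderately
    ((0.5, 1.0, 1.5), 0.40),   # IF pressure is high   THEN concede a lot
]


def concession_rate(pressure, rules=RULES):
    """Weighted average of rule outputs, weighted by each rule's firing strength."""
    pressure = min(1.0, max(0.0, pressure))  # keep the input inside [0, 1]
    strengths = [(triangular(pressure, *abc), rate) for abc, rate in rules]
    total = sum(mu for mu, _ in strengths)
    return sum(mu * rate for mu, rate in strengths) / total


def negotiate(ask, provider_floor, bid, client_ceiling, max_rounds=10, tolerance=0.5):
    """One provider and one client concede toward each other each round.

    Returns the agreed price, or None if no agreement is reached in max_rounds.
    Neither side moves past its reservation price (floor or ceiling).
    """
    for round_no in range(max_rounds):
        pressure = round_no / max(1, max_rounds - 1)
        step = concession_rate(pressure) * (ask - bid)
        ask = max(provider_floor, ask - step)  # provider lowers its ask
        bid = min(client_ceiling, bid + step)  # client raises its bid
        if ask - bid <= tolerance:
            return (ask + bid) / 2
    return None


if __name__ == "__main__":
    # Overlapping reservation prices: an agreement is possible.
    print(negotiate(ask=10.0, provider_floor=4.0, bid=0.0, client_ceiling=6.0))
    # Provider floor above client ceiling: no agreement is possible.
    print(negotiate(ask=10.0, provider_floor=8.0, bid=2.0, client_ceiling=5.0))
```

Changing the entries in `RULES` or the shape of the membership functions changes how quickly the offers converge, which is the kind of comparison the abstract describes at a high level.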
Related papers
- An Advanced Reinforcement Learning Framework for Online Scheduling of Deferrable Workloads in Cloud Computing [37.457951933256055]
We propose an online deferrable job scheduling method called Online Scheduling for DEferrable jobs in Cloud (OSDEC), where a deep reinforcement learning model is adopted to learn the scheduling policy.
The proposed method can plan the deployment schedule well and achieve a short waiting time for users while maintaining high resource utilization for the platform.
arXiv Detail & Related papers (2024-06-03T06:55:26Z)
- How Can We Train Deep Learning Models Across Clouds and Continents? An Experimental Study [57.97785297481162]
We evaluate the cost and throughput implications of training in different zones, continents, and clouds for representative CV, NLP, and ASR models.
We show how leveraging spot pricing enables a new cost-efficient way to train models with multiple cheap instances, trumping both more centralized and powerful hardware and even on-demand cloud offerings at competitive prices.
arXiv Detail & Related papers (2023-06-05T18:17:37Z)
- Deep Recurrent Learning Through Long Short Term Memory and TOPSIS [0.0]
Cloud computing's promise of cheap, easy, and quick management pushes business owners to transition from monolithic to data-center/cloud-based ERP.
Since cloud-ERP development involves a cyclic process, namely planning, implementing, testing and upgrading, its adoption is realized as a deep recurrent neural network problem.
Our theoretical model is validated against a reference model by articulating key players, services, architecture, and functionalities.
arXiv Detail & Related papers (2022-12-30T10:35:25Z)
- Outsourcing Training without Uploading Data via Efficient Collaborative Open-Source Sampling [49.87637449243698]
Traditional outsourcing requires uploading device data to the cloud server.
We propose to leverage widely available open-source data, which is a massive dataset collected from public and heterogeneous sources.
We develop a novel strategy called Efficient Collaborative Open-source Sampling (ECOS) to construct a proximal proxy dataset from open-source data for cloud training.
arXiv Detail & Related papers (2022-10-23T00:12:18Z)
- Concepts and Algorithms for Agent-based Decentralized and Integrated Scheduling of Production and Auxiliary Processes [78.120734120667]
This paper describes an agent-based decentralized and integrated scheduling approach.
Part of the requirements is to develop a linearly scaling communication architecture.
The approach is explained using an example based on industrial requirements.
arXiv Detail & Related papers (2022-05-06T18:44:29Z)
- Efficient Device Scheduling with Multi-Job Federated Learning [64.21733164243781]
We propose a novel multi-job Federated Learning framework to enable the parallel training process of multiple jobs.
We propose a reinforcement learning-based method and a Bayesian optimization-based method to schedule devices for multiple jobs while minimizing the cost.
Our proposed approaches significantly outperform baseline approaches in terms of training time (up to 8.67 times faster) and accuracy (up to 44.6% higher).
arXiv Detail & Related papers (2021-12-11T08:05:11Z)
- The MIT Supercloud Dataset [3.375826083518709]
We introduce the MIT Supercloud dataset which aims to foster innovative AI/ML approaches to the analysis of large scale HPC and datacenter/cloud operations.
We provide detailed monitoring logs from the MIT Supercloud system, which include CPU and GPU usage by jobs, memory usage, file system logs, and physical monitoring data.
This paper discusses the details of the dataset, the collection methodology, and data availability, as well as potential challenge problems being developed using this data.
arXiv Detail & Related papers (2021-08-04T13:06:17Z)
- AI-based Resource Allocation: Reinforcement Learning for Adaptive Auto-scaling in Serverless Environments [0.0]
Serverless computing has emerged as a compelling new paradigm of cloud computing models in recent years.
A common approach among both commercial and open source serverless computing platforms is workload-based auto-scaling.
In this paper we investigate the applicability of a reinforcement learning approach to request-based auto-scaling in a serverless framework.
arXiv Detail & Related papers (2020-05-29T06:18:39Z)
- Local Differential Privacy based Federated Learning for Internet of Things [72.83684013377433]
The Internet of Vehicles (IoV) simulates a large variety of crowdsourcing applications such as Waze, Uber, and Amazon Mechanical Turk.
Users of these applications report real-time traffic information to a cloud server, which trains a machine learning model on the reported information for intelligent traffic management.
In this paper, we propose to integrate federated learning and local differential privacy (LDP) so that crowdsourcing applications can obtain the machine learning model.
arXiv Detail & Related papers (2020-04-19T14:03:10Z)
- A Privacy-Preserving Distributed Architecture for Deep-Learning-as-a-Service [68.84245063902908]
This paper introduces a novel distributed architecture for deep-learning-as-a-service.
It is able to preserve the user's sensitive data while providing cloud-based machine and deep learning services.
arXiv Detail & Related papers (2020-03-30T15:12:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.