Online Optimization for Network Resource Allocation and Comparison with
Reinforcement Learning Techniques
- URL: http://arxiv.org/abs/2311.10023v1
- Date: Thu, 16 Nov 2023 17:08:27 GMT
- Title: Online Optimization for Network Resource Allocation and Comparison with
Reinforcement Learning Techniques
- Authors: Ahmed Sid-Ali, Ioannis Lambadaris, Yiqiang Q. Zhao, Gennady Shaikhet,
and Amirhossein Asgharnia
- Abstract summary: In this paper, we tackle an online network resource allocation problem with job transfers.
We propose a randomized online algorithm based on the exponentially weighted method.
We prove that our algorithm achieves regret that is sub-linear in time, which indicates that it adapts and learns from its experience.
- Score: 0.6466206145151128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we tackle an online network resource allocation problem with job transfers. The network is composed of many servers connected by communication links. The system operates in discrete time: at each time slot, the administrator reserves resources at the servers for future job requests, and a cost is incurred for the reservations made. Once the job requests are received, the jobs may be transferred between servers to best accommodate the demands, which incurs an additional transport cost. Finally, if a job request cannot be satisfied, a violation occurs and a cost must be paid for the blocked job. We propose a randomized online algorithm based on the exponentially weighted method. We prove that our algorithm achieves regret that is sub-linear in time, which indicates that it adapts and learns from its experience, becoming more efficient in its decision-making as it accumulates more data. Moreover, we test the performance of our algorithm on artificial data and compare it against a reinforcement learning method, showing that our proposed method outperforms the latter.
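The randomized algorithm described above is based on the exponentially weighted (Hedge) method. The following is a minimal, illustrative sketch of that update rule for a toy single-server reservation problem; the candidate reservation levels, cost constants, Poisson demand, and learning rate are assumptions made here for illustration and are not the paper's actual model or algorithm.

```python
import numpy as np

# Minimal exponential-weights (Hedge) sketch for randomized online reservation.
# Candidate set, cost model, demand process, and step size are illustrative
# assumptions, not the paper's formulation.

rng = np.random.default_rng(0)

candidates = np.arange(5)          # hypothetical per-slot reservation levels 0..4
eta = 0.1                          # learning rate; theory suggests eta ~ sqrt(log K / T)
weights = np.ones(len(candidates))

def slot_cost(reservation, demand, c_res=1.0, c_viol=5.0):
    """Toy cost: pay c_res per reserved unit plus c_viol per blocked job."""
    blocked = max(demand - reservation, 0)
    return c_res * reservation + c_viol * blocked

T = 1000
total_cost = 0.0
for t in range(T):
    probs = weights / weights.sum()
    choice = rng.choice(len(candidates), p=probs)     # randomized reservation decision
    demand = rng.poisson(2)                           # stand-in for the unknown job arrivals
    total_cost += slot_cost(candidates[choice], demand)
    # Full-information update: every candidate is charged the (normalized) cost
    # it would have incurred on the observed demand.
    losses = np.array([slot_cost(a, demand) for a in candidates], dtype=float)
    weights *= np.exp(-eta * losses / max(losses.max(), 1.0))

print(f"average per-slot cost over {T} slots: {total_cost / T:.3f}")
```

With a full-information update of this kind and a suitably chosen learning rate, the standard exponential-weights analysis yields regret that grows only sub-linearly in the number of time slots, in line with the guarantee stated in the abstract.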
Related papers
- Contractual Reinforcement Learning: Pulling Arms with Invisible Hands [68.77645200579181]
We propose a theoretical framework for aligning the economic interests of different stakeholders in online learning problems through contract design.
For the planning problem, we design an efficient dynamic programming algorithm to determine the optimal contracts against the far-sighted agent.
For the learning problem, we introduce a generic design of no-regret learning algorithms to untangle the challenges from robust design of contracts to the balance of exploration and exploitation.
arXiv Detail & Related papers (2024-07-01T16:53:00Z) - Online Optimization for Randomized Network Resource Allocation with Long-Term Constraints [0.610240618821149]
We study an optimal online resource reservation problem in a simple communication network.
We propose an online saddle-point algorithm for which we present an upper bound for the associated K-benchmark regret.
arXiv Detail & Related papers (2023-05-24T20:47:17Z) - Online Resource Allocation: Bandits feedback and Advice on Time-varying
Demands [12.081877372552606]
We consider a general online resource allocation model with bandit feedback and time-varying demands.
Motivated by the recent Online Algorithms with Advice framework, we explore how online advice can inform policy design.
Our proposed algorithm is shown to have both theoretical performance guarantees and promising numerical results compared with other algorithms in the literature.
arXiv Detail & Related papers (2023-02-08T16:40:43Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O\left(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T}\right)$ with a communication cost of $O(k \log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - Learning While Scheduling in Multi-Server Systems with Unknown
Statistics: MaxWeight with Discounted UCB [18.898514227870926]
This paper considers a multi-server system with multiple types of jobs, where different job types require different amounts of processing time at different servers.
The goal is to schedule jobs on servers without knowing the statistics of the processing times.
We propose a new algorithm, which combines the MaxWeight scheduling policy with discounted upper confidence bound (UCB) to simultaneously learn statistics and schedule jobs to servers; a rough sketch of this combination appears after this list.
arXiv Detail & Related papers (2022-09-02T15:37:02Z) - Scheduling Servers with Stochastic Bilinear Rewards [7.519872646378837]
A system optimization problem arises in the scheduling of multi-class, multi-server queueing systems.
We propose a scheduling algorithm based on weighted proportional fair allocation criteria augmented with marginal costs for reward.
Our algorithm achieves sub-linear regret and sub-linear mean holding cost (and queue length bound) with respect to the time horizon, thus guaranteeing queueing system stability.
arXiv Detail & Related papers (2021-12-13T00:37:20Z) - A Deep Value-network Based Approach for Multi-Driver Order Dispatching [55.36656442934531]
We propose a deep reinforcement learning based solution for order dispatching.
We conduct large scale online A/B tests on DiDi's ride-dispatching platform.
Results show that CVNet consistently outperforms other recently proposed dispatching methods.
arXiv Detail & Related papers (2021-06-08T16:27:04Z) - An Online Learning Approach to Optimizing Time-Varying Costs of AoI [26.661352924641285]
We consider systems that require timely monitoring of sources over a communication network.
For the single source monitoring problem, we design algorithms that achieve sublinear regret compared to the best fixed policy in hindsight.
For the multiple source scheduling problem, we design a new online learning algorithm called Follow-the-Perturbed-Whittle-Leader.
arXiv Detail & Related papers (2021-05-27T18:10:56Z) - Better than the Best: Gradient-based Improper Reinforcement Learning for
Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z) - A Machine Learning Approach for Task and Resource Allocation in Mobile
Edge Computing Based Networks [108.57859531628264]
A joint task, spectrum, and transmit power allocation problem is investigated for a wireless network.
The proposed algorithm can reduce the number of iterations needed for convergence and the maximal delay among all users by up to 18% and 11.1% compared to the standard Q-learning algorithm.
arXiv Detail & Related papers (2020-07-20T13:46:42Z)
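Returning to the "Learning While Scheduling in Multi-Server Systems with Unknown Statistics" entry above, the following is a rough sketch of how MaxWeight scheduling can be combined with discounted-UCB service-rate estimates. The queue dynamics, discount factor, and exploration bonus below are illustrative assumptions and do not reproduce the cited paper's algorithm or analysis.

```python
import numpy as np

# Rough sketch of the MaxWeight + discounted-UCB idea named in the
# "Learning While Scheduling in Multi-Server Systems" entry above.
# Queue dynamics, discount factor, and bonus term are illustrative assumptions.

rng = np.random.default_rng(1)
n_servers, n_types = 3, 4
gamma = 0.99                                        # discount for the UCB statistics

disc_count = np.full((n_servers, n_types), 1e-6)    # discounted number of service attempts
disc_sum = np.zeros((n_servers, n_types))           # discounted number of completions

def discounted_ucb():
    """Estimated service rate per (server, job type) plus an exploration bonus."""
    mean = disc_sum / disc_count
    bonus = np.sqrt(2.0 * np.log(max(disc_count.sum(), 2.0)) / disc_count)
    return np.clip(mean + bonus, 0.0, 1.0)

queues = np.zeros(n_types)
true_rates = rng.uniform(0.2, 0.9, size=(n_servers, n_types))   # unknown to the scheduler

for t in range(2000):
    queues += rng.poisson(0.3, size=n_types)        # toy arrivals
    ucb = discounted_ucb()
    for s in range(n_servers):
        # MaxWeight: pick the job type maximizing queue length x estimated service rate.
        k = int(np.argmax(queues * ucb[s]))
        disc_count[s] *= gamma                       # discounting forgets stale observations
        disc_sum[s] *= gamma
        if queues[k] > 0:
            served = float(rng.random() < true_rates[s, k])
            disc_count[s, k] += 1.0
            disc_sum[s, k] += served
            queues[k] -= served

print("final queue lengths:", np.round(queues, 1))
```

Discounting the counts lets the estimates track processing-time statistics that drift over time, while the MaxWeight rule prioritizes job types with long queues and high estimated service rates.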