An online learning approach to dynamic pricing and capacity sizing in
service systems
- URL: http://arxiv.org/abs/2009.02911v3
- Date: Wed, 7 Sep 2022 08:40:50 GMT
- Title: An online learning approach to dynamic pricing and capacity sizing in
service systems
- Authors: Xinyun Chen, Yunan Liu and Guiyu Hong
- Abstract summary: We study a dynamic pricing and capacity sizing problem in a $GI/GI/1$ queue.
Our framework is dubbed Gradient-based Online Learning in Queue (GOLiQ).
- Score: 26.720986177499338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a dynamic pricing and capacity sizing problem in a $GI/GI/1$ queue,
where the service provider's objective is to obtain the optimal service fee $p$
and service capacity $\mu$ so as to maximize the cumulative expected profit
(the service revenue minus the staffing cost and delay penalty). Due to the
complex nature of the queueing dynamics, such a problem has no analytic
solution, so previous research often resorts to heavy-traffic analysis, in which
both the arrival rate and the service rate are sent to infinity. In this work
we propose an online learning framework designed for solving this problem which
does not require the system's scale to increase. Our framework is dubbed
Gradient-based Online Learning in Queue (GOLiQ). GOLiQ organizes the time
horizon into successive operational cycles and prescribes an efficient
procedure to obtain improved pricing and staffing policies in each cycle using
data collected in previous cycles. Data here include the number of customer
arrivals, waiting times, and the server's busy times. The ingenuity of this
approach lies in its online nature, which allows the service provider to do better
by interacting with the environment. The effectiveness of GOLiQ is substantiated by
(i) theoretical results including the algorithm convergence and regret analysis
(with a logarithmic regret bound), and (ii) engineering confirmation via
simulation experiments of a variety of representative $GI/GI/1$ queues.
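As a concrete illustration of the cycle-based scheme sketched in the abstract, the following Python snippet runs a single-server queue for one operational cycle, logs the waiting times and the server's busy time, and updates $(p, \mu)$ by projected gradient ascent across cycles. All modeling choices here (exponential interarrival and service times, a linear demand curve, a linear staffing cost, the delay-penalty weight, and a finite-difference gradient surrogate) are illustrative assumptions; GOLiQ itself constructs its gradient estimator from the logged queueing data.

```python
import numpy as np

rng = np.random.default_rng(0)

# All functional forms below are illustrative assumptions, not taken from the paper.
def arrival_rate(p):
    """Hypothetical price-sensitive demand curve."""
    return max(10.0 - p, 0.1)

def staffing_cost(mu):
    """Hypothetical staffing cost per unit time."""
    return 0.5 * mu

H = 2.0  # hypothetical delay-penalty weight

def simulate_cycle(p, mu, n_customers=2000):
    """One operational cycle of a single-server queue. Returns the data the
    abstract mentions: waiting times, the server's busy time, and the cycle length."""
    lam = arrival_rate(p)
    inter = rng.exponential(1.0 / lam, n_customers)
    service = rng.exponential(1.0 / mu, n_customers)
    wait = np.zeros(n_customers)
    for i in range(1, n_customers):          # Lindley recursion for waiting times
        wait[i] = max(wait[i - 1] + service[i - 1] - inter[i], 0.0)
    return wait, service.sum(), inter.sum()

def profit_per_unit_time(p, mu):
    """Service revenue minus staffing cost minus delay penalty over one cycle."""
    wait, busy, horizon = simulate_cycle(p, mu)
    revenue = p * len(wait) / horizon        # price times effective arrival rate
    delay_penalty = H * wait.sum() / horizon
    return revenue - staffing_cost(mu) - delay_penalty

# Cycle-by-cycle projected gradient ascent; a finite-difference surrogate stands in
# for the paper's data-driven gradient estimator.
p, mu = 5.0, 6.0
for k in range(1, 51):
    eta, d = 0.5 / k, 0.1                    # diminishing step size, perturbation size
    g_p = (profit_per_unit_time(p + d, mu) - profit_per_unit_time(p - d, mu)) / (2 * d)
    g_mu = (profit_per_unit_time(p, mu + d) - profit_per_unit_time(p, mu - d)) / (2 * d)
    p = float(np.clip(p + eta * g_p, 0.5, 9.5))
    mu = float(np.clip(mu + eta * g_mu, 1.0, 20.0))

print(f"learned price p = {p:.2f}, capacity mu = {mu:.2f}")
```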
Related papers
- Dynamic Matching with Post-allocation Service and its Application to Refugee Resettlement [1.9689888982532262]
Motivated by our collaboration with a major refugee resettlement agency in the U.S., we study a dynamic matching problem where each new arrival (a refugee case) must be matched immediately and irrevocably to one of the static resources (a location with a fixed annual quota).
Given the time-consuming nature of service, a server may not be available at a given time, so we refer to it as a dynamic resource. Upon matching, the case waits to receive service in a first-come-first-served manner.
arXiv Detail & Related papers (2024-10-30T13:17:38Z) - Online Optimization for Network Resource Allocation and Comparison with
Reinforcement Learning Techniques [0.6466206145151128]
In this paper, we tackle an online network resource allocation problem with job transfers.
We propose a randomized online algorithm based on the exponentially weighted method.
We prove that our algorithm enjoys a regret that is sub-linear in time, which indicates that the algorithm adapts and learns from its experience.
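The summary above names the exponentially weighted method; below is a minimal, generic Hedge-style randomized forecaster over a small set of allocation decisions. The toy full-information loss model, the number of candidate servers, and the learning-rate choice are assumptions for illustration, not the paper's exact setting.

```python
import numpy as np

rng = np.random.default_rng(1)

def hedge_allocation(loss_matrix, eta):
    """Exponentially weighted (Hedge) forecaster: loss_matrix[t, a] is the loss of
    allocation decision a at round t (a toy full-information setting)."""
    T, n = loss_matrix.shape
    cum_loss = np.zeros(n)
    incurred = 0.0
    for t in range(T):
        w = np.exp(-eta * (cum_loss - cum_loss.min()))  # shift for numerical stability
        probs = w / w.sum()
        a = rng.choice(n, p=probs)                      # randomized decision
        incurred += loss_matrix[t, a]
        cum_loss += loss_matrix[t]                      # observe all losses, reweight
    return incurred - loss_matrix.sum(axis=0).min()     # regret vs. best fixed decision

# Toy instance: 5 candidate servers, 1000 rounds of random job-processing losses.
losses = rng.uniform(0.0, 1.0, size=(1000, 5))
losses[:, 2] *= 0.6                                     # server 2 is best on average
eta = np.sqrt(np.log(5) / 1000)                         # standard tuning, O(sqrt(T log N)) regret
print("regret vs. best fixed server:", hedge_allocation(losses, eta))
```

The sub-linear regret claimed in the summary is the same phenomenon this toy exhibits: the randomized weights concentrate on the decision that is best in hindsight.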
arXiv Detail & Related papers (2023-11-16T17:08:27Z) - Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $\mathcal{O}(\log T)$ to $1$.
Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - Approaching sales forecasting using recurrent neural networks and
transformers [57.43518732385863]
We develop three alternatives to tackle the problem of forecasting the customer sales at day/store/item level using deep learning techniques.
Our empirical results show how good performance can be achieved by using a simple sequence-to-sequence architecture with minimal data preprocessing effort.
The proposed solution achieves an RMSLE of around 0.54, which is competitive with other, more specialized solutions proposed in the Kaggle competition.
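For reference, the RMSLE quoted above is the root mean squared error of log-transformed (log(1 + x)) predictions and targets; a minimal implementation, with made-up numbers, is:

```python
import numpy as np

def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2))

print(rmsle([3, 5, 2.5, 7], [2.5, 5, 4, 8]))  # ~0.20 on this toy example
```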
arXiv Detail & Related papers (2022-04-16T12:03:52Z) - Online Caching with Optimistic Learning [15.877673959068458]
This paper proposes a new algorithmic toolbox for tackling this problem through the lens of optimistic online learning.
We design online caching algorithms for bipartite networks with fixed-size caches or elastic leased caches subject to time-average budget constraints.
We prove that the proposed optimistic learning caching policies can achieve sub-zero performance loss (regret) for perfect predictions, and maintain the best achievable regret bound $O(\sqrt{T})$ even for arbitrarily bad predictions.
arXiv Detail & Related papers (2022-02-22T00:04:30Z) - Scheduling Servers with Stochastic Bilinear Rewards [7.519872646378837]
We study a system optimization problem that arises in scheduling for multi-class, multi-server queueing systems.
We propose a scheduling algorithm based on weighted proportional fair allocation criteria augmented with marginal costs for reward.
Our algorithm achieves sub-linear regret and sub-linear mean holding cost (and queue-length bound) with respect to the time horizon, thus guaranteeing queueing-system stability.
arXiv Detail & Related papers (2021-12-13T00:37:20Z) - A Deep Value-network Based Approach for Multi-Driver Order Dispatching [55.36656442934531]
We propose a deep reinforcement learning based solution for order dispatching.
We conduct large scale online A/B tests on DiDi's ride-dispatching platform.
Results show that CVNet consistently outperforms other recently proposed dispatching methods.
arXiv Detail & Related papers (2021-06-08T16:27:04Z) - Better than the Best: Gradient-based Improper Reinforcement Learning for
Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z) - Learning Augmented Index Policy for Optimal Service Placement at the
Network Edge [8.136957953239254]
We consider the problem of service placement at the network edge, in which a decision maker has to choose between $N$ services to host at the edge.
Our goal is to design adaptive algorithms to minimize the average service delivery latency for customers.
arXiv Detail & Related papers (2021-01-10T23:54:59Z) - Subset Sampling For Progressive Neural Network Learning [106.12874293597754]
Progressive Neural Network Learning is a class of algorithms that incrementally construct the network's topology and optimize its parameters based on the training data.
We propose to speed up this process by exploiting subsets of training data at each incremental training step.
Experimental results in object, scene and face recognition problems demonstrate that the proposed approach speeds up the optimization procedure considerably.
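A minimal sketch of the subset-sampling idea described above: at each incremental training step, a growing random subset of the training indices is drawn instead of the full set. The linear growth schedule and the sizes are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

def progressive_subsets(n_samples, n_steps, start_frac=0.2):
    """Yield (step, indices): a random training subset whose size grows linearly
    from start_frac of the data to the full set across incremental steps."""
    for step in range(n_steps):
        frac = start_frac + (1.0 - start_frac) * step / max(n_steps - 1, 1)
        size = int(frac * n_samples)
        yield step, rng.choice(n_samples, size=size, replace=False)

for step, idx in progressive_subsets(n_samples=10_000, n_steps=5):
    # train the newly added network block on the subset X[idx], y[idx] here
    print(f"step {step}: {idx.size} samples")
```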
arXiv Detail & Related papers (2020-02-17T18:57:33Z)