Multi-Agent Reinforcement Learning with Shared Resources for Inventory
Management
- URL: http://arxiv.org/abs/2212.07684v2
- Date: Sun, 18 Dec 2022 03:02:47 GMT
- Title: Multi-Agent Reinforcement Learning with Shared Resources for Inventory
Management
- Authors: Yuandong Ding, Mingxiao Feng, Guozi Liu, Wei Jiang, Chuheng Zhang, Li
Zhao, Lei Song, Houqiang Li, Yan Jin, Jiang Bian
- Abstract summary: In our setting, the constraint on the shared resources (such as the inventory capacity) couples the otherwise independent control for each SKU.
We formulate the problem with this structure as a Shared-Resource Stochastic Game (SRSG) and propose an efficient algorithm called Context-aware Decentralized PPO (CD-PPO).
Through extensive experiments, we demonstrate that CD-PPO can accelerate the learning procedure compared with standard MARL algorithms.
- Score: 62.23979094308932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we consider the inventory management (IM) problem where we
need to make replenishment decisions for a large number of stock keeping units
(SKUs) to balance their supply and demand. In our setting, the constraint on
the shared resources (such as the inventory capacity) couples the otherwise
independent control for each SKU. We formulate the problem with this structure
as a Shared-Resource Stochastic Game (SRSG) and propose an efficient algorithm
called Context-aware Decentralized PPO (CD-PPO). Through extensive experiments,
we demonstrate that CD-PPO can accelerate the learning procedure compared with
standard MARL algorithms.
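To make the coupling described in the abstract concrete, here is a minimal sketch of how a shared inventory capacity can tie together replenishment decisions that would otherwise be independent per SKU. It is not the paper's SRSG formulation or CD-PPO; the function name `apply_shared_capacity` and the proportional rationing rule are assumptions chosen only for illustration.

```python
from typing import List


def apply_shared_capacity(stock: List[int], orders: List[int], capacity: int) -> List[int]:
    """Hypothetical rationing rule: scale orders down to fit the shared capacity.

    stock[i]  - units of SKU i currently held
    orders[i] - units of SKU i requested for replenishment
    capacity  - total units the shared warehouse can hold
    """
    free = max(capacity - sum(stock), 0)  # space left for all SKUs together
    requested = sum(orders)
    if requested <= free:
        return list(orders)  # constraint is inactive: SKUs stay independent
    # Constraint is active: each SKU gets a proportional share (rounded down).
    return [order * free // requested for order in orders]


# SKU 0 always requests 20 units; only SKU 1's request changes.
modest = apply_shared_capacity([30, 50], [20, 20], capacity=100)
greedy = apply_shared_capacity([30, 50], [20, 60], capacity=100)
print(modest, greedy)
```

In this sketch, the amount SKU 0 actually receives depends on what SKU 1 requests, even though SKU 0's own request is unchanged. That dependence is the kind of coupling the shared-resource constraint introduces.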
Related papers
- InvAgent: A Large Language Model based Multi-Agent System for Inventory Management in Supply Chains [0.0]
This study introduces a novel approach using large language models (LLMs) to manage multi-agent inventory systems.
Our model, InvAgent, enhances resilience and improves efficiency across the supply chain network.
arXiv Detail & Related papers (2024-07-16T04:55:17Z)
- A Distributional Analogue to the Successor Representation [54.99439648059807]
This paper contributes a new approach for distributional reinforcement learning.
It elucidates a clean separation of transition structure and reward in the learning process.
As an illustration, we show that it enables zero-shot risk-sensitive policy evaluation.
arXiv Detail & Related papers (2024-02-13T15:35:24Z)
- Decentralised Q-Learning for Multi-Agent Markov Decision Processes with a Satisfiability Criterion [0.0]
We propose a reinforcement learning algorithm to solve a multi-agent Markov decision process (MMDP).
The goal is to lower the time average cost of each agent to below a pre-specified agent-specific bound.
arXiv Detail & Related papers (2023-11-21T13:56:44Z)
- Provable Benefits of Multi-task RL under Non-Markovian Decision Making Processes [56.714690083118406]
In multi-task reinforcement learning (RL) under Markov decision processes (MDPs), the presence of shared latent structures has been shown to yield significant benefits to the sample efficiency compared to single-task RL.
We investigate whether such a benefit can extend to more general sequential decision making problems, such as partially observable MDPs (POMDPs) and more general predictive state representations (PSRs).
We propose a provably efficient algorithm UMT-PSR for finding near-optimal policies for all PSRs, and demonstrate that the advantage of multi-task learning manifests if the joint model class of PSR
arXiv Detail & Related papers (2023-10-20T14:50:28Z)
- MARLIM: Multi-Agent Reinforcement Learning for Inventory Management [1.1470070927586016]
This paper presents a novel reinforcement learning framework called MARLIM to address the inventory management problem.
Within this context, controllers are developed through single or multiple agents in a cooperative setting.
Numerical experiments on real data demonstrate the benefits of reinforcement learning methods over traditional baselines.
arXiv Detail & Related papers (2023-08-03T09:31:45Z)
- Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL).
It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
- Comparing Deep Reinforcement Learning Algorithms in Two-Echelon Supply Chains [1.4685355149711299]
We analyze and compare the performance of state-of-the-art deep reinforcement learning algorithms for solving the supply chain inventory management problem.
This study provides detailed insight into the design and development of an open-source software library that provides a customizable environment for solving the supply chain inventory management problem.
arXiv Detail & Related papers (2022-04-20T16:33:01Z)
- Controllable Summarization with Constrained Markov Decision Process [50.04321779376415]
We study controllable text summarization which allows users to gain control on a particular attribute.
We propose a novel training framework based on Constrained Markov Decision Process (CMDP).
Our framework can be applied to control important attributes of summarization, including length, covered entities, and abstractiveness.
arXiv Detail & Related papers (2021-08-07T09:12:53Z)
- Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? [100.48692829396778]
Independent PPO (IPPO) is a form of independent learning in which each agent simply estimates its local value function.
IPPO's strong performance may be due to its robustness to some forms of environment non-stationarity.
arXiv Detail & Related papers (2020-11-18T20:29:59Z)
- Reinforcement Learning for Multi-Product Multi-Node Inventory Management in Supply Chains [17.260459603456745]
This paper describes the application of reinforcement learning (RL) to multi-product inventory management in supply chains.
Experiments show that the proposed approach is able to handle a multi-objective reward comprised of maximising product sales and minimising wastage of perishable products.
arXiv Detail & Related papers (2020-06-07T04:02:59Z)