Cooperative Multi-Agent Reinforcement Learning for Inventory Management
- URL: http://arxiv.org/abs/2304.08769v1
- Date: Tue, 18 Apr 2023 06:55:59 GMT
- Title: Cooperative Multi-Agent Reinforcement Learning for Inventory Management
- Authors: Madhav Khirwar, Karthik S. Gurumoorthy, Ankit Ajit Jain, Shantala
Manchenahally
- Abstract summary: Reinforcement Learning (RL) for inventory management is a nascent field of research.
We present a system with a custom GPU-parallelized environment that consists of one warehouse and multiple stores.
We achieve a system that outperforms standard inventory control policies.
- Score: 0.5276232626689566
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: With Reinforcement Learning (RL) for inventory management (IM) being a
nascent field of research, approaches tend to be limited to simple, linear
environments with implementations that are minor modifications of off-the-shelf
RL algorithms. Scaling these simplistic environments to a real-world supply
chain comes with a few challenges such as: minimizing the computational
requirements of the environment, specifying agent configurations that are
representative of dynamics at real world stores and warehouses, and specifying
a reward framework that encourages desirable behavior across the whole supply
chain. In this work, we present a system with a custom GPU-parallelized
environment that consists of one warehouse and multiple stores, a novel
architecture for agent-environment dynamics incorporating enhanced state and
action spaces, and a shared reward specification that seeks to optimize for a
large retailer's supply chain needs. Each vertex in the supply chain graph is
an independent agent that, based on its own inventory, is able to place
replenishment orders with the vertex upstream. The warehouse agent, aside from
placing orders from the supplier, has the special property of also being able
to constrain replenishment to stores downstream, which results in it learning
an additional allocation sub-policy. We achieve a system that outperforms
standard inventory control policies such as a base-stock policy and other
RL-based specifications for a single product, and lay out a future direction of work
for multiple products.
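The paper does not publish reference code, so the following is a minimal, hypothetical sketch of the structure the abstract describes: one warehouse vertex and several store vertices, each placing replenishment orders upstream, with the warehouse additionally constraining (allocating) what it ships downstream and a single shared reward computed across the chain. All names, capacities, and cost weights are illustrative assumptions, and this single-instance NumPy version stands in for the GPU-parallelized environment.
```python
import numpy as np

N_STORES = 3                 # assumed number of downstream stores
HOLDING_COST = 0.1           # assumed per-unit holding cost
LOST_SALE_PENALTY = 1.0      # assumed per-unit lost-sales penalty

# Inventory state: index 0 is the warehouse, indices 1..N_STORES are stores.
inventory = np.array([200.0] + [50.0] * N_STORES)

def step(store_orders, warehouse_order, allocation_caps, demand):
    """One period of the warehouse + multi-store chain (illustrative only).

    store_orders    : replenishment quantities requested by each store
    warehouse_order : quantity the warehouse orders from the external supplier
    allocation_caps : per-store shipment caps chosen by the warehouse's
                      allocation sub-policy
    demand          : realized customer demand at each store
    """
    global inventory

    # The warehouse constrains downstream replenishment via its allocation caps.
    shipped = np.minimum(store_orders, allocation_caps)

    # It also cannot ship more than it has on hand; ration proportionally if needed.
    total_shipped = shipped.sum()
    if total_shipped > inventory[0]:
        shipped *= inventory[0] / total_shipped
        total_shipped = shipped.sum()

    # Move stock: supplier -> warehouse, warehouse -> stores.
    inventory[0] += warehouse_order - total_shipped
    inventory[1:] += shipped

    # Stores serve customer demand; unmet demand is lost.
    sales = np.minimum(demand, inventory[1:])
    inventory[1:] -= sales
    lost = demand - sales

    # Shared reward across the whole chain (weights are assumptions).
    return sales.sum() - HOLDING_COST * inventory.sum() - LOST_SALE_PENALTY * lost.sum()

# Example period: each store requests 20 units, the warehouse orders 60 from
# the supplier and caps every store's shipment at 15 units.
r = step(store_orders=np.full(N_STORES, 20.0),
         warehouse_order=60.0,
         allocation_caps=np.full(N_STORES, 15.0),
         demand=np.array([18.0, 12.0, 25.0]))
```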
Related papers
- ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization [11.620274237352026]
Offline reinforcement learning (RL) has garnered significant attention for its ability to learn effective policies from pre-collected datasets.
Offline multi-agent RL (MARL) presents additional challenges due to the large joint state-action space and the complexity of multi-agent behaviors.
We introduce a regularizer in the space of stationary distributions to better handle distributional shift.
arXiv Detail & Related papers (2024-10-02T18:56:10Z)
- Enhancing Supply Chain Visibility with Knowledge Graphs and Large Language Models [49.898152180805454]
This paper presents a novel framework leveraging Knowledge Graphs (KGs) and Large Language Models (LLMs) to enhance supply chain visibility.
Our zero-shot, LLM-driven approach automates the extraction of supply chain information from diverse public sources.
With high accuracy in NER and RE tasks, it provides an effective tool for understanding complex, multi-tiered supply networks.
arXiv Detail & Related papers (2024-08-05T17:11:29Z)
- REBEL: Reinforcement Learning via Regressing Relative Rewards [59.68420022466047]
We propose REBEL, a minimalist RL algorithm for the era of generative models.
In theory, we prove that fundamental RL algorithms like Natural Policy Gradient can be seen as variants of REBEL.
We find that REBEL provides a unified approach to language modeling and image generation with stronger or similar performance as PPO and DPO.
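As a rough illustration of the "regressing relative rewards" idea named in the title (our reading of the abstract, not the authors' code), the per-pair objective can be sketched as a least-squares regression of the reward gap between two sampled responses onto the gap in their policy log-probability ratios; the tensor names and eta are assumptions.
```python
import torch

def rebel_pair_loss(logp_new_a, logp_old_a, logp_new_b, logp_old_b,
                    reward_a, reward_b, eta=1.0):
    # Predicted relative reward from the change in log-probability ratios
    # between two sampled responses a and b (eta is an assumed scale).
    pred = ((logp_new_a - logp_old_a) - (logp_new_b - logp_old_b)) / eta
    target = reward_a - reward_b
    return ((pred - target) ** 2).mean()
```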
arXiv Detail & Related papers (2024-04-25T17:20:45Z)
- MARLIM: Multi-Agent Reinforcement Learning for Inventory Management [1.1470070927586016]
This paper presents a novel reinforcement learning framework called MARLIM to address the inventory management problem.
Within this context, controllers are developed through single or multiple agents in a cooperative setting.
Numerical experiments on real data demonstrate the benefits of reinforcement learning methods over traditional baselines.
arXiv Detail & Related papers (2023-08-03T09:31:45Z)
- Neural Inventory Control in Networks via Hindsight Differentiable Policy Optimization [5.590976834881065]
We argue that inventory management presents unique opportunities for reliably applying and evaluating deep reinforcement learning (DRL) algorithms.
The first is Hindsight Differentiable Policy Optimization (HDPO), which performs gradient descent to optimize policy performance.
The second technique involves aligning policy (neural) network structures with the structure of the inventory network.
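A hypothetical sketch of the first technique as summarized above: the inventory cost over a sampled ("hindsight") demand trace is simulated with differentiable operations, so plain gradient descent flows through the simulator into the policy parameters. Network size, cost coefficients, and the demand model below are assumptions, and the second technique (aligning network structure with the inventory network) is not shown.
```python
import torch
import torch.nn as nn

# Tiny policy: maps the current inventory position to a nonnegative order quantity.
policy = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Softplus())
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

T, HOLD, BACKLOG = 52, 1.0, 5.0          # assumed horizon and cost weights
demand = torch.rand(T) * 10              # assumed demand sample

for _ in range(200):                     # arbitrary number of gradient steps
    inv = torch.zeros(())
    cost = torch.zeros(())
    for t in range(T):
        order = policy(inv.reshape(1, 1)).squeeze()
        inv = inv + order - demand[t]    # backlogged unmet demand (negative inventory)
        cost = cost + HOLD * torch.relu(inv) + BACKLOG * torch.relu(-inv)
    opt.zero_grad()
    cost.backward()                      # backprop through the simulated horizon
    opt.step()
```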
arXiv Detail & Related papers (2023-06-20T02:58:25Z)
- No-Regret Learning in Two-Echelon Supply Chain with Unknown Demand Distribution [48.27759561064771]
We consider the two-echelon supply chain model introduced in [Cachon and Zipkin, 1999] under two different settings.
We design algorithms that achieve favorable guarantees for both regret and convergence to the optimal inventory decision in both settings.
Our algorithms are based on Online Gradient Descent and Online Newton Step, together with several new ingredients specifically designed for our problem.
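As a generic illustration of the Online Gradient Descent building block mentioned above (not the paper's actual two-echelon algorithm), a single base-stock level can be updated with the newsvendor subgradient after each observed demand; cost coefficients, demand distribution, and step size are assumptions.
```python
import numpy as np

rng = np.random.default_rng(0)
h, p = 1.0, 4.0          # assumed holding and shortage costs per unit
s, eta = 10.0, 0.5       # initial base-stock level and step size (assumptions)

for t in range(1, 201):
    demand = rng.gamma(shape=2.0, scale=5.0)        # demand distribution unknown to the learner
    # Subgradient of the newsvendor cost h*(s - D)^+ + p*(D - s)^+ at the observed demand.
    grad = h if s > demand else -p
    s = max(s - (eta / np.sqrt(t)) * grad, 0.0)     # projected OGD step onto s >= 0
```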
arXiv Detail & Related papers (2022-10-23T08:45:39Z)
- Concepts and Algorithms for Agent-based Decentralized and Integrated Scheduling of Production and Auxiliary Processes [78.120734120667]
This paper describes an agent-based decentralized and integrated scheduling approach.
Part of the requirements is to develop a linearly scaling communication architecture.
The approach is explained using an example based on industrial requirements.
arXiv Detail & Related papers (2022-05-06T18:44:29Z)
- Control of Dual-Sourcing Inventory Systems using Recurrent Neural Networks [0.0]
We show that the proposed neural network controllers (NNCs) are able to learn near-optimal policies for commonly used instances within a few minutes of CPU time.
Our research opens up new ways of efficiently managing complex, high-dimensional inventory dynamics.
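A minimal, hypothetical sketch of what such a neural controller could look like for dual sourcing: a recurrent cell tracks the inventory and pipeline history, and a head emits nonnegative order quantities for the fast (expensive) and slow (cheap) suppliers. The architecture, sizes, and names are assumptions, not the paper's implementation.
```python
import torch
import torch.nn as nn

class DualSourcingController(nn.Module):
    """Hypothetical recurrent controller: state history -> (fast order, slow order)."""
    def __init__(self, state_dim=4, hidden_dim=32):
        super().__init__()
        self.cell = nn.GRUCell(state_dim, hidden_dim)
        self.head = nn.Sequential(nn.Linear(hidden_dim, 2), nn.Softplus())

    def forward(self, state, hidden):
        hidden = self.cell(state, hidden)
        orders = self.head(hidden)        # column 0: fast supplier, column 1: slow supplier
        return orders, hidden

controller = DualSourcingController()
hidden = torch.zeros(1, 32)
orders, hidden = controller(torch.zeros(1, 4), hidden)   # one decision step
```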
arXiv Detail & Related papers (2022-01-16T19:44:06Z)
- Creating Training Sets via Weak Indirect Supervision [66.77795318313372]
Weak Supervision (WS) frameworks synthesize training labels from multiple potentially noisy supervision sources.
We formulate Weak Indirect Supervision (WIS), a new research problem for automatically synthesizing training labels.
We develop a probabilistic modeling approach, PLRM, which uses user-provided label relations to model and leverage indirect supervision sources.
arXiv Detail & Related papers (2021-10-07T14:09:35Z)
- Will bots take over the supply chain? Revisiting Agent-based supply chain automation [71.77396882936951]
Agent-based supply chains have been proposed since the early 2000s, but industrial uptake has been lagging.
We find that agent-based technology has matured, and other supporting technologies that are penetrating supply chains are filling in gaps.
For example, the ubiquity of IoT technology helps agents "sense" the state of affairs in a supply chain and opens up new possibilities for automation.
arXiv Detail & Related papers (2021-09-03T18:44:26Z)
- Reinforcement Learning for Multi-Product Multi-Node Inventory Management in Supply Chains [17.260459603456745]
This paper describes the application of reinforcement learning (RL) to multi-product inventory management in supply chains.
Experiments show that the proposed approach is able to handle a multi-objective reward comprised of maximising product sales and minimising wastage of perishable products.
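For concreteness, a multi-objective reward of the kind described (maximize product sales, minimize wastage of perishables) might be combined as a weighted sum; the weights and names below are purely illustrative assumptions.
```python
import numpy as np

def multi_objective_reward(units_sold, units_expired, w_sales=1.0, w_waste=2.0):
    # Weighted trade-off between revenue-driving sales and perishable wastage
    # summed over all products and nodes (weights are assumptions).
    return w_sales * np.sum(units_sold) - w_waste * np.sum(units_expired)
```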
arXiv Detail & Related papers (2020-06-07T04:02:59Z)