Deep Controlled Learning for Inventory Control
- URL: http://arxiv.org/abs/2011.15122v5
- Date: Mon, 25 Sep 2023 08:06:08 GMT
- Title: Deep Controlled Learning for Inventory Control
- Authors: Tarkan Temiz\"oz, Christina Imdahl, Remco Dijkman, Douniel
Lamghari-Idrissi, Willem van Jaarsveld
- Abstract summary: Controlled Deep Learning (DCL) is a new DRL framework based on approximate policy specifically designed to tackle inventory problems.
DCL outperforms existing state-of-the-art iterations in lost sales inventory control, perishable inventory systems, and inventory systems with random lead times.
These substantial performance and robustness improvements pave the way for the effective application of tailored DRL algorithms to inventory management problems.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Problem Definition: Are traditional deep reinforcement learning (DRL)
algorithms, developed for a broad range of purposes including game-play and
robotics, the most suitable machine learning algorithms for applications in
inventory control? To what extent would DRL algorithms tailored to the unique
characteristics of inventory control problems provide superior performance
compared to DRL and traditional benchmarks? Methodology/results: We propose and
study Deep Controlled Learning (DCL), a new DRL framework based on approximate
policy iteration specifically designed to tackle inventory problems.
Comparative evaluations reveal that DCL outperforms existing state-of-the-art
heuristics in lost sales inventory control, perishable inventory systems, and
inventory systems with random lead times, achieving lower average costs across
all test instances and maintaining an optimality gap of no more than 0.1\%.
Notably, the same hyperparameter set is utilized across all experiments,
underscoring the robustness and generalizability of the proposed method.
Managerial implications: These substantial performance and robustness
improvements pave the way for the effective application of tailored DRL
algorithms to inventory management problems, empowering decision-makers to
optimize stock levels, minimize costs, and enhance responsiveness across
various industries.
Related papers
- Comparison of Model Predictive Control and Proximal Policy Optimization for a 1-DOF Helicopter System [0.7499722271664147]
This study conducts a comparative analysis of Model Predictive Control (MPC) and Proximal Policy Optimization (PPO), a Deep Reinforcement Learning (DRL) algorithm, applied to a Quanser Aero 2 system.
PPO excels in rise-time and adaptability, making it a promising approach for applications requiring rapid response and adaptability.
arXiv Detail & Related papers (2024-08-28T08:35:34Z) - How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard practice for sequential decision-making problems by improving future acting policies with feedback.
Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement capabilities.
We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning.
arXiv Detail & Related papers (2024-02-25T20:07:13Z) - Deep reinforcement learning for machine scheduling: Methodology, the
state-of-the-art, and future directions [2.4541568670428915]
Machine scheduling aims to optimize job assignments to machines while adhering to manufacturing rules and job specifications.
Deep Reinforcement Learning (DRL), a key component of artificial general intelligence, has shown promise in various domains like gaming and robotics.
This paper offers a comprehensive review and comparison of DRL-based approaches, highlighting their methodology, applications, advantages, and limitations.
arXiv Detail & Related papers (2023-10-04T22:45:09Z) - Efficient Deep Reinforcement Learning Requires Regulating Overfitting [91.88004732618381]
We show that high temporal-difference (TD) error on the validation set of transitions is the main culprit that severely affects the performance of deep RL algorithms.
We show that a simple online model selection method that targets the validation TD error is effective across state-based DMC and Gym tasks.
arXiv Detail & Related papers (2023-04-20T17:11:05Z) - Benchmarking Actor-Critic Deep Reinforcement Learning Algorithms for
Robotics Control with Action Constraints [9.293472255463454]
This study presents a benchmark for evaluating action-constrained reinforcement learning (RL) algorithms.
We evaluate existing algorithms and their novel variants across multiple robotics control environments.
arXiv Detail & Related papers (2023-04-18T05:45:09Z) - Train Hard, Fight Easy: Robust Meta Reinforcement Learning [78.16589993684698]
A major challenge of reinforcement learning (RL) in real-world applications is the variation between environments, tasks or clients.
Standard MRL methods optimize the average return over tasks, but often suffer from poor results in tasks of high risk or difficulty.
In this work, we define a robust MRL objective with a controlled level.
The data inefficiency is addressed via the novel Robust Meta RL algorithm (RoML)
arXiv Detail & Related papers (2023-01-26T14:54:39Z) - A Transferable and Automatic Tuning of Deep Reinforcement Learning for
Cost Effective Phishing Detection [21.481974148873807]
Many challenging real-world problems require the deployment of ensembles multiple complementary learning models.
Deep Reinforcement Learning (DRL) offers a cost-effective alternative, where detectors are dynamically chosen based on the output of their predecessors.
arXiv Detail & Related papers (2022-09-19T14:09:07Z) - Towards Deployment-Efficient Reinforcement Learning: Lower Bound and
Optimality [141.89413461337324]
Deployment efficiency is an important criterion for many real-world applications of reinforcement learning (RL)
We propose a theoretical formulation for deployment-efficient RL (DE-RL) from an "optimization with constraints" perspective.
arXiv Detail & Related papers (2022-02-14T01:31:46Z) - Math Programming based Reinforcement Learning for Multi-Echelon
Inventory Management [1.9161790404101895]
Reinforcement learning has lead to considerable break-throughs in diverse areas such as robotics, games and many others.
But the application to RL in complex real-world decision making problems remains limited.
These characteristics make the problem considerably harder to solve for existing RL methods that rely on enumeration techniques to solve per step action problems.
We show that a properly selected discretization of the underlying uncertain distribution can yield near optimal actor policy even with very few samples from the underlying uncertainty.
We find that PARL outperforms commonly used base stock by 44.7% and the best performing RL method by up to 12.1% on average
arXiv Detail & Related papers (2021-12-04T01:40:34Z) - Critic Regularized Regression [70.8487887738354]
We propose a novel offline RL algorithm to learn policies from data using a form of critic-regularized regression (CRR)
We find that CRR performs surprisingly well and scales to tasks with high-dimensional state and action spaces.
arXiv Detail & Related papers (2020-06-26T17:50:26Z) - Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot
Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained policy optimization (CPPO)
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.