Solving the Pod Repositioning Problem with Deep Reinforced Adaptive Large Neighborhood Search
- URL: http://arxiv.org/abs/2506.02746v1
- Date: Tue, 03 Jun 2025 11:07:41 GMT
- Title: Solving the Pod Repositioning Problem with Deep Reinforced Adaptive Large Neighborhood Search
- Authors: Lin Xie, Hanyi Li,
- Abstract summary: The Pod Repositioning Problem (PRP) in Robotic Mobile Fulfillment Systems involves selecting optimal storage locations for pods returning from pick stations.<n>This work presents an improved solution method that integrates Large Neighborhood Search (ALNS) with Deep Reinforcement Learning (DRL)<n>A DRL agent dynamically selects destroy and repair operators and adjusts key parameters such as destruction degree and acceptance thresholds during the search.
- Score: 0.3867363075280544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Pod Repositioning Problem (PRP) in Robotic Mobile Fulfillment Systems (RMFS) involves selecting optimal storage locations for pods returning from pick stations. This work presents an improved solution method that integrates Adaptive Large Neighborhood Search (ALNS) with Deep Reinforcement Learning (DRL). A DRL agent dynamically selects destroy and repair operators and adjusts key parameters such as destruction degree and acceptance thresholds during the search. Specialized heuristics for both operators are designed to reflect PRP-specific characteristics, including pod usage frequency and movement costs. Computational results show that this DRL-guided ALNS outperforms traditional approaches such as cheapest-place, fixed-place, binary integer programming, and static heuristics. The method demonstrates strong solution quality and illustrating the benefit of learning-driven control within combinatorial optimization for warehouse systems.
Related papers
- Preference Optimization for Combinatorial Optimization Problems [54.87466279363487]
Reinforcement Learning (RL) has emerged as a powerful tool for neural optimization, enabling models learns that solve complex problems without requiring expert knowledge.<n>Despite significant progress, existing RL approaches face challenges such as diminishing reward signals and inefficient exploration in vast action spaces.<n>We propose Preference Optimization, a novel method that transforms quantitative reward signals into qualitative preference signals via statistical comparison modeling.
arXiv Detail & Related papers (2025-05-13T16:47:00Z) - Generative Diffusion Models for Resource Allocation in Wireless Networks [77.36145730415045]
We train a policy to imitate an expert and generate new samples from the optimal distribution.<n>We achieve near-optimal performance through the sequential execution of the generated samples.<n>We present numerical results in a case study of power control.
arXiv Detail & Related papers (2025-04-28T21:44:31Z) - Deep Reinforcement Learning for Dynamic Order Picking in Warehouse Operations [0.6116681488656472]
This study addresses the dynamic order picking problem, a significant concern in modern warehouse management.<n>We propose a Deep Reinforcement Learning framework tailored for single-block warehouses equipped with an autonomous picking device.<n>By dynamically optimizing picker routes, our approach significantly reduces order throughput times and unfulfilled orders.
arXiv Detail & Related papers (2024-08-03T03:56:46Z) - Reinforcement learning for anisotropic p-adaptation and error estimation in high-order solvers [0.37109226820205005]
We present a novel approach to automate and optimize anisotropic p-adaptation in high-order h/p using Reinforcement Learning (RL)
We develop an offline training approach, decoupled from the main solver, which shows minimal overcost when performing simulations.
We derive an inexpensive RL-based error estimation approach that enables the quantification of local discretization errors.
arXiv Detail & Related papers (2024-07-26T17:55:23Z) - Multiobjective Vehicle Routing Optimization with Time Windows: A Hybrid Approach Using Deep Reinforcement Learning and NSGA-II [52.083337333478674]
This paper proposes a weight-aware deep reinforcement learning (WADRL) approach designed to address the multiobjective vehicle routing problem with time windows (MOVRPTW)
The Non-dominated sorting genetic algorithm-II (NSGA-II) method is then employed to optimize the outcomes produced by the WADRL.
arXiv Detail & Related papers (2024-07-18T02:46:06Z) - Efficiently Training Deep-Learning Parametric Policies using Lagrangian Duality [55.06411438416805]
Constrained Markov Decision Processes (CMDPs) are critical in many high-stakes applications.<n>This paper introduces a novel approach, Two-Stage Deep Decision Rules (TS- DDR) to efficiently train parametric actor policies.<n>It is shown to enhance solution quality and to reduce computation times by several orders of magnitude when compared to current state-of-the-art methods.
arXiv Detail & Related papers (2024-05-23T18:19:47Z) - Variational Autoencoders for exteroceptive perception in reinforcement learning-based collision avoidance [0.0]
Deep Reinforcement Learning (DRL) has emerged as a promising control framework.
Current DRL algorithms require disproportionally large computational resources to find near-optimal policies.
This paper presents a comprehensive exploration of our proposed approach in maritime control systems.
arXiv Detail & Related papers (2024-03-31T09:25:28Z) - REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and human preferences can lead to catastrophic outcomes in the real world.<n>Recent methods aim to mitigate misalignment by learning reward functions from human preferences.<n>We propose a novel concept of reward regularization within the robotic RLHF framework.
arXiv Detail & Related papers (2023-12-22T04:56:37Z) - Online Control of Adaptive Large Neighborhood Search using Deep Reinforcement Learning [4.374837991804085]
We introduce a Deep Reinforcement Learning based approach called DR-ALNS that selects operators, adjusts parameters, and controls the acceptance criterion throughout the search.
We evaluate the proposed method on a problem with orienteering weights and time windows, as presented in an IJCAI competition.
The results show that our approach outperforms vanilla ALNS, ALNS tuned with Bayesian optimization, and two state-of-the-art DRL approaches.
arXiv Detail & Related papers (2022-11-01T21:33:46Z) - Distributed Multi-agent Meta Learning for Trajectory Design in Wireless
Drone Networks [151.27147513363502]
This paper studies the problem of the trajectory design for a group of energyconstrained drones operating in dynamic wireless network environments.
A value based reinforcement learning (VDRL) solution and a metatraining mechanism is proposed.
arXiv Detail & Related papers (2020-12-06T01:30:12Z) - Robust Optimal Transport with Applications in Generative Modeling and
Domain Adaptation [120.69747175899421]
Optimal Transport (OT) distances such as Wasserstein have been used in several areas such as GANs and domain adaptation.
We propose a computationally-efficient dual form of the robust OT optimization that is amenable to modern deep learning applications.
Our approach can train state-of-the-art GAN models on noisy datasets corrupted with outlier distributions.
arXiv Detail & Related papers (2020-10-12T17:13:40Z) - Stacked Auto Encoder Based Deep Reinforcement Learning for Online
Resource Scheduling in Large-Scale MEC Networks [44.40722828581203]
An online resource scheduling framework is proposed for minimizing the sum of weighted task latency for all the Internet of things (IoT) users.
A deep reinforcement learning (DRL) based solution is proposed, which includes the following components.
A preserved and prioritized experience replay (2p-ER) is introduced to assist the DRL to train the policy network and find the optimal offloading policy.
arXiv Detail & Related papers (2020-01-24T23:01:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.