One4Many-StablePacker: An Efficient Deep Reinforcement Learning Framework for the 3D Bin Packing Problem
- URL: http://arxiv.org/abs/2510.10057v1
- Date: Sat, 11 Oct 2025 06:47:49 GMT
- Title: One4Many-StablePacker: An Efficient Deep Reinforcement Learning Framework for the 3D Bin Packing Problem
- Authors: Lei Gao, Shihong Huang, Shengjie Wang, Hong Ma, Feng Zhang, Hengda Bao, Qichang Chen, Weihua Zhou
- Abstract summary: The three-dimensional bin packing problem (3D-BPP) is widely applied in logistics and warehousing. We propose a novel deep reinforcement learning framework, One4Many-StablePacker (O4M-SP). O4M-SP is able to handle various bin dimensions in a single training process while incorporating support and weight constraints common in practice.
- Score: 12.516955835907089
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The three-dimensional bin packing problem (3D-BPP) is widely applied in logistics and warehousing. Existing learning-based approaches often neglect practical stability-related constraints and exhibit limitations in generalizing across diverse bin dimensions. To address these limitations, we propose a novel deep reinforcement learning framework, One4Many-StablePacker (O4M-SP). The primary advantage of O4M-SP is its ability to handle various bin dimensions in a single training process while incorporating support and weight constraints common in practice. Our training method introduces two innovative mechanisms. First, it employs a weighted reward function that integrates loading rate and a new height difference metric for packing layouts, promoting improved bin utilization through flatter packing configurations. Second, it combines clipped policy gradient optimization with a tailored policy drifting method to mitigate policy entropy collapse, encouraging exploration at critical decision nodes during packing to avoid suboptimal solutions. Extensive experiments demonstrate that O4M-SP generalizes successfully across diverse bin dimensions and significantly outperforms baseline methods. Furthermore, O4M-SP exhibits strong practical applicability by effectively addressing packing scenarios with stability constraints.
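The abstract describes a weighted reward that combines loading rate with a height-difference metric to favor flatter layouts. The paper does not give the exact formula, so the sketch below is one plausible reading: `packing_reward`, the weights, and the mean-absolute-deviation surface metric are all assumptions for illustration, not the authors' definition.

```python
import numpy as np

def packing_reward(heightmap, placed_volume, bin_volume,
                   w_load=1.0, w_height=0.5):
    """Illustrative weighted reward: loading rate minus a penalty for
    uneven top surfaces. Flatter packings earn a higher reward for the
    same filled volume."""
    loading_rate = placed_volume / bin_volume            # fraction of bin filled
    # Assumed height-difference metric: mean absolute deviation of the
    # top surface from its average height, normalized to [0, 1].
    height_diff = np.abs(heightmap - heightmap.mean()).mean()
    height_diff /= max(heightmap.max(), 1e-8)
    return w_load * loading_rate - w_height * height_diff
```

With equal filled volume, a perfectly flat surface incurs zero penalty, while a jagged one is penalized in proportion to how far it deviates from flat, which is the qualitative behavior the abstract attributes to the reward.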
Related papers
- SlimPack: Fine-Grained Asymmetric Packing for Balanced and Efficient Variable-Length LLM Training [22.230495941666096]
We introduce SlimPack, a framework that fundamentally rethinks data packing and scheduling by decomposing samples into fine-grained slices. SlimPack mitigates critical memory and communication bottlenecks by transforming large, volatile workloads into a stream of smaller, manageable units. Asymmetric Partitioning assembles balanced scheduling units uniquely optimized for the different demands of the forward and backward passes.
arXiv Detail & Related papers (2025-09-30T13:37:48Z)
- Discrete-Guided Diffusion for Scalable and Safe Multi-Robot Motion Planning [56.240199425429445]
Multi-Robot Motion Planning (MRMP) involves generating trajectories for multiple robots operating in a shared continuous workspace. While discrete multi-agent path finding (MAPF) methods are broadly adopted due to their scalability, their coarse discretization limits trajectory quality. This paper tackles the limitations of both approaches by combining discrete MAPF solvers with constrained generative diffusion models.
arXiv Detail & Related papers (2025-08-27T17:59:36Z)
- I$^3$-MRec: Invariant Learning with Information Bottleneck for Incomplete Modality Recommendation [56.55935146424585]
We introduce I$^3$-MRec, which learns with the Information Bottleneck principle for Incomplete Modality Recommendation. By treating each modality as a distinct semantic environment, I$^3$-MRec employs invariant risk minimization (IRM) to learn preference-oriented representations. I$^3$-MRec consistently outperforms existing state-of-the-art MRS methods across various modality-missing scenarios.
arXiv Detail & Related papers (2025-08-06T09:29:50Z)
- Deliberate Planning of 3D Bin Packing on Packing Configuration Trees [65.05353662124676]
The Online 3D Bin Packing Problem (3D-BPP) has widespread applications in industrial automation. We propose to enhance the practical applicability of online 3D-BPP via learning on a novel hierarchical representation, the packing configuration tree (PCT). PCT is a full-fledged description of the state and action space of bin packing, which can support packing policy learning based on deep reinforcement learning (DRL).
arXiv Detail & Related papers (2025-04-06T09:07:10Z)
- Optimizing 2D+1 Packing in Constrained Environments Using Deep Reinforcement Learning [0.6827423171182154]
This paper proposes a novel approach based on deep reinforcement learning (DRL) for the 2D+1 packing problem with spatial constraints. A simulator using the OpenAI Gym framework has been developed to efficiently simulate the packing of rectangular pieces onto two boards with height constraints.
arXiv Detail & Related papers (2025-03-21T23:06:16Z)
- Training Deep Learning Models with Norm-Constrained LMOs [56.00317694850397]
We propose a new family of algorithms that uses the linear minimization oracle (LMO) to adapt to the geometry of the problem. We demonstrate significant speedups on nanoGPT training using our algorithm, Scion, without any reliance on Adam.
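The LMO primitive this abstract refers to is the Frank-Wolfe-style oracle: given a gradient, return the point in a norm ball that minimizes the linear form. The sketch below shows the well-known closed form for the $\ell_\infty$ ball only; the specific norms and update rule used by Scion are not given here, and `lmo_step` is a hypothetical illustration, not the paper's algorithm.

```python
import numpy as np

def lmo_linf(grad, radius=1.0):
    """LMO for the l-infinity ball:
    argmin_{||s||_inf <= radius} <grad, s> = -radius * sign(grad)."""
    return -radius * np.sign(grad)

def lmo_step(params, grad, lr=0.1, radius=1.0):
    # Hypothetical update: move along the LMO direction instead of the
    # raw negative gradient, so step geometry follows the chosen norm.
    return params + lr * lmo_linf(grad, radius)
```

The key point is that the update direction depends only on the sign pattern of the gradient (for this norm), which is how such methods adapt to problem geometry without per-coordinate moment estimates as in Adam.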
arXiv Detail & Related papers (2025-02-11T13:10:34Z)
- One-Shot Safety Alignment for Large Language Models via Optimal Dualization [64.52223677468861]
This paper presents a perspective of dualization that reduces constrained alignment to an equivalent unconstrained alignment problem.
We do so by pre-optimizing a smooth and convex dual function that has a closed form.
Our strategy leads to two practical algorithms in model-based and preference-based settings.
arXiv Detail & Related papers (2024-05-29T22:12:52Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- Hybrid Approach for Solving Real-World Bin Packing Problem Instances Using Quantum Annealers [0.8434687648198277]
We introduce a hybrid quantum-classical framework for solving real-world three-dimensional Bin Packing Problems (Q4RealBPP). Q4RealBPP permits the solving of real-world oriented instances of 3dBPP, accounting for restrictions valued by the industrial and logistics sectors.
arXiv Detail & Related papers (2023-03-01T14:04:50Z)
- Penalized Proximal Policy Optimization for Safe Reinforcement Learning [68.86485583981866]
We propose Penalized Proximal Policy Optimization (P3O), which solves the cumbersome constrained policy iteration via a single minimization of an equivalent unconstrained problem.
P3O utilizes a simple-yet-effective penalty function to eliminate cost constraints and removes the trust-region constraint by the clipped surrogate objective.
We show that P3O outperforms state-of-the-art algorithms with respect to both reward improvement and constraint satisfaction on a set of constrained locomotion tasks.
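The P3O summary describes replacing the constrained policy iteration with a single unconstrained minimization: a clipped surrogate for reward plus a penalty that activates only when the cost constraint is violated. The exact objective is not reproduced here, so the sketch below is an assumed simplified form; `p3o_loss`, `cost_slack`, and the penalty weight `kappa` are illustrative names.

```python
import numpy as np

def p3o_loss(ratio, adv_r, adv_c, cost_slack, kappa=1.0, eps=0.2):
    """Sketch of a P3O-style penalized objective (form assumed from the
    abstract): PPO-style clipped surrogate for reward, plus a ReLU
    penalty on the cost surrogate that replaces the explicit constraint."""
    clipped = np.clip(ratio, 1 - eps, 1 + eps)
    # Standard clipped surrogate for the reward advantage (maximized).
    surr_r = np.minimum(ratio * adv_r, clipped * adv_r).mean()
    # Pessimistic clipped surrogate for the cost advantage.
    surr_c = np.maximum(ratio * adv_c, clipped * adv_c).mean()
    # cost_slack = current cost estimate minus the limit; positive means
    # the constraint is violated and the penalty switches on.
    penalty = np.maximum(0.0, surr_c + cost_slack)
    return -surr_r + kappa * penalty
```

When the policy is safely within its cost budget (`cost_slack` sufficiently negative), the penalty term vanishes and the objective reduces to the ordinary clipped PPO loss, which matches the abstract's claim of eliminating the trust-region machinery.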
arXiv Detail & Related papers (2022-05-24T06:15:51Z)
- Learning Practically Feasible Policies for Online 3D Bin Packing [36.33774915391967]
We tackle the Online 3D Bin Packing Problem, a challenging yet practically useful variant of the classical Bin Packing Problem.
Online 3D-BPP can be naturally formulated as a Markov Decision Process (MDP).
We adopt deep reinforcement learning, in particular, the on-policy actor-critic framework, to solve this MDP with constrained action space.
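A common way to realize a constrained action space in an actor-critic framework is to mask infeasible actions before sampling, so that unstable or out-of-bounds placements receive probability exactly zero. The sketch below illustrates that generic technique; the function name, shapes, and masking scheme are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def masked_policy(logits, feasible_mask):
    """Turn raw actor logits into a probability distribution over
    placements, assigning zero probability to infeasible actions."""
    masked = np.where(feasible_mask, logits, -np.inf)  # forbid infeasible actions
    masked -= masked.max()                             # numerical stability
    probs = np.exp(masked)                             # exp(-inf) -> 0
    return probs / probs.sum()
```

Because the mask is applied inside the softmax rather than after it, the distribution stays properly normalized over the feasible placements, and the policy gradient never pushes probability mass toward actions the constraints forbid.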
arXiv Detail & Related papers (2021-08-31T08:37:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.