One4Many-StablePacker: An Efficient Deep Reinforcement Learning Framework for the 3D Bin Packing Problem
- URL: http://arxiv.org/abs/2510.10057v1
- Date: Sat, 11 Oct 2025 06:47:49 GMT
- Title: One4Many-StablePacker: An Efficient Deep Reinforcement Learning Framework for the 3D Bin Packing Problem
- Authors: Lei Gao, Shihong Huang, Shengjie Wang, Hong Ma, Feng Zhang, Hengda Bao, Qichang Chen, Weihua Zhou
- Abstract summary: The three-dimensional bin packing problem (3D-BPP) is widely applied in logistics and warehousing. We propose a novel deep reinforcement learning framework, One4Many-StablePacker (O4M-SP). O4M-SP is able to handle various bin dimensions in a single training process while incorporating support and weight constraints common in practice.
- Score: 12.516955835907089
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The three-dimensional bin packing problem (3D-BPP) is widely applied in logistics and warehousing. Existing learning-based approaches often neglect practical stability-related constraints and exhibit limitations in generalizing across diverse bin dimensions. To address these limitations, we propose a novel deep reinforcement learning framework, One4Many-StablePacker (O4M-SP). The primary advantage of O4M-SP is its ability to handle various bin dimensions in a single training process while incorporating support and weight constraints common in practice. Our training method introduces two innovative mechanisms. First, it employs a weighted reward function that integrates loading rate and a new height difference metric for packing layouts, promoting improved bin utilization through flatter packing configurations. Second, it combines clipped policy gradient optimization with a tailored policy drifting method to mitigate policy entropy collapse, encouraging exploration at critical decision nodes during packing to avoid suboptimal solutions. Extensive experiments demonstrate that O4M-SP generalizes successfully across diverse bin dimensions and significantly outperforms baseline methods. Furthermore, O4M-SP exhibits strong practical applicability by effectively addressing packing scenarios with stability constraints.
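The abstract describes a weighted reward that combines loading rate with a height-difference metric to favor flatter layouts. The paper does not give the exact formula, so the sketch below is one plausible reading: `packing_reward`, the weights, and the mean-absolute-deviation surface metric are all assumptions for illustration, not the authors' definition.

```python
import numpy as np

def packing_reward(heightmap, placed_volume, bin_volume,
                   w_load=1.0, w_height=0.5):
    """Illustrative weighted reward: loading rate minus a penalty for
    uneven top surfaces. Flatter packings earn a higher reward for the
    same filled volume."""
    loading_rate = placed_volume / bin_volume            # fraction of bin filled
    # Assumed height-difference metric: mean absolute deviation of the
    # top surface from its average height, normalized to [0, 1].
    height_diff = np.abs(heightmap - heightmap.mean()).mean()
    height_diff /= max(heightmap.max(), 1e-8)
    return w_load * loading_rate - w_height * height_diff
```

With equal filled volume, a perfectly flat surface incurs zero penalty, while a jagged one is penalized in proportion to how far it deviates from flat, which is the qualitative behavior the abstract attributes to the reward.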
Related papers
- SlimPack: Fine-Grained Asymmetric Packing for Balanced and Efficient Variable-Length LLM Training [22.230495941666096]
We introduce SlimPack, a framework that fundamentally rethinks data packing and scheduling by decomposing samples into fine-grained slices. SlimPack mitigates critical memory and communication bottlenecks by transforming large, volatile workloads into a stream of smaller, manageable units. Asymmetric Partitioning assembles balanced scheduling units uniquely optimized for the different demands of the forward and backward passes.
arXiv Detail & Related papers (2025-09-30T13:37:48Z)
- Discrete-Guided Diffusion for Scalable and Safe Multi-Robot Motion Planning [56.240199425429445]
Multi-Robot Motion Planning (MRMP) involves generating trajectories for multiple robots operating in a shared continuous workspace. While discrete multi-agent path finding (MAPF) methods are broadly adopted due to their scalability, their coarse discretization limits trajectory quality. This paper tackles the limitations of both approaches by combining discrete MAPF solvers with constrained generative diffusion models.
arXiv Detail & Related papers (2025-08-27T17:59:36Z)
- I$^3$-MRec: Invariant Learning with Information Bottleneck for Incomplete Modality Recommendation [56.55935146424585]
We introduce I$^3$-MRec, which learns with the Information Bottleneck principle for Incomplete Modality Recommendation. By treating each modality as a distinct semantic environment, I$^3$-MRec employs invariant risk minimization (IRM) to learn preference-oriented representations. I$^3$-MRec consistently outperforms existing state-of-the-art MRS methods across various modality-missing scenarios.
arXiv Detail & Related papers (2025-08-06T09:29:50Z)
- Deliberate Planning of 3D Bin Packing on Packing Configuration Trees [65.05353662124676]
The Online 3D Bin Packing Problem (3D-BPP) has widespread applications in industrial automation. We propose to enhance the practical applicability of online 3D-BPP via learning on a novel hierarchical representation, the packing configuration tree (PCT). PCT is a full-fledged description of the state and action space of bin packing, which can support packing policy learning based on deep reinforcement learning (DRL).
arXiv Detail & Related papers (2025-04-06T09:07:10Z)
- Optimizing 2D+1 Packing in Constrained Environments Using Deep Reinforcement Learning [0.6827423171182154]
This paper proposes a novel approach based on deep reinforcement learning (DRL) for the 2D+1 packing problem with spatial constraints. A simulator using the OpenAI Gym framework has been developed to efficiently simulate the packing of rectangular pieces onto two boards with height constraints.
arXiv Detail & Related papers (2025-03-21T23:06:16Z)
- Training Deep Learning Models with Norm-Constrained LMOs [56.00317694850397]
We propose a new family of algorithms that uses the linear minimization oracle (LMO) to adapt to the geometry of the problem. We demonstrate significant speedups on nanoGPT training using our algorithm, Scion, without any reliance on Adam.
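The LMO primitive this abstract refers to is the Frank-Wolfe-style oracle: given a gradient, return the point in a norm ball that minimizes the linear form. The sketch below shows the well-known closed form for the $\ell_\infty$ ball only; the specific norms and update rule used by Scion are not given here, and `lmo_step` is a hypothetical illustration, not the paper's algorithm.

```python
import numpy as np

def lmo_linf(grad, radius=1.0):
    """LMO for the l-infinity ball:
    argmin_{||s||_inf <= radius} <grad, s> = -radius * sign(grad)."""
    return -radius * np.sign(grad)

def lmo_step(params, grad, lr=0.1, radius=1.0):
    # Hypothetical update: move along the LMO direction instead of the
    # raw negative gradient, so step geometry follows the chosen norm.
    return params + lr * lmo_linf(grad, radius)
```

The key point is that the update direction depends only on the sign pattern of the gradient (for this norm), which is how such methods adapt to problem geometry without per-coordinate moment estimates as in Adam.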
arXiv Detail & Related papers (2025-02-11T13:10:34Z)
- One-Shot Safety Alignment for Large Language Models via Optimal Dualization [64.52223677468861]
This paper presents a perspective of dualization that reduces constrained alignment to an equivalent unconstrained alignment problem.
We do so by pre-optimizing a smooth and convex dual function that has a closed form.
Our strategy leads to two practical algorithms in model-based and preference-based settings.
arXiv Detail & Related papers (2024-05-29T22:12:52Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- Hybrid Approach for Solving Real-World Bin Packing Problem Instances Using Quantum Annealers [0.8434687648198277]
We introduce a hybrid quantum-classical framework for solving real-world three-dimensional Bin Packing Problems (Q4RealBPP). Q4RealBPP permits the solving of real-world oriented instances of 3dBPP, accounting for restrictions valued by the industrial and logistics sectors.
arXiv Detail & Related papers (2023-03-01T14:04:50Z)
- Penalized Proximal Policy Optimization for Safe Reinforcement Learning [68.86485583981866]
We propose Penalized Proximal Policy Optimization (P3O), which solves the cumbersome constrained policy iteration via a single minimization of an equivalent unconstrained problem.
P3O utilizes a simple-yet-effective penalty function to eliminate cost constraints and removes the trust-region constraint by the clipped surrogate objective.
We show that P3O outperforms state-of-the-art algorithms with respect to both reward improvement and constraint satisfaction on a set of constrained locomotion tasks.
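The P3O summary describes replacing the constrained policy iteration with a single unconstrained minimization: a clipped surrogate for reward plus a penalty that activates only when the cost constraint is violated. The exact objective is not reproduced here, so the sketch below is an assumed simplified form; `p3o_loss`, `cost_slack`, and the penalty weight `kappa` are illustrative names.

```python
import numpy as np

def p3o_loss(ratio, adv_r, adv_c, cost_slack, kappa=1.0, eps=0.2):
    """Sketch of a P3O-style penalized objective (form assumed from the
    abstract): PPO-style clipped surrogate for reward, plus a ReLU
    penalty on the cost surrogate that replaces the explicit constraint."""
    clipped = np.clip(ratio, 1 - eps, 1 + eps)
    # Standard clipped surrogate for the reward advantage (maximized).
    surr_r = np.minimum(ratio * adv_r, clipped * adv_r).mean()
    # Pessimistic clipped surrogate for the cost advantage.
    surr_c = np.maximum(ratio * adv_c, clipped * adv_c).mean()
    # cost_slack = current cost estimate minus the limit; positive means
    # the constraint is violated and the penalty switches on.
    penalty = np.maximum(0.0, surr_c + cost_slack)
    return -surr_r + kappa * penalty
```

When the policy is safely within its cost budget (`cost_slack` sufficiently negative), the penalty term vanishes and the objective reduces to the ordinary clipped PPO loss, which matches the abstract's claim of eliminating the trust-region machinery.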
arXiv Detail & Related papers (2022-05-24T06:15:51Z)
- Learning Practically Feasible Policies for Online 3D Bin Packing [36.33774915391967]
We tackle the Online 3D Bin Packing Problem, a challenging yet practically useful variant of the classical Bin Packing Problem.
Online 3D-BPP can be naturally formulated as a Markov Decision Process (MDP).
We adopt deep reinforcement learning, in particular, the on-policy actor-critic framework, to solve this MDP with constrained action space.
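A common way to realize a constrained action space in an actor-critic framework is to mask infeasible actions before sampling, so that unstable or out-of-bounds placements receive probability exactly zero. The sketch below illustrates that generic technique; the function name, shapes, and masking scheme are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def masked_policy(logits, feasible_mask):
    """Turn raw actor logits into a probability distribution over
    placements, assigning zero probability to infeasible actions."""
    masked = np.where(feasible_mask, logits, -np.inf)  # forbid infeasible actions
    masked -= masked.max()                             # numerical stability
    probs = np.exp(masked)                             # exp(-inf) -> 0
    return probs / probs.sum()
```

Because the mask is applied inside the softmax rather than after it, the distribution stays properly normalized over the feasible placements, and the policy gradient never pushes probability mass toward actions the constraints forbid.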
arXiv Detail & Related papers (2021-08-31T08:37:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.