Learning Practically Feasible Policies for Online 3D Bin Packing
- URL: http://arxiv.org/abs/2108.13680v3
- Date: Fri, 2 Jun 2023 10:59:59 GMT
- Title: Learning Practically Feasible Policies for Online 3D Bin Packing
- Authors: Hang Zhao, Chenyang Zhu, Xin Xu, Hui Huang, Kai Xu
- Abstract summary: We tackle the Online 3D Bin Packing Problem, a challenging yet practically useful variant of the classical Bin Packing Problem.
Online 3D-BPP can be naturally formulated as a Markov Decision Process (MDP).
We adopt deep reinforcement learning, in particular, the on-policy actor-critic framework, to solve this MDP with a constrained action space.
- Score: 36.33774915391967
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We tackle the Online 3D Bin Packing Problem, a challenging yet practically
useful variant of the classical Bin Packing Problem. In this problem, items
are delivered to the agent without revealing the full sequence in advance. The
agent must pack each item into the target bin stably in its arrival order, with
no further adjustment permitted. Online 3D-BPP can be naturally formulated as a
Markov Decision Process (MDP). We adopt
deep reinforcement learning, in particular, the on-policy actor-critic
framework, to solve this MDP with a constrained action space. To learn a
practically feasible packing policy, we propose three critical designs. First,
we propose an online analysis of packing stability based on a novel stacking
tree. It attains a high analysis accuracy while reducing the computational
complexity from $O(N^2)$ to $O(N \log N)$, making it especially suited for RL
training. Second, we propose decoupled packing policy learning over the different
dimensions of placement, which enables high-resolution spatial discretization
and hence high packing precision. Third, we introduce a reward function that
directs the robot to place items in a far-to-near order, which simplifies
collision avoidance in the motion planning of the robotic arm. Furthermore, we
provide a comprehensive discussion of several key implementation issues.
Extensive evaluations demonstrate that our learned policy
outperforms the state-of-the-art methods significantly and is practically
usable for real-world applications.
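As a concrete illustration of solving this MDP with a constrained action space, the sketch below masks infeasible placements out of the policy logits before sampling in an on-policy actor-critic head. This is a minimal sketch assuming a discretized placement grid and a hypothetical `feasibility_mask` computed elsewhere; it is not the authors' released implementation.

```python
import torch
import torch.nn as nn


class MaskedActorCritic(nn.Module):
    """Actor-critic head with action masking over candidate placements."""

    def __init__(self, feat_dim: int, num_placements: int):
        super().__init__()
        self.actor = nn.Linear(feat_dim, num_placements)  # one logit per candidate placement
        self.critic = nn.Linear(feat_dim, 1)              # state-value estimate

    def forward(self, features: torch.Tensor, feasibility_mask: torch.Tensor):
        logits = self.actor(features)
        # Infeasible placements (mask == 0) get -inf logits so they are never sampled.
        logits = logits.masked_fill(feasibility_mask == 0, float("-inf"))
        dist = torch.distributions.Categorical(logits=logits)
        value = self.critic(features).squeeze(-1)
        return dist, value


# Usage sketch: sample a feasible placement and keep its log-probability
# for the on-policy policy-gradient loss.
# dist, value = model(state_features, mask)
# action = dist.sample()
# log_prob = dist.log_prob(action)
```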
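The stacking-tree stability analysis itself is not spelled out in the abstract; the snippet below is only a simplified heightmap-based support heuristic in the same spirit (checking how much of an item's footprint rests at its resting height), not the paper's O(N log N) stacking-tree algorithm. Function and parameter names are assumptions.

```python
import numpy as np


def is_placement_stable(heightmap: np.ndarray, x: int, y: int, l: int, w: int,
                        min_support_ratio: float = 0.6) -> bool:
    """Simplified support check for placing an l x w item at grid cell (x, y).

    The item rests at the maximum height under its footprint; we require that a
    minimum fraction of the footprint is supported at that height and that the
    footprint's center cell is among the supported cells.
    """
    footprint = heightmap[x:x + l, y:y + w]
    resting_height = footprint.max()
    supported = footprint == resting_height
    if supported.mean() < min_support_ratio:
        return False
    return bool(supported[l // 2, w // 2])
```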
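The decoupled policy learning can be pictured as factorizing the placement distribution per axis, so an L x W grid needs L + W logits instead of L x W joint actions, which is what makes a high-resolution spatial discretization affordable. Below is a hedged sketch with assumed module names, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


class DecoupledPlacementPolicy(nn.Module):
    """Sample the x coordinate first, then the y coordinate conditioned on x."""

    def __init__(self, feat_dim: int, grid_l: int, grid_w: int):
        super().__init__()
        self.x_head = nn.Linear(feat_dim, grid_l)
        self.x_embed = nn.Embedding(grid_l, feat_dim)  # conditions the y head on the chosen x
        self.y_head = nn.Linear(feat_dim, grid_w)

    def forward(self, features: torch.Tensor):
        x_dist = torch.distributions.Categorical(logits=self.x_head(features))
        x = x_dist.sample()
        y_dist = torch.distributions.Categorical(logits=self.y_head(features + self.x_embed(x)))
        y = y_dist.sample()
        # Joint log-probability factorizes: log p(x, y) = log p(x) + log p(y | x).
        return x, y, x_dist.log_prob(x) + y_dist.log_prob(y)
```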
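The far-to-near reward can be illustrated with a toy shaping term that favors deeper placements first, so the arm never has to reach over items it has already packed. This is an assumed form for illustration, not the paper's exact reward function.

```python
def far_to_near_bonus(placement_depth: int, bin_depth: int, weight: float = 0.1) -> float:
    """Toy shaping term: placement_depth = 0 is nearest to the robot,
    bin_depth - 1 is farthest; deeper placements earn a larger bonus."""
    return weight * placement_depth / max(bin_depth - 1, 1)


# A step reward could then combine space utilization with this bonus, e.g.
# reward = item_volume / bin_volume + far_to_near_bonus(y, bin_depth)
```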
Related papers
- Deep Reinforcement Learning for Traveling Purchaser Problems [63.37136587778153]
The traveling purchaser problem (TPP) is an important optimization problem with broad applications.
We propose a novel approach based on deep reinforcement learning (DRL), which addresses route construction and purchase planning separately.
By introducing a meta-learning strategy, the policy network can be trained stably on large-sized TPP instances.
arXiv Detail & Related papers (2024-04-03T05:32:10Z) - Neural Packing: from Visual Sensing to Reinforcement Learning [24.35678534893451]
We present a novel learning framework to solve the transport-and-packing (TAP) problem in 3D.
It constitutes a full solution pipeline, from partial observations of the input objects via RGBD sensing and recognition, through robotic motion planning, to a final box placement that yields a compact packing in the target container.
arXiv Detail & Related papers (2023-10-17T02:42:54Z) - When is Agnostic Reinforcement Learning Statistically Tractable? [76.1408672715773]
A new complexity measure, called the spanning capacity, depends solely on the set $\Pi$ and is independent of the MDP dynamics.
We show there exists a policy class $\Pi$ with a bounded spanning capacity that requires a superpolynomial number of samples to learn.
This reveals a surprising separation for learnability between generative access and online access models.
arXiv Detail & Related papers (2023-10-09T19:40:54Z) - Adjustable Robust Reinforcement Learning for Online 3D Bin Packing [11.157035538606968]
Current deep reinforcement learning methods for online 3D-BPP fail in real-world settings where some worst-case scenarios can materialize.
We propose an adjustable robust reinforcement learning (AR2L) framework that allows efficient adjustment of robustness weights.
Experiments demonstrate that AR2L is versatile in the sense that it improves policy robustness while maintaining an acceptable level of performance for the nominal case.
arXiv Detail & Related papers (2023-10-06T15:34:21Z) - Learning Physically Realizable Skills for Online Packing of General 3D Shapes [41.27652080050046]
We study the problem of learning online packing skills for irregular 3D shapes.
The goal is to consecutively move a sequence of 3D objects with arbitrary shapes into a designated container.
We take physical realizability into account, considering the physics dynamics and constraints of each placement.
arXiv Detail & Related papers (2022-12-05T08:23:39Z) - Planning Irregular Object Packing via Hierarchical Reinforcement Learning [85.64313062912491]
We propose a deep hierarchical reinforcement learning approach to plan packing sequence and placement for irregular objects.
We show that our approach can pack more objects with less time cost than the state-of-the-art packing methods of irregular objects.
arXiv Detail & Related papers (2022-11-17T07:16:37Z) - Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z) - POMP: Pomcp-based Online Motion Planning for active visual search in indoor environments [89.43830036483901]
We focus on the problem of learning an optimal policy for Active Visual Search (AVS) of objects in known indoor environments with an online setup.
Our POMP method uses as input the current pose of an agent and an RGB-D frame.
We validate our method on the publicly available AVD benchmark, achieving an average success rate of 0.76 with an average path length of 17.1.
arXiv Detail & Related papers (2020-09-17T08:23:50Z) - A Generalized Reinforcement Learning Algorithm for Online 3D Bin-Packing [7.79020719611004]
We propose a Deep Reinforcement Learning (Deep RL) algorithm for solving the online 3D bin packing problem.
The focus is on producing decisions that can be physically implemented by a robotic loading arm.
We show that the RL-based method outperforms state-of-the-art online bin packing methods in terms of empirical competitive ratio and volume efficiency.
arXiv Detail & Related papers (2020-07-01T13:02:04Z) - Online 3D Bin Packing with Constrained Deep Reinforcement Learning [27.656959508214193]
We solve a challenging yet practically useful variant of the 3D Bin Packing Problem (3D-BPP).
In our problem, the agent has limited information about the items to be packed into the bin, and an item must be packed immediately after its arrival without buffering or readjusting.
We propose an effective and easy-to-implement constrained deep reinforcement learning (DRL) method under the actor-critic framework.
arXiv Detail & Related papers (2020-06-26T13:28:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.