Learning Practically Feasible Policies for Online 3D Bin Packing
- URL: http://arxiv.org/abs/2108.13680v3
- Date: Fri, 2 Jun 2023 10:59:59 GMT
- Title: Learning Practically Feasible Policies for Online 3D Bin Packing
- Authors: Hang Zhao, Chenyang Zhu, Xin Xu, Hui Huang, Kai Xu
- Abstract summary: We tackle the Online 3D Bin Packing Problem, a challenging yet practically useful variant of the classical Bin Packing Problem.
Online 3D-BPP can be naturally formulated as a Markov Decision Process (MDP).
We adopt deep reinforcement learning, in particular, the on-policy actor-critic framework, to solve this MDP with a constrained action space.
- Score: 36.33774915391967
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We tackle the Online 3D Bin Packing Problem, a challenging yet practically
useful variant of the classical Bin Packing Problem. In this problem, items
are delivered to the agent without revealing the full sequence in advance. The
agent must pack each item into the target bin stably in its arrival order, with
no further adjustment permitted. Online 3D-BPP can be naturally formulated as a
Markov Decision Process (MDP). We adopt
deep reinforcement learning, in particular, the on-policy actor-critic
framework, to solve this MDP with a constrained action space. To learn a
practically feasible packing policy, we propose three critical designs. First,
we propose an online analysis of packing stability based on a novel stacking
tree. It attains a high analysis accuracy while reducing the computational
complexity from $O(N^2)$ to $O(N \log N)$, making it especially suited for RL
training. Second, we propose decoupled packing policy learning over the different
dimensions of placement, which enables high-resolution spatial discretization
and hence high packing precision. Third, we introduce a reward function that
directs the robot to place items in a far-to-near order, which simplifies
collision avoidance in the motion planning of the robotic arm. Furthermore, we
provide a comprehensive discussion of several key implementation issues.
Extensive evaluations demonstrate that our learned policy
outperforms the state-of-the-art methods significantly and is practically
usable for real-world applications.
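As a concrete illustration of solving this MDP with a constrained action space, the sketch below masks infeasible placements out of the policy logits before sampling in an on-policy actor-critic head. This is a minimal sketch assuming a discretized placement grid and a hypothetical `feasibility_mask` computed elsewhere; it is not the authors' released implementation.

```python
import torch
import torch.nn as nn


class MaskedActorCritic(nn.Module):
    """Actor-critic head with action masking over candidate placements."""

    def __init__(self, feat_dim: int, num_placements: int):
        super().__init__()
        self.actor = nn.Linear(feat_dim, num_placements)  # one logit per candidate placement
        self.critic = nn.Linear(feat_dim, 1)              # state-value estimate

    def forward(self, features: torch.Tensor, feasibility_mask: torch.Tensor):
        logits = self.actor(features)
        # Infeasible placements (mask == 0) get -inf logits so they are never sampled.
        logits = logits.masked_fill(feasibility_mask == 0, float("-inf"))
        dist = torch.distributions.Categorical(logits=logits)
        value = self.critic(features).squeeze(-1)
        return dist, value


# Usage sketch: sample a feasible placement and keep its log-probability
# for the on-policy policy-gradient loss.
# dist, value = model(state_features, mask)
# action = dist.sample()
# log_prob = dist.log_prob(action)
```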
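The stacking-tree stability analysis itself is not spelled out in the abstract; the snippet below is only a simplified heightmap-based support heuristic in the same spirit (checking how much of an item's footprint rests at its resting height), not the paper's O(N log N) stacking-tree algorithm. Function and parameter names are assumptions.

```python
import numpy as np


def is_placement_stable(heightmap: np.ndarray, x: int, y: int, l: int, w: int,
                        min_support_ratio: float = 0.6) -> bool:
    """Simplified support check for placing an l x w item at grid cell (x, y).

    The item rests at the maximum height under its footprint; we require that a
    minimum fraction of the footprint is supported at that height and that the
    footprint's center cell is among the supported cells.
    """
    footprint = heightmap[x:x + l, y:y + w]
    resting_height = footprint.max()
    supported = footprint == resting_height
    if supported.mean() < min_support_ratio:
        return False
    return bool(supported[l // 2, w // 2])
```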
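The decoupled policy learning can be pictured as factorizing the placement distribution per axis, so an L x W grid needs L + W logits instead of L x W joint actions, which is what makes a high-resolution spatial discretization affordable. Below is a hedged sketch with assumed module names, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


class DecoupledPlacementPolicy(nn.Module):
    """Sample the x coordinate first, then the y coordinate conditioned on x."""

    def __init__(self, feat_dim: int, grid_l: int, grid_w: int):
        super().__init__()
        self.x_head = nn.Linear(feat_dim, grid_l)
        self.x_embed = nn.Embedding(grid_l, feat_dim)  # conditions the y head on the chosen x
        self.y_head = nn.Linear(feat_dim, grid_w)

    def forward(self, features: torch.Tensor):
        x_dist = torch.distributions.Categorical(logits=self.x_head(features))
        x = x_dist.sample()
        y_dist = torch.distributions.Categorical(logits=self.y_head(features + self.x_embed(x)))
        y = y_dist.sample()
        # Joint log-probability factorizes: log p(x, y) = log p(x) + log p(y | x).
        return x, y, x_dist.log_prob(x) + y_dist.log_prob(y)
```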
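The far-to-near reward can be illustrated with a toy shaping term that favors deeper placements first, so the arm never has to reach over items it has already packed. This is an assumed form for illustration, not the paper's exact reward function.

```python
def far_to_near_bonus(placement_depth: int, bin_depth: int, weight: float = 0.1) -> float:
    """Toy shaping term: placement_depth = 0 is nearest to the robot,
    bin_depth - 1 is farthest; deeper placements earn a larger bonus."""
    return weight * placement_depth / max(bin_depth - 1, 1)


# A step reward could then combine space utilization with this bonus, e.g.
# reward = item_volume / bin_volume + far_to_near_bonus(y, bin_depth)
```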
Related papers
- Deep Reinforcement Learning for Traveling Purchaser Problems [63.37136587778153]
The traveling purchaser problem (TPP) is an important optimization problem with broad applications.
We propose a novel approach based on deep reinforcement learning (DRL), which addresses route construction and purchase planning separately.
By introducing a meta-learning strategy, the policy network can be trained stably on large-sized TPP instances.
arXiv Detail & Related papers (2024-04-03T05:32:10Z) - Neural Packing: from Visual Sensing to Reinforcement Learning [24.35678534893451]
We present a novel learning framework to solve the transport-and-packing (TAP) problem in 3D.
It constitutes a full solution pipeline, from partial observations of the input objects via RGBD sensing and recognition, through robotic motion planning, to a final box placement that yields a compact packing in the target container.
arXiv Detail & Related papers (2023-10-17T02:42:54Z) - When is Agnostic Reinforcement Learning Statistically Tractable? [76.1408672715773]
A new complexity measure, called the spanning capacity, depends solely on the set $\Pi$ and is independent of the MDP dynamics.
We show there exists a policy class $\Pi$ with a bounded spanning capacity that requires a superpolynomial number of samples to learn.
This reveals a surprising separation for learnability between generative access and online access models.
arXiv Detail & Related papers (2023-10-09T19:40:54Z) - Adjustable Robust Reinforcement Learning for Online 3D Bin Packing [11.157035538606968]
Current deep reinforcement learning methods for online 3D-BPP fail in real-world settings where some worst-case scenarios can materialize.
We propose an adjustable robust reinforcement learning (AR2L) framework that allows efficient adjustment of robustness weights.
Experiments demonstrate that AR2L is versatile in the sense that it improves policy robustness while maintaining an acceptable level of performance for the nominal case.
arXiv Detail & Related papers (2023-10-06T15:34:21Z) - Learning Physically Realizable Skills for Online Packing of General 3D Shapes [41.27652080050046]
We study the problem of learning online packing skills for irregular 3D shapes.
The goal is to consecutively move a sequence of 3D objects with arbitrary shapes into a designated container.
We take physical realizability into account, considering the physics dynamics and constraints of each placement.
arXiv Detail & Related papers (2022-12-05T08:23:39Z) - Planning Irregular Object Packing via Hierarchical Reinforcement Learning [85.64313062912491]
We propose a deep hierarchical reinforcement learning approach to plan packing sequence and placement for irregular objects.
We show that our approach can pack more objects with less time cost than the state-of-the-art packing methods of irregular objects.
arXiv Detail & Related papers (2022-11-17T07:16:37Z) - Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z) - POMP: Pomcp-based Online Motion Planning for active visual search in indoor environments [89.43830036483901]
We focus on the problem of learning an optimal policy for Active Visual Search (AVS) of objects in known indoor environments with an online setup.
Our POMP method uses as input the current pose of an agent and an RGB-D frame.
We validate our method on the publicly available AVD benchmark, achieving an average success rate of 0.76 with an average path length of 17.1.
arXiv Detail & Related papers (2020-09-17T08:23:50Z) - A Generalized Reinforcement Learning Algorithm for Online 3D Bin-Packing [7.79020719611004]
We propose a Deep Reinforcement Learning (Deep RL) algorithm for solving the online 3D bin packing problem.
The focus is on producing decisions that can be physically implemented by a robotic loading arm.
We show that the RL-based method outperforms state-of-the-art online bin packing methods in terms of empirical competitive ratio and volume efficiency.
arXiv Detail & Related papers (2020-07-01T13:02:04Z) - Online 3D Bin Packing with Constrained Deep Reinforcement Learning [27.656959508214193]
We solve a challenging yet practically useful variant of the 3D Bin Packing Problem (3D-BPP).
In our problem, the agent has limited information about the items to be packed into the bin, and an item must be packed immediately after its arrival without buffering or readjusting.
We propose an effective and easy-to-implement constrained deep reinforcement learning (DRL) method under the actor-critic framework.
arXiv Detail & Related papers (2020-06-26T13:28:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.