Related papers: Neural Packing: from Visual Sensing to Reinforcement Learning

URL: http://arxiv.org/abs/2311.09233v1
Date: Tue, 17 Oct 2023 02:42:54 GMT
Title: Neural Packing: from Visual Sensing to Reinforcement Learning
Authors: Juzhan Xu, Minglun Gong, Hao Zhang, Hui Huang, Ruizhen Hu
Abstract summary: We present a novel learning framework to solve the transport-and-packing (TAP) problem in 3D. It constitutes a full solution pipeline from partial observations of input objects via RGBD sensing and recognition to final box placement, via robotic motion planning, to arrive at a compact packing in a target container.
Score: 24.35678534893451
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present a novel learning framework to solve the transport-and-packing (TAP) problem in 3D. It constitutes a full solution pipeline from partial observations of input objects via RGBD sensing and recognition to final box placement, via robotic motion planning, to arrive at a compact packing in a target container. The technical core of our method is a neural network for TAP, trained via reinforcement learning (RL), to solve the NP-hard combinatorial optimization problem. Our network simultaneously selects an object to pack and determines the final packing location, based on a judicious encoding of the continuously evolving states of partially observed source objects and available spaces in the target container, using separate encoders both enabled with attention mechanisms. The encoded feature vectors are employed to compute the matching scores and feasibility masks of different pairings of box selection and available space configuration for packing strategy optimization. Extensive experiments, including ablation studies and physical packing execution by a real robot (Universal Robot UR5e), are conducted to evaluate our method in terms of its design choices, scalability, generalizability, and comparisons to baselines, including the most recent RL-based TAP solution. We also contribute the first benchmark for TAP which covers a variety of input settings and difficulty levels.
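The pairing step described in the abstract (matching scores plus feasibility masks over box/space pairs) can be illustrated with a minimal sketch. This is not the authors' network: the function name `select_pairing`, the scaled dot-product score, and the greedy argmax choice are all assumptions made for illustration, and the feature vectors stand in for the outputs of the attention-based encoders.

```python
import numpy as np

def select_pairing(box_feats, space_feats, feasible):
    """Pick one (box, space) pair from masked matching scores.

    box_feats:   (num_boxes, d) array, hypothetical encoded box features
    space_feats: (num_spaces, d) array, hypothetical encoded space features
    feasible:    (num_boxes, num_spaces) boolean feasibility mask
    """
    box_feats = np.asarray(box_feats, dtype=float)
    space_feats = np.asarray(space_feats, dtype=float)
    feasible = np.asarray(feasible, dtype=bool)
    if not feasible.any():
        raise ValueError("no feasible box/space pairing")

    # Assumed score: scaled dot product between every box and every space.
    d = box_feats.shape[1]
    scores = box_feats @ space_feats.T / np.sqrt(d)

    # Infeasible pairings get -inf so they receive zero probability.
    masked = np.where(feasible, scores, -np.inf)

    # Numerically stable softmax over all pairings jointly.
    weights = np.exp(masked - masked.max())
    probs = weights / weights.sum()

    # Greedy choice; an RL policy would typically sample from probs instead.
    box_idx, space_idx = np.unravel_index(np.argmax(probs), probs.shape)
    return (int(box_idx), int(space_idx)), probs
```

Normalising over all pairings jointly, rather than per box or per space, mirrors the abstract's statement that the network selects the object and its placement simultaneously; the mask guarantees that an infeasible pairing is never chosen.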

Related papers

Optimizing Cooperative Multi-Object Tracking using Graph Signal Processing [45.68287260385148]
This paper proposes a novel Cooperative MOT framework for tracking objects in 3D LiDAR scenes. By exploiting a fully connected graph topology defined by the detected bounding boxes, we employ the Graph Laplacian processing optimization technique. An extensive evaluation study has been conducted, using the real-world V2V4Real dataset.
arXiv Detail & Related papers (2025-06-11T07:21:58Z)
Automated and Holistic Co-design of Neural Networks and ASICs for Enabling In-Pixel Intelligence [4.063480188363124]
Extreme edge-AI systems, such as those in readout ASICs for radiation detection, must operate under stringent hardware constraints. Finding ideal solutions means identifying optimal AI and ASIC design choices from a design space that has explosively expanded.
arXiv Detail & Related papers (2024-07-18T17:58:05Z)
Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments [67.83787474506073]
We tackle the limitations of current LiDAR-based 3D object detection systems. We introduce a universal Find n' Propagate approach for 3D OV tasks. We achieve up to a 3.97-fold increase in Average Precision (AP) for novel object classes.
arXiv Detail & Related papers (2024-03-20T12:51:30Z)
Geometric-aware Pretraining for Vision-centric 3D Object Detection [77.7979088689944]
We propose a novel geometric-aware pretraining framework called GAPretrain. GAPretrain serves as a plug-and-play solution that can be flexibly applied to multiple state-of-the-art detectors. We achieve 46.2 mAP and 55.5 NDS on the nuScenes val set using the BEVFormer method, with a gain of 2.7 and 2.1 points, respectively.
arXiv Detail & Related papers (2023-04-06T14:33:05Z)
Task-Oriented Sensing, Computation, and Communication Integration for Multi-Device Edge AI [108.08079323459822]
This paper studies a new multi-device edge artificial intelligence (AI) system, which jointly exploits the AI model split inference and integrated sensing and communication (ISAC). We measure the inference accuracy by adopting an approximate but tractable metric, namely discriminant gain.
arXiv Detail & Related papers (2022-07-03T06:57:07Z)
Aligning Pretraining for Detection via Object-Level Contrastive Learning [57.845286545603415]
Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning. We argue that this could be sub-optimal and thus advocate a design principle which encourages alignment between the self-supervised pretext task and the downstream task. Our method, called Selective Object COntrastive learning (SoCo), achieves state-of-the-art results for transfer performance on COCO detection.
arXiv Detail & Related papers (2021-06-04T17:59:52Z)
TAP-Net: Transport-and-Pack using Reinforcement Learning [25.884588673613244]
We introduce the transport-and-pack (TAP) problem, a frequently encountered instance of real-world packing. We develop a neural optimization solution based on reinforcement learning. We show that our network generalizes well to larger problem instances, when trained on small-sized inputs.
arXiv Detail & Related papers (2020-09-03T06:20:17Z)
Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties. Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates. The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
A Generalized Reinforcement Learning Algorithm for Online 3D Bin-Packing [7.79020719611004]
We propose a Deep Reinforcement Learning (Deep RL) algorithm for solving the online 3D bin packing problem. The focus is on producing decisions that can be physically implemented by a robotic loading arm. We show that the RL-based method outperforms state-of-the-art online bin packing heuristics in terms of empirical competitive ratio and volume efficiency.
arXiv Detail & Related papers (2020-07-01T13:02:04Z)
MOPS-Net: A Matrix Optimization-driven Network for Task-Oriented 3D Point Cloud Downsampling [86.42733428762513]
MOPS-Net is a novel interpretable deep learning-based method for matrix optimization. We show that MOPS-Net can achieve favorable performance against state-of-the-art deep learning-based methods over various tasks.
arXiv Detail & Related papers (2020-05-01T14:01:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.