Parcel3D: Shape Reconstruction from Single RGB Images for Applications
in Transportation Logistics
- URL: http://arxiv.org/abs/2304.08994v1
- Date: Tue, 18 Apr 2023 13:55:51 GMT
- Title: Parcel3D: Shape Reconstruction from Single RGB Images for Applications
in Transportation Logistics
- Authors: Alexander Naumann, Felix Hertlein, Laura Dörr, Kai Furmans
- Abstract summary: We focus on enabling damage and tampering detection in logistics and tackle the problem of 3D shape reconstruction of potentially damaged parcels.
We present a novel synthetic dataset, named Parcel3D, that is based on the Google Scanned Objects (GSO) dataset.
We present a novel architecture called CubeRefine R-CNN, which combines estimating a 3D bounding box with an iterative mesh refinement.
- Score: 62.997667081978825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We focus on enabling damage and tampering detection in logistics and tackle
the problem of 3D shape reconstruction of potentially damaged parcels. As input
we utilize single RGB images, which corresponds to use-cases where only simple
handheld devices are available, e.g. for postmen during delivery or clients on
delivery. We present a novel synthetic dataset, named Parcel3D, that is based
on the Google Scanned Objects (GSO) dataset and consists of more than 13,000
images of parcels with full 3D annotations. The dataset contains intact, i.e.
cuboid-shaped, parcels and damaged parcels, which were generated in
simulations. We work towards detecting mishandling of parcels by presenting a
novel architecture called CubeRefine R-CNN, which combines estimating a 3D
bounding box with an iterative mesh refinement. We benchmark our approach on
Parcel3D and an existing dataset of cuboid-shaped parcels in real-world
scenarios. Our results show that while training on Parcel3D enables transfer
to the real world, enabling reliable deployment in real-world scenarios is
still challenging. CubeRefine R-CNN yields competitive performance in terms of
Mesh AP and is the only model that directly enables deformation assessment by
3D mesh comparison and tampering detection by comparing viewpoint invariant
parcel side surface representations. Dataset and code are available at
https://a-nau.github.io/parcel3d.
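The abstract mentions deformation assessment by 3D mesh comparison. As a rough illustration only, the sketch below scores how far a reconstructed shape deviates from an intact cuboid using a symmetric Chamfer distance over vertices. The choice of Chamfer distance, the helper names `cuboid_corners` and `chamfer_distance`, and the toy "damaged" shape are all assumptions made for this example; they are not taken from the paper or its code.

```python
import math
from itertools import product


def cuboid_corners(w, h, d):
    """Return the 8 corners of an axis-aligned cuboid centered at the origin."""
    return [(sx * w / 2, sy * h / 2, sz * d / 2)
            for sx, sy, sz in product((-1, 1), repeat=3)]


def chamfer_distance(a, b):
    """Symmetric Chamfer distance: mean nearest-neighbor distance, summed over both directions."""
    def one_way(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return one_way(a, b) + one_way(b, a)


# Intact parcel: a 2 x 1 x 1 cuboid (hypothetical dimensions).
intact = cuboid_corners(2.0, 1.0, 1.0)

# Hypothetical damaged parcel: one corner pushed inward by 0.3 along each axis.
damaged = [(x - 0.3, y - 0.3, z - 0.3) if (x, y, z) == (1.0, 0.5, 0.5) else (x, y, z)
           for x, y, z in intact]

score_intact = chamfer_distance(intact, intact)    # identical shapes give zero
score_damaged = chamfer_distance(intact, damaged)  # grows with the dent
print(score_intact, score_damaged)
```

In practice one would likely sample many points from the mesh surfaces rather than compare only vertices, and choose a damage threshold empirically; both are left open here.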
Related papers
- PokeFlex: A Real-World Dataset of Deformable Objects for Robotics [17.533143584534155]
PokeFlex is a dataset featuring real-world paired and annotated multimodal data that includes 3D textured meshes, point clouds, RGB images, and depth maps.
Such data can be leveraged for several downstream tasks such as online 3D mesh reconstruction.
We demonstrate a use case for the PokeFlex dataset in online 3D mesh reconstruction.
arXiv Detail & Related papers (2024-10-10T07:54:17Z)
- CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoors Object Detection from Multi-view Images [11.152821406076486]
CN-RMA is a novel approach for 3D indoor object detection from multi-view images.
Our method achieves state-of-the-art performance in 3D object detection from multi-view images.
arXiv Detail & Related papers (2024-03-07T03:59:47Z)
- CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds [55.44204039410225]
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
Our proposed method first generates some high-quality 3D proposals by leveraging the class-aware local group strategy on the object surface voxels.
To recover the features of missed voxels due to incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module.
arXiv Detail & Related papers (2022-10-09T13:38:48Z)
- Geometry-Contrastive Transformer for Generalized 3D Pose Transfer [95.56457218144983]
The intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism.
We propose a novel geometry-contrastive Transformer that has an efficient 3D structured perceiving ability to the global geometric inconsistencies.
We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
arXiv Detail & Related papers (2021-12-14T13:14:24Z)
- Probabilistic Vehicle Reconstruction Using a Multi-Task CNN [0.0]
We present a probabilistic approach for shape-aware 3D vehicle reconstruction from stereo images.
Specifically, we train a CNN that outputs probability distributions for the vehicle's orientation and for both vehicle keypoints and wireframe edges.
We show that our method achieves state-of-the-art results, evaluating our method on the challenging KITTI benchmark.
arXiv Detail & Related papers (2021-02-21T20:45:44Z)
- H3D: Benchmark on Semantic Segmentation of High-Resolution 3D Point Clouds and textured Meshes from UAV LiDAR and Multi-View-Stereo [4.263987603222371]
This paper introduces a 3D dataset which is unique in three ways.
It depicts the village of Hessigheim (Germany) henceforth referred to as H3D.
It is designed to promote research in the field of 3D data analysis on the one hand and to evaluate and rank emerging approaches on the other.
arXiv Detail & Related papers (2021-02-10T09:33:48Z)
- From Points to Multi-Object 3D Reconstruction [71.17445805257196]
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.
A keypoint detector localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes.
The presented approach performs lightweight reconstruction in a single stage; it is real-time capable, fully differentiable, and end-to-end trainable.
arXiv Detail & Related papers (2020-12-21T18:52:21Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
- Factor Graph based 3D Multi-Object Tracking in Point Clouds [8.411514688735183]
We propose a novel optimization-based approach that does not rely on explicit and fixed assignments.
We demonstrate its performance on the real world KITTI tracking dataset and achieve better results than many state-of-the-art algorithms.
arXiv Detail & Related papers (2020-08-12T13:34:46Z)
- DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes [54.239416488865565]
We propose a fast single-stage 3D object detection method for LIDAR data.
The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes.
We find that our proposed method improves on the state of the art by 5% on object detection in ScanNet scenes, and it achieves top results, by a margin of 3.4%, on the Open dataset.
arXiv Detail & Related papers (2020-04-02T17:48:50Z)