3D Annotation Of Arbitrary Objects In The Wild
- URL: http://arxiv.org/abs/2109.07165v1
- Date: Wed, 15 Sep 2021 09:00:56 GMT
- Title: 3D Annotation Of Arbitrary Objects In The Wild
- Authors: Kenneth Blomqvist, Julius Hietala
- Abstract summary: We propose a data annotation pipeline based on SLAM, 3D reconstruction, and 3D-to-2D geometry.
The pipeline allows creating 3D and 2D bounding boxes, along with per-pixel annotations of arbitrary objects.
Our results showcase almost 90% Intersection-over-Union (IoU) agreement on both semantic segmentation and 2D bounding box detection.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent years have produced a variety of learning based methods in the context
of computer vision and robotics. Most of the recently proposed methods are
based on deep learning, which requires very large amounts of data compared to
traditional methods. The performance of deep learning methods is largely
dependent on the data distribution they were trained on, and it is important to
use data from the robot's actual operating domain during training. Therefore,
it is not possible to rely on pre-built, generic datasets when deploying robots
in real environments, creating a need for efficient data collection and
annotation in the specific conditions in which the robots will operate. The
challenge is then: how do we reduce the cost of obtaining such datasets to a
point where we can easily deploy our robots in new conditions and environments,
and support new sensors? As an answer to this question, we propose a data
annotation pipeline based on SLAM, 3D reconstruction, and 3D-to-2D geometry.
The pipeline allows creating 3D and 2D bounding boxes, along with per-pixel
annotations of arbitrary objects without needing accurate 3D models of the
objects prior to data collection and annotation. Our results showcase almost
90% Intersection-over-Union (IoU) agreement on both semantic segmentation and
2D bounding box detection across a variety of objects and scenes, while
speeding up the annotation process by several orders of magnitude compared to
traditional manual annotation.
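
To make the 3D-to-2D geometry step concrete, below is a minimal sketch, assuming a known SLAM camera pose and pinhole intrinsics, of how a 3D bounding box annotated once in the reconstructed map can be re-projected into any frame, together with the 2D IoU metric used in the evaluation. This is an illustration, not the authors' code; all names and parameters are hypothetical.

```python
import numpy as np

def box_corners(center, size):
    """Eight corners of an axis-aligned 3D box in the world frame."""
    offsets = np.array([[dx, dy, dz]
                        for dx in (-0.5, 0.5)
                        for dy in (-0.5, 0.5)
                        for dz in (-0.5, 0.5)])
    return np.asarray(center) + offsets * np.asarray(size)

def project_points(points_world, T_world_cam, K):
    """Project 3D world points into pixel coordinates.

    T_world_cam: 4x4 camera-to-world pose (e.g., from SLAM).
    K: 3x3 pinhole intrinsics. Both are assumed known.
    """
    T_cam_world = np.linalg.inv(T_world_cam)
    pts_h = np.c_[points_world, np.ones(len(points_world))]
    pts_cam = (T_cam_world @ pts_h.T)[:3]        # 3xN in camera frame
    pix = K @ pts_cam
    return (pix[:2] / pix[2]).T                  # Nx2 pixel coordinates

def bbox_2d(pixels):
    """Tight 2D box (xmin, ymin, xmax, ymax) around projected corners."""
    return (*pixels.min(axis=0), *pixels.max(axis=0))

def iou_2d(a, b):
    """Intersection-over-Union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

# One annotated 3D box, re-projected through each SLAM pose, labels
# every frame of the sequence:
# box2d = bbox_2d(project_points(box_corners(c, s), pose, K))
```

Because the same 3D annotation re-projects into every frame of the sequence, a single manual label can yield thousands of 2D labels, which is where the claimed orders-of-magnitude speedup over frame-by-frame annotation comes from.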
Related papers
- DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z)
- Weakly Supervised 3D Object Detection with Multi-Stage Generalization [62.96670547848691]
We introduce BA$^2$-Det, encompassing pseudo-label generation and multi-stage generalization.
We develop three stages of generalization: progressing from complete to partial, static to dynamic, and close to distant.
BA$^2$-Det achieves a 20% relative improvement on the KITTI dataset.
arXiv Detail & Related papers (2023-06-08T17:58:57Z)
- DR-WLC: Dimensionality Reduction cognition for object detection and pose estimation by Watching, Learning and Checking [30.58114448119465]
Existing object detection and pose estimation methods mostly train on data of the same dimensionality as the target task.
DR-WLC, a dimensionality reduction cognitive model, can perform both object detection and pose estimation tasks at the same time.
arXiv Detail & Related papers (2023-01-17T15:08:32Z)
- Lifting 2D Object Locations to 3D by Discounting LiDAR Outliers across Objects and Views [70.1586005070678]
We present a system for automatically converting 2D mask object predictions and raw LiDAR point clouds into full 3D bounding boxes of objects.
Our method significantly outperforms previous work, even though those methods rely on far more complex pipelines, 3D models, and additional human-annotated sources of prior information.
arXiv Detail & Related papers (2021-09-16T13:01:13Z)
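
As a rough illustration of this kind of 2D-to-3D lifting, a sketch under assumptions and not the paper's actual method, one can keep the LiDAR points that project inside a predicted 2D mask and fit a box to the survivors, discounting stray returns with a simple percentile trim. The calibration matrices, the trim parameter, and all names here are hypothetical.

```python
import numpy as np

def lift_mask_to_box(points_lidar, mask, K, T_cam_lidar, trim=5.0):
    """Fit an axis-aligned 3D box to LiDAR points inside a 2D mask.

    mask: HxW boolean instance mask from a 2D detector.
    trim: percentile cut that crudely discounts outlier points
          (a stand-in for the paper's outlier discounting).
    All parameter names are illustrative, not from the paper.
    """
    pts_h = np.c_[points_lidar, np.ones(len(points_lidar))]
    pts_cam = (T_cam_lidar @ pts_h.T)[:3]          # 3xN, camera frame
    in_front = pts_cam[2] > 0                      # keep points ahead of camera
    pix = K @ pts_cam
    uv = np.round(pix[:2] / np.where(pix[2] == 0, 1e-9, pix[2])).astype(int).T
    h, w = mask.shape
    valid = (in_front
             & (uv[:, 0] >= 0) & (uv[:, 0] < w)
             & (uv[:, 1] >= 0) & (uv[:, 1] < h))
    hits = np.zeros(len(points_lidar), dtype=bool)
    hits[valid] = mask[uv[valid, 1], uv[valid, 0]]
    obj = points_lidar[hits]
    if len(obj) == 0:
        return None
    lo = np.percentile(obj, trim, axis=0)          # trim stray returns
    hi = np.percentile(obj, 100.0 - trim, axis=0)
    return lo, hi                                  # box min/max corners
```

The actual system additionally reasons across multiple objects and views to reject outliers; the percentile trim above is only the simplest stand-in for that step.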
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of synthetic datasets, which consist of CAD object models, to boost learning on real datasets.
However, recent work on 3D pre-training fails when transferring features learned on synthetic objects to real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
- REGRAD: A Large-Scale Relational Grasp Dataset for Safe and Object-Specific Robotic Grasping in Clutter [52.117388513480435]
We present a new dataset named REGRAD to support the modeling of relationships among objects and grasps.
Our dataset is collected in both forms of 2D images and 3D point clouds.
Users are free to import their own object models to generate as much data as they want.
arXiv Detail & Related papers (2021-04-29T05:31:21Z)
- Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z)
- Bridging the Reality Gap for Pose Estimation Networks using Sensor-Based Domain Randomization [1.4290119665435117]
Methods trained on synthetic data use 2D images, as domain randomization in 2D is more developed.
Our method integrates the 3D data into the network to increase the accuracy of the pose estimation.
Experiments on three large pose estimation benchmarks show that the presented method outperforms previous methods trained on synthetic data.
arXiv Detail & Related papers (2020-11-17T09:12:11Z)
- 3D for Free: Crossmodal Transfer Learning using HD Maps [36.70550754737353]
We leverage the large class-taxonomies of modern 2D datasets and the robustness of state-of-the-art 2D detection methods.
We mine a collection of 1151 unlabeled, multimodal driving logs from an autonomous vehicle.
We show that detector performance increases as we mine more unlabeled data.
arXiv Detail & Related papers (2020-08-24T17:54:51Z)
- Self-Supervised Object-in-Gripper Segmentation from Robotic Motions [27.915309216800125]
We propose a robust solution for learning to segment unknown objects grasped by a robot.
We exploit motion and temporal cues in RGB video sequences.
Our approach is fully self-supervised and independent of precise camera calibration, 3D models or potentially imperfect depth data.
arXiv Detail & Related papers (2020-02-11T15:44:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.