iDet3D: Towards Efficient Interactive Object Detection for LiDAR Point
Clouds
- URL: http://arxiv.org/abs/2312.15449v1
- Date: Sun, 24 Dec 2023 09:59:46 GMT
- Authors: Dongmin Choi, Wonwoo Cho, Kangyeol Kim, Jaegul Choo
- Abstract summary: We propose iDet3D, an efficient interactive 3D object detector.
iDet3D supports a user-friendly 2D interface, which can ease the cognitive burden of exploring 3D space.
We show that our method can construct precise annotations in a few clicks.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Accurately annotating multiple 3D objects in LiDAR scenes is laborious and
challenging. While a few previous studies have attempted to leverage
semi-automatic methods for cost-effective bounding box annotation, such methods
have limitations in efficiently handling numerous multi-class objects. To
effectively accelerate 3D annotation pipelines, we propose iDet3D, an efficient
interactive 3D object detector. iDet3D supports a user-friendly 2D interface
that eases the cognitive burden of exploring 3D space when providing click
interactions, enabling users to annotate all objects in each scene with minimal
interaction. Taking the sparse nature of 3D point clouds into
account, we design a negative click simulation (NCS) to improve accuracy by
reducing false-positive predictions. In addition, iDet3D incorporates two click
propagation techniques to take full advantage of user interactions: (1) dense
click guidance (DCG) for keeping user-provided information throughout the
network and (2) spatial click propagation (SCP) for detecting other instances
of the same class based on the user-specified objects. Through extensive
experiments, we demonstrate that our method can construct precise annotations
in a few clicks, showing its practicality as an efficient annotation tool for
3D object detection.
Related papers
- AGILE3D: Attention Guided Interactive Multi-object 3D Segmentation
We introduce AGILE3D, an efficient, attention-based model that supports simultaneous segmentation of multiple 3D objects.
Our core idea is to encode user clicks as spatial-temporal queries and enable explicit interactions between click queries and the 3D scene.
In experiments with four different 3D point cloud datasets, AGILE3D sets a new state-of-the-art.
arXiv Detail & Related papers (2023-06-01T17:59:10Z)
- CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection
We propose Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework.
Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene.
In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
arXiv Detail & Related papers (2022-09-13T05:26:09Z)
- Interactive Object Segmentation in 3D Point Clouds
We present an interactive 3D object segmentation method in which the user interacts directly with the 3D point cloud.
Our model does not require training data from the target domain.
It performs well on several other datasets with different data characteristics as well as different object classes.
arXiv Detail & Related papers (2022-04-14T18:31:59Z)
- Monocular Quasi-Dense 3D Object Tracking
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- PLUME: Efficient 3D Object Detection from Stereo Images
Existing methods tackle the problem in two steps: first, depth is estimated and a pseudo-LiDAR point cloud is computed from the depth estimates; then, object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
- Relation3DMOT: Exploiting Deep Affinity for 3D Multi-Object Tracking from View Aggregation
3D multi-object tracking plays a vital role in autonomous navigation.
Many approaches detect objects in 2D RGB sequences for tracking, which lacks reliability when localizing objects in 3D space.
We propose a novel convolutional operation, named RelationConv, to better exploit the correlation between each pair of objects in the adjacent frames.
arXiv Detail & Related papers (2020-11-25T16:14:40Z)
- D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features
We leverage a 3D fully convolutional network for 3D point clouds.
We propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point.
Our method achieves state-of-the-art results in both indoor and outdoor scenarios.
arXiv Detail & Related papers (2020-03-06T12:51:09Z)
- SESS: Self-Ensembling Semi-Supervised 3D Object Detection
We propose SESS, a self-ensembling semi-supervised 3D object detection framework. Specifically, we design a thorough perturbation scheme to enhance the network's generalization on unlabeled and unseen data.
Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method by using only 50% labeled data.
arXiv Detail & Related papers (2019-12-26T08:48:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.