Towards Auto-Annotation from Annotation Guidelines: A Benchmark through 3D LiDAR Detection
- URL: http://arxiv.org/abs/2506.02914v1
- Date: Tue, 03 Jun 2025 14:17:37 GMT
- Title: Towards Auto-Annotation from Annotation Guidelines: A Benchmark through 3D LiDAR Detection
- Authors: Yechi Ma, Wei Hua, Shu Kong
- Abstract summary: AnnoGuide aims to evaluate automated methods for data annotation directly from expert-defined annotation guidelines. It is a novel task of multi-modal few-shot 3D detection without 3D annotations. Our results highlight that AnnoGuide remains an open and challenging problem, underscoring the urgent need for developing LiDAR-based FMs.
- Score: 12.532548019177604
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A crucial yet under-appreciated prerequisite in machine learning solutions for real-world applications is data annotation: human annotators are hired to manually label data according to detailed, expert-crafted guidelines. This is often a laborious, tedious, and costly process. To study methods for facilitating data annotation, we introduce a new benchmark AnnoGuide: Auto-Annotation from Annotation Guidelines. It aims to evaluate automated methods for data annotation directly from expert-defined annotation guidelines, eliminating the need for manual labeling. As a case study, we repurpose the well-established nuScenes dataset, commonly used in autonomous driving research, which provides comprehensive annotation guidelines for labeling LiDAR point clouds with 3D cuboids across 18 object classes. These guidelines include a few visual examples and textual descriptions, but no labeled 3D cuboids in LiDAR data, making this a novel task of multi-modal few-shot 3D detection without 3D annotations. The advances of powerful foundation models (FMs) make AnnoGuide especially timely, as FMs offer promising tools to tackle its challenges. We employ a conceptually straightforward pipeline that (1) utilizes open-source FMs for object detection and segmentation in RGB images, (2) projects 2D detections into 3D using known camera poses, and (3) clusters LiDAR points within the frustum of each 2D detection to generate a 3D cuboid. Starting with a non-learned solution that leverages off-the-shelf FMs, we progressively refine key components and achieve significant performance improvements, boosting 3D detection mAP from 12.1 to 21.9! Nevertheless, our results highlight that AnnoGuide remains an open and challenging problem, underscoring the urgent need for developing LiDAR-based FMs. We release our code and models at GitHub: https://annoguide.github.io/annoguide3Dbenchmark
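Steps (2) and (3) of the pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `lift_box_to_cuboid` and `euclidean_cluster`, the parameters `eps` and `min_points`, and the choice of a simple region-growing Euclidean clustering are all assumptions made for illustration. The sketch also fits an axis-aligned cuboid without estimating heading, which is a simplification. It assumes a pinhole camera with intrinsics `K` and a 4x4 LiDAR-to-camera transform, and takes the 2D box from step (1) as given.

```python
import numpy as np


def euclidean_cluster(points, eps):
    """Label points by region growing: points within eps of each other share a label."""
    n = len(points)
    labels = np.full(n, -1, dtype=int)
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue  # already assigned to a cluster
        labels[seed] = current
        stack = [seed]
        while stack:
            i = stack.pop()
            dist = np.linalg.norm(points - points[i], axis=1)
            neighbors = np.flatnonzero((dist <= eps) & (labels == -1))
            labels[neighbors] = current
            stack.extend(neighbors.tolist())
        current += 1
    return labels


def lift_box_to_cuboid(points_lidar, T_cam_from_lidar, K, box_2d,
                       eps=0.7, min_points=5):
    """Return (center, size) of an axis-aligned cuboid in the LiDAR frame, or None.

    points_lidar: (N, 3) LiDAR points; T_cam_from_lidar: (4, 4) extrinsics;
    K: (3, 3) pinhole intrinsics; box_2d: (x1, y1, x2, y2) in pixels.
    """
    # Transform LiDAR points into the camera frame.
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_from_lidar @ pts_h.T).T[:, :3]
    in_front = pts_cam[:, 2] > 0.1  # drop points behind or at the camera

    # Project onto the image plane (depth is replaced by 1 for masked-out points).
    uvw = (K @ pts_cam.T).T
    z = np.where(in_front, uvw[:, 2], 1.0)
    u, v = uvw[:, 0] / z, uvw[:, 1] / z

    # Keep the points whose projection falls inside the 2D box (the frustum).
    x1, y1, x2, y2 = box_2d
    in_box = in_front & (u >= x1) & (u <= x2) & (v >= y1) & (v <= y2)
    frustum = points_lidar[in_box]
    if len(frustum) < min_points:
        return None

    # Cluster the frustum points and keep the largest cluster as the object.
    labels = euclidean_cluster(frustum, eps)
    cluster = frustum[labels == np.bincount(labels).argmax()]
    if len(cluster) < min_points:
        return None

    # Fit an axis-aligned cuboid around the cluster.
    lo, hi = cluster.min(axis=0), cluster.max(axis=0)
    return (lo + hi) / 2.0, hi - lo
```

Keeping only the largest cluster is one simple way to discard background points that fall inside the same frustum; it can fail when the object is sparsely sampled and the background is dense, which is one reason a learned or more carefully tuned component may be preferable in practice.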
Related papers
- ALPI: Auto-Labeller with Proxy Injection for 3D Object Detection using 2D Labels Only [5.699475977818167]
3D object detection plays a crucial role in various applications such as autonomous vehicles, robotics and augmented reality. We propose a weakly supervised 3D annotator that relies solely on 2D bounding box annotations from images, along with size priors.
arXiv Detail & Related papers (2024-07-24T11:58:31Z)
- Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance [72.6809373191638]
We propose a framework to study how to leverage constraints between 2D and 3D domains without requiring any 3D labels.
First, we design a feature-level constraint to align LiDAR and image features based on object-aware regions.
Second, the output-level constraint is developed to enforce the overlap between 2D and projected 3D box estimations.
Third, the training-level constraint is utilized by producing accurate and consistent 3D pseudo-labels that align with the visual data.
arXiv Detail & Related papers (2023-12-12T18:57:25Z)
- Weakly Supervised 3D Object Detection with Multi-Stage Generalization [62.96670547848691]
We introduce BA$^2$-Det, encompassing pseudo label generation and multi-stage generalization.
We develop three stages of generalization: progressing from complete to partial, static to dynamic, and close to distant.
BA$^2$-Det can achieve a 20% relative improvement on the KITTI dataset.
arXiv Detail & Related papers (2023-06-08T17:58:57Z)
- Once Detected, Never Lost: Surpassing Human Performance in Offline LiDAR based 3D Object Detection [50.959453059206446]
This paper aims for high-performance offline LiDAR-based 3D object detection.
We first observe that experienced human annotators annotate objects from a track-centric perspective.
We propose a high-performance offline detector in a track-centric perspective instead of the conventional object-centric perspective.
arXiv Detail & Related papers (2023-04-24T17:59:05Z)
- Weakly Supervised Monocular 3D Object Detection using Multi-View Projection and Direction Consistency [78.76508318592552]
Monocular 3D object detection has become a mainstream approach in automatic driving for its easy application.
Most current methods still rely on 3D point cloud data for labeling the ground truths used in the training phase.
We propose a new weakly supervised monocular 3D object detection method, which can train the model with only 2D labels marked on images.
arXiv Detail & Related papers (2023-03-15T15:14:00Z)
- Fiducial Tag Localization on a 3D LiDAR Prior Map [0.6554326244334868]
The existing LiDAR fiducial tag localization methods do not apply to 3D LiDAR maps.
We develop a novel approach to directly localize fiducial tags on a 3D LiDAR prior map.
We conduct both qualitative and quantitative experiments to demonstrate that our approach is the first method applicable to localize tags on a 3D LiDAR map.
arXiv Detail & Related papers (2022-09-02T14:07:25Z)
- Label-Guided Auxiliary Training Improves 3D Object Detector [32.96310946612949]
We propose a Label-Guided auxiliary training method for 3D object detection (LG3D).
Our proposed LG3D improves VoteNet by 2.5% and 3.1% mAP on the SUN RGB-D and ScanNetV2 datasets, respectively.
arXiv Detail & Related papers (2022-07-24T14:22:21Z)
- Self-Supervised Person Detection in 2D Range Data using a Calibrated Camera [83.31666463259849]
We propose a method to automatically generate training labels (called pseudo-labels) for 2D LiDAR-based person detectors.
We show that self-supervised detectors, trained or fine-tuned with pseudo-labels, outperform detectors trained using manual annotations.
Our method is an effective way to improve person detectors during deployment without any additional labeling effort.
arXiv Detail & Related papers (2020-12-16T12:10:04Z)
- Unsupervised Object Detection with LiDAR Clues [70.73881791310495]
We present the first practical method for unsupervised object detection with the aid of LiDAR clues.
In our approach, candidate object segments are first generated from 3D point clouds.
Then, an iterative segment labeling process is conducted to assign segment labels and to train a segment labeling network.
The labeling process is carefully designed to mitigate the issue of long-tailed and open-ended distributions.
arXiv Detail & Related papers (2020-11-25T18:59:54Z)