Related papers: Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection

Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection

URL: http://arxiv.org/abs/2408.03790v1
Date: Wed, 7 Aug 2024 14:14:53 GMT
Title: Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection
Authors: Christian Fruhwirth-Reisinger, Wei Lin, Dušan Malić, Horst Bischof, Horst Possegger,
Abstract summary: We propose an unsupervised 3D detection approach that operates exclusively on LiDAR point clouds. We exploit the inherent CLI-temporal knowledge of LiDAR point clouds for clustering, tracking, as well as boxtext and label refinement. Our approach outperforms state-of-the-art unsupervised 3D object detectors on the Open dataset.
Score: 16.09503890891102
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Accurate 3D object detection in LiDAR point clouds is crucial for autonomous driving systems. To achieve state-of-the-art performance, the supervised training of detectors requires large amounts of human-annotated data, which is expensive to obtain and restricted to predefined object categories. To mitigate manual labeling efforts, recent unsupervised object detection approaches generate class-agnostic pseudo-labels for moving objects, subsequently serving as supervision signal to bootstrap a detector. Despite promising results, these approaches do not provide class labels or generalize well to static objects. Furthermore, they are mostly restricted to data containing multiple drives from the same scene or images from a precisely calibrated and synchronized camera setup. To overcome these limitations, we propose a vision-language-guided unsupervised 3D detection approach that operates exclusively on LiDAR point clouds. We transfer CLIP knowledge to classify point clusters of static and moving objects, which we discover by exploiting the inherent spatio-temporal information of LiDAR point clouds for clustering, tracking, as well as box and label refinement. Our approach outperforms state-of-the-art unsupervised 3D object detectors on the Waymo Open Dataset ($+23~\text{AP}_{3D}$) and Argoverse 2 ($+7.9~\text{AP}_{3D}$) and provides class labels not solely based on object size assumptions, marking a significant advancement in the field.

Related papers

STONE: A Submodular Optimization Framework for Active 3D Object Detection [20.54906045954377]
Key requirement for training an accurate 3D object detector is the availability of a large amount of LiDAR-based point cloud data. This paper proposes a unified active 3D object detection framework, for greatly reducing the labeling cost of training 3D object detectors.
arXiv Detail & Related papers (2024-10-04T20:45:33Z)
SeSame: Simple, Easy 3D Object Detection with Point-Wise Semantics [0.7373617024876725]
In autonomous driving, 3D object detection provides more precise information for downstream tasks, including path planning and motion estimation. We propose SeSame: a method aimed at enhancing semantic information in existing LiDAR-only based 3D object detection. Experiments demonstrate the effectiveness of our method with performance improvements on the KITTI object detection benchmark.
arXiv Detail & Related papers (2024-03-11T08:17:56Z)
SeMoLi: What Moves Together Belongs Together [51.72754014130369]
We tackle semi-supervised object detection based on motion cues. Recent results suggest that motion-based clustering methods can be used to pseudo-label instances of moving objects. We re-think this approach and suggest that both, object detection, as well as motion-inspired pseudo-labeling, can be tackled in a data-driven manner.
arXiv Detail & Related papers (2024-02-29T18:54:53Z)
PatchContrast: Self-Supervised Pre-training for 3D Object Detection [14.603858163158625]
We introduce PatchContrast, a novel self-supervised point cloud pre-training framework for 3D object detection. We show that our method outperforms existing state-of-the-art models on three commonly-used 3D detection datasets.
arXiv Detail & Related papers (2023-08-14T07:45:54Z)
Once Detected, Never Lost: Surpassing Human Performance in Offline LiDAR based 3D Object Detection [50.959453059206446]
This paper aims for high-performance offline LiDAR-based 3D object detection. We first observe that experienced human annotators annotate objects from a track-centric perspective. We propose a high-performance offline detector in a track-centric perspective instead of the conventional object-centric perspective.
arXiv Detail & Related papers (2023-04-24T17:59:05Z)
A Lightweight and Detector-free 3D Single Object Tracker on Point Clouds [50.54083964183614]
It is non-trivial to perform accurate target-specific detection since the point cloud of objects in raw LiDAR scans is usually sparse and incomplete. We propose DMT, a Detector-free Motion prediction based 3D Tracking network that totally removes the usage of complicated 3D detectors.
arXiv Detail & Related papers (2022-03-08T17:49:07Z)
ST3D++: Denoised Self-training for Unsupervised Domain Adaptation on 3D Object Detection [78.71826145162092]
We present a self-training method, named ST3D++, with a holistic pseudo label denoising pipeline for unsupervised domain adaptation on 3D object detection. We equip the pseudo label generation process with a hybrid quality-aware triplet memory to improve the quality and stability of generated pseudo labels. In the model training stage, we propose a source data assisted training strategy and a curriculum data augmentation policy.
arXiv Detail & Related papers (2021-08-15T07:49:06Z)
3D Spatial Recognition without Spatially Labeled 3D [127.6254240158249]
We introduce WyPR, a Weakly-supervised framework for Point cloud Recognition. We show that WyPR can detect and segment objects in point cloud data without access to any spatial labels at training time.
arXiv Detail & Related papers (2021-05-13T17:58:07Z)
Unsupervised Object Detection with LiDAR Clues [70.73881791310495]
We present the first practical method for unsupervised object detection with the aid of LiDAR clues. In our approach, candidate object segments based on 3D point clouds are firstly generated. Then, an iterative segment labeling process is conducted to assign segment labels and to train a segment labeling network. The labeling process is carefully designed so as to mitigate the issue of long-tailed and open-ended distribution.
arXiv Detail & Related papers (2020-11-25T18:59:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.