SL3D: Self-supervised-Self-labeled 3D Recognition
- URL: http://arxiv.org/abs/2210.16810v2
- Date: Thu, 3 Nov 2022 07:41:51 GMT
- Title: SL3D: Self-supervised-Self-labeled 3D Recognition
- Authors: Fernando Julio Cendra, Lan Ma, Jiajun Shen, Xiaojuan Qi
- Abstract summary: We propose a Self-supervised-Self-Labeled 3D Recognition (SL3D) framework.
SL3D simultaneously solves two coupled objectives, i.e., clustering and feature representation learning.
It can be applied to solve different 3D recognition tasks, including classification, object detection, and semantic segmentation.
- Score: 89.19932178712065
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: There have been many promising results in 3D recognition, including
classification, object detection, and semantic segmentation. However, many of
these results rely on manually collecting densely annotated real-world 3D data,
which is highly time-consuming and expensive to obtain, limiting the
scalability of 3D recognition tasks. Thus, in this paper, we study unsupervised
3D recognition and propose a Self-supervised-Self-Labeled 3D Recognition (SL3D)
framework. SL3D simultaneously solves two coupled objectives, i.e., clustering
and feature representation learning, to generate pseudo-labeled data for
unsupervised 3D recognition. SL3D is a generic framework and can be applied to
solve different 3D recognition tasks, including classification, object
detection, and semantic segmentation. Extensive experiments demonstrate its
effectiveness. Code is available at https://github.com/fcendra/sl3d.
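A minimal sketch of the coupled objective described above, alternating a k-means clustering step with supervised training on the resulting pseudo-labels (the encoder, hyperparameters, and single-pass feature extraction are illustrative simplifications, not the authors' released code):

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def sl3d_style_epoch(encoder, classifier, points, k=20, device="cpu"):
    """One self-labeling round. points: (N, P, 3) batch of N point clouds."""
    # Objective 1: clustering. Run k-means on frozen features to get pseudo-labels.
    encoder.eval()
    with torch.no_grad():
        feats = encoder(points.to(device)).cpu().numpy()          # (N, D) features
    pseudo = torch.as_tensor(KMeans(n_clusters=k, n_init=10).fit_predict(feats))

    # Objective 2: representation learning. Fit the features to the pseudo-labels.
    encoder.train()
    opt = torch.optim.Adam(
        list(encoder.parameters()) + list(classifier.parameters()), lr=1e-3)
    loader = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(points, pseudo), batch_size=32, shuffle=True)
    for x, y in loader:
        opt.zero_grad()
        loss = nn.functional.cross_entropy(
            classifier(encoder(x.to(device))), y.to(device))
        loss.backward()
        opt.step()
    return pseudo  # pseudo-labeled data for downstream unsupervised recognition
```

Repeating this round alternates the two objectives; the resulting pseudo-labels can then supervise a downstream classification, detection, or segmentation head.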
Related papers
- Learning 3D Representations from Procedural 3D Programs [6.915871213703219]
Self-supervised learning has emerged as a promising approach for acquiring transferable 3D representations from unlabeled 3D point clouds.
We propose learning 3D representations from procedural 3D programs that automatically generate 3D shapes using simple primitives and augmentations (a toy generator is sketched below).
arXiv Detail & Related papers (2024-11-25T18:59:57Z)
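To make the procedural-program idea concrete, here is a toy generator in the same spirit: shapes are sampled from simple primitives and randomly augmented, with no real-world data or labels involved (the primitive set and augmentation ranges are assumptions, not the paper's generator):

```python
import numpy as np

def sample_procedural_shape(n=1024, rng=None):
    """Return an (n, 3) point cloud sampled from a random primitive."""
    rng = rng or np.random.default_rng()
    kind = rng.choice(["sphere", "cube", "cylinder"])
    if kind == "sphere":
        v = rng.normal(size=(n, 3))
        pts = v / np.linalg.norm(v, axis=1, keepdims=True)  # unit sphere surface
    elif kind == "cube":
        pts = rng.uniform(-1.0, 1.0, size=(n, 3))           # solid cube
    else:
        theta = rng.uniform(0.0, 2.0 * np.pi, n)            # cylinder surface
        pts = np.stack([np.cos(theta), np.sin(theta),
                        rng.uniform(-1.0, 1.0, n)], axis=1)
    # Augmentations: anisotropic scaling followed by a random rotation.
    pts = pts * rng.uniform(0.5, 1.5, size=3)
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))            # random orthogonal matrix
    return pts @ q.T
```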
- Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance [72.6809373191638]
We propose a framework that leverages constraints between the 2D and 3D domains without requiring any 3D labels.
First, we design a feature-level constraint to align LiDAR and image features based on object-aware regions.
Second, an output-level constraint enforces overlap between the 2D box estimates and the projected 3D boxes (a toy version of this check is sketched below).
Third, a training-level constraint produces accurate and consistent 3D pseudo-labels that align with the visual data.
arXiv Detail & Related papers (2023-12-12T18:57:25Z)
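A toy version of the output-level constraint mentioned above: project the eight corners of a predicted 3D box into the image with a pinhole camera and penalize low overlap with the matched 2D box (box conventions, the camera model, and the loss form are assumptions for illustration, not the paper's code):

```python
import numpy as np

def box_iou_2d(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def output_level_penalty(corners_3d, box_2d, K):
    """corners_3d: (8, 3) box corners in the camera frame (z > 0);
    box_2d: (x1, y1, x2, y2); K: (3, 3) camera intrinsics."""
    uvw = corners_3d @ K.T                      # pinhole projection
    uv = uvw[:, :2] / uvw[:, 2:3]
    proj = (uv[:, 0].min(), uv[:, 1].min(), uv[:, 0].max(), uv[:, 1].max())
    return 1.0 - box_iou_2d(proj, box_2d)       # 0 when the boxes coincide
```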
- SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection [19.75965521357068]
We propose a novel approach called SOGDet (Semantic-Occupancy Guided Multi-view 3D Object Detection) to improve the accuracy of 3D object detection.
Our results show that SOGDet consistently enhances the performance of three baseline methods in terms of nuScenes Detection Score (NDS) and mean Average Precision (mAP); the NDS formula is recalled below.
This indicates that combining 3D object detection with 3D semantic occupancy leads to a more comprehensive perception of the 3D environment, thereby helping to build more robust autonomous driving systems.
arXiv Detail & Related papers (2023-08-26T07:38:21Z)
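For reference, the nuScenes Detection Score cited in the SOGDet summary is a fixed combination of mAP and five true-positive error metrics, per the public benchmark definition:

```python
def nds(mAP, mATE, mASE, mAOE, mAVE, mAAE):
    """nuScenes Detection Score: half weight on mAP, half on the five
    true-positive errors (translation, scale, orientation, velocity,
    attribute), each clipped to 1 before being inverted."""
    tp_errors = (mATE, mASE, mAOE, mAVE, mAAE)
    return (5.0 * mAP + sum(1.0 - min(1.0, e) for e in tp_errors)) / 10.0
```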
- CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection [57.44434974289945]
We propose the Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework.
Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene.
In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
arXiv Detail & Related papers (2022-09-13T05:26:09Z)
- Gait Recognition in the Wild with Dense 3D Representations and A Benchmark [86.68648536257588]
Existing studies of gait recognition are dominated by 2D representations, such as silhouettes or skeletons of the human body, in constrained scenes.
This paper aims to explore dense 3D representations for gait recognition in the wild.
We build the first large-scale 3D representation-based gait recognition dataset, named Gait3D.
arXiv Detail & Related papers (2022-04-06T03:54:06Z)
- 3D Spatial Recognition without Spatially Labeled 3D [127.6254240158249]
We introduce WyPR, a Weakly-supervised framework for Point cloud Recognition.
We show that WyPR can detect and segment objects in point cloud data without access to any spatial labels at training time (a minimal scene-tag supervision sketch follows this entry).
arXiv Detail & Related papers (2021-05-13T17:58:07Z)
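A minimal sketch of weak supervision in the spirit of WyPR: per-point class scores are pooled into a scene-level prediction and trained only against scene-level tags, so no point is ever spatially labeled (the pooling choice, shapes, and loss are illustrative assumptions, not WyPR's exact objective):

```python
import torch
import torch.nn.functional as F

def scene_tag_loss(point_logits, scene_tags):
    """point_logits: (B, N, C) per-point class scores;
    scene_tags: (B, C) float multi-hot tags of classes present in each scene."""
    scene_logits = point_logits.mean(dim=1)   # pool points to a scene prediction
    return F.binary_cross_entropy_with_logits(scene_logits, scene_tags)
```

At inference, the same per-point scores can be read out directly as a segmentation, which is what makes scene-level tags sufficient.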
- Seeing by haptic glance: reinforcement learning-based 3D object recognition [31.80213713136647]
Humans can perform 3D recognition through a limited number of haptic contacts between the target object and their fingers, without seeing the object.
This capability is termed a 'haptic glance' in cognitive neuroscience.
Most existing 3D recognition models were developed for dense 3D data.
In many real-life use cases, where robots collect 3D data by haptic exploration, only a limited number of 3D points can be collected.
A novel reinforcement-learning-based framework is proposed in which the haptic exploration procedure is optimized jointly with 3D recognition on the actively collected 3D points (a skeleton of such a loop is sketched below).
arXiv Detail & Related papers (2021-02-15T15:38:22Z)
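A skeleton of the exploration-recognition loop described above, in REINFORCE style: the policy picks touch actions, each touch yields one sparse 3D contact point, and the recognition outcome becomes the reward (the `policy`, `touch_sensor`, and `classifier` interfaces are hypothetical placeholders, not the paper's system):

```python
import torch

def haptic_episode(policy, touch_sensor, classifier, true_label, budget=5):
    """policy: (1, 3) state -> (1, A) action logits; touch_sensor: action -> (3,)
    contact point; classifier: (t, 3) points -> (C,) class logits (all assumed)."""
    points, log_probs = [], []
    state = torch.zeros(1, 3)                        # placeholder "no touches yet" state
    for _ in range(budget):
        dist = torch.distributions.Categorical(logits=policy(state))
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        points.append(touch_sensor(action))          # one sparse haptic contact
        state = torch.stack(points).mean(0, keepdim=True)
    reward = (classifier(torch.stack(points)).argmax() == true_label).float()
    return -reward * torch.stack(log_probs).sum()    # REINFORCE loss to minimize
```

The point is the coupling: the same few actively chosen contacts serve both the recognition objective and the exploration policy's training signal.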
- Learning Monocular 3D Vehicle Detection without 3D Bounding Box Labels [0.09558392439655011]
Training 3D object detectors requires datasets with 3D bounding box labels for supervision, which have to be generated by hand-labeling.
We propose a network architecture and training procedure for learning monocular 3D object detection without 3D bounding box labels.
We evaluate the proposed algorithm on the real-world KITTI dataset and achieve promising performance in comparison to state-of-the-art methods requiring 3D bounding box labels for training.
arXiv Detail & Related papers (2020-10-07T16:24:46Z)
- SESS: Self-Ensembling Semi-Supervised 3D Object Detection [138.80825169240302]
We propose SESS, a self-ensembling semi-supervised 3D object detection framework.
Specifically, we design a thorough perturbation scheme to enhance the generalization of the network on unlabeled and unseen data (a mean-teacher-style sketch follows this entry).
Our SESS achieves competitive performance compared to the state-of-the-art fully supervised method while using only 50% of the labeled data.
arXiv Detail & Related papers (2019-12-26T08:48:04Z)
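A hedged sketch of the self-ensembling idea the SESS summary describes, in the common mean-teacher form: the teacher is an exponential moving average of the student, and a consistency loss ties together differently perturbed views of unlabeled scenes (the perturbation and distance choices here are illustrative, not SESS's exact scheme):

```python
import copy
import torch
import torch.nn.functional as F

def make_teacher(student):
    """The teacher starts as a frozen copy of the student."""
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)
    return teacher

@torch.no_grad()
def ema_update(teacher, student, decay=0.999):
    """After each student step, blend student weights into the teacher."""
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(decay).add_(ps, alpha=1.0 - decay)

def consistency_loss(student, teacher, cloud, perturb):
    """Student and teacher see independently perturbed copies of the scene."""
    s_out = student(perturb(cloud))
    with torch.no_grad():
        t_out = teacher(perturb(cloud))
    return F.mse_loss(s_out, t_out)
```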