ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object
Detection
- URL: http://arxiv.org/abs/2207.12654v1
- Date: Tue, 26 Jul 2022 04:45:49 GMT
- Title: ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object
Detection
- Authors: Junbo Yin, Dingfu Zhou, Liangjun Zhang, Jin Fang, Cheng-Zhong Xu,
Jianbing Shen, and Wenguan Wang
- Abstract summary: ProposalContrast is an unsupervised point cloud pre-training framework.
It learns robust 3D representations by contrasting region proposals.
ProposalContrast is verified on various 3D detectors.
- Score: 114.54835359657707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing approaches for unsupervised point cloud pre-training are constrained
to either scene-level or point/voxel-level instance discrimination. Scene-level
methods tend to lose the local details that are crucial for recognizing road
objects, while point/voxel-level methods inherently suffer from a limited
receptive field that is incapable of perceiving large objects or contextual
environments. Since region-level representations are more suitable for 3D
object detection, we devise a new unsupervised point cloud pre-training
framework, called ProposalContrast, that learns robust 3D representations by
contrasting region proposals. Specifically, with an exhaustive set of region
proposals sampled from each point cloud, geometric point relations within each
proposal are modeled for creating expressive proposal representations. To
better accommodate 3D detection properties, ProposalContrast optimizes with
both inter-cluster and inter-proposal separation, i.e., sharpening the
discriminativeness of proposal representations across semantic classes and
object instances. The generalizability and transferability of ProposalContrast
are verified on various 3D detectors (i.e., PV-RCNN, CenterPoint, PointPillars
and PointRCNN) and datasets (i.e., KITTI, Waymo and ONCE).
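The proposal-level contrastive objective described in the abstract can be sketched, in spirit, as an InfoNCE loss over proposal embeddings from two augmented views of the same point cloud: matching proposals across views are positives, all other proposals are negatives. This is a minimal illustration under assumed names (`info_nce`, the toy embeddings), not the authors' actual implementation:

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Proposal-level InfoNCE: the i-th anchor's positive is the i-th
    proposal from the augmented view; all other proposals serve as
    negatives. Returns the mean cross-entropy over anchors."""
    # L2-normalize embeddings so dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Matching pairs lie on the diagonal.
    return -np.mean(np.diag(log_prob))

# Toy usage: 4 proposal embeddings (8-dim) from two augmented views.
rng = np.random.default_rng(0)
view1 = rng.normal(size=(4, 8))
view2 = view1 + 0.05 * rng.normal(size=(4, 8))  # slight perturbation
loss = info_nce(view1, view2)
```

Because matched proposals are nearly identical while unmatched ones are random, the loss is close to zero here; the paper additionally adds an inter-cluster term across semantic classes, which this sketch omits.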
Related papers
- Weakly Supervised Open-Vocabulary Object Detection [31.605276665964787]
We propose a novel weakly supervised open-vocabulary object detection framework, namely WSOVOD, to extend traditional WSOD.
To achieve this, we explore three vital strategies, including dataset-level feature adaptation, image-level salient object localization, and region-level vision-language alignment.
arXiv Detail & Related papers (2023-12-19T18:59:53Z)
- PatchContrast: Self-Supervised Pre-training for 3D Object Detection [14.603858163158625]
We introduce PatchContrast, a novel self-supervised point cloud pre-training framework for 3D object detection.
We show that our method outperforms existing state-of-the-art models on three commonly-used 3D detection datasets.
arXiv Detail & Related papers (2023-08-14T07:45:54Z)
- Open-Vocabulary Point-Cloud Object Detection without 3D Annotation [62.18197846270103]
The goal of open-vocabulary 3D point-cloud detection is to identify novel objects based on arbitrary textual descriptions.
We develop a point-cloud detector that can learn a general representation for localizing various objects.
We also propose a novel de-biased triplet cross-modal contrastive learning to connect the modalities of image, point-cloud and text.
arXiv Detail & Related papers (2023-04-03T08:22:02Z)
- Exploring Active 3D Object Detection from a Generalization Perspective [58.597942380989245]
Uncertainty-based active learning policies fail to balance the trade-off between point cloud informativeness and box-level annotation costs.
We propose CRB, which hierarchically filters out the point clouds of redundant 3D bounding box labels.
Experiments show that the proposed approach outperforms existing active learning strategies.
arXiv Detail & Related papers (2023-01-23T02:43:03Z)
- 3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection [35.5386998382886]
3D visual grounding aims to locate the referred target object in 3D point cloud scenes according to a free-form language description.
Previous methods mostly follow a two-stage paradigm, i.e., language-irrelevant detection and cross-modal matching.
We propose a 3D Single-Stage Referred Point Progressive Selection method, which progressively selects keypoints with the guidance of language and directly locates the target.
arXiv Detail & Related papers (2022-04-13T09:46:27Z)
- ImpDet: Exploring Implicit Fields for 3D Object Detection [74.63774221984725]
We introduce a new perspective that views bounding box regression as an implicit function.
This leads to our proposed framework, termed Implicit Detection or ImpDet.
Our ImpDet assigns specific values to points in different local 3D spaces, so that high-quality boundaries can be generated.
arXiv Detail & Related papers (2022-03-31T17:52:12Z)
- SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection [78.90102636266276]
We propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA).
Based on the estimated point-wise foreground scores, we then propose a semantics-guided point sampling algorithm to help retain more important foreground points during down-sampling.
In practice, SASA shows to be effective in identifying valuable points related to foreground objects and improving feature learning for point-based 3D detection.
arXiv Detail & Related papers (2022-01-06T08:54:47Z)
- Oriented RepPoints for Aerial Object Detection [10.818838437018682]
In this paper, we propose a novel approach to aerial object detection, named Oriented RepPoints.
Specifically, we employ a set of adaptive points to capture the geometric and spatial information of arbitrarily oriented objects.
To facilitate the supervised learning, the oriented conversion function is proposed to explicitly map the adaptive point set into an oriented bounding box.
arXiv Detail & Related papers (2021-05-24T06:18:23Z)
- 3D Spatial Recognition without Spatially Labeled 3D [127.6254240158249]
We introduce WyPR, a Weakly-supervised framework for Point cloud Recognition.
We show that WyPR can detect and segment objects in point cloud data without access to any spatial labels at training time.
arXiv Detail & Related papers (2021-05-13T17:58:07Z)
- 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation [26.169985423367393]
3D-MPA is a method for instance segmentation on 3D point clouds.
We learn proposal features from grouped point features that voted for the same object center.
A graph convolutional network introduces inter-proposal relations.
arXiv Detail & Related papers (2020-03-30T23:28:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.