Multi-Frame to Single-Frame: Knowledge Distillation for 3D Object
Detection
- URL: http://arxiv.org/abs/2009.11859v1
- Date: Thu, 24 Sep 2020 17:59:12 GMT
- Title: Multi-Frame to Single-Frame: Knowledge Distillation for 3D Object
Detection
- Authors: Yue Wang and Alireza Fathi and Jiajun Wu and Thomas Funkhouser and
Justin Solomon
- Abstract summary: We use knowledge distillation to bridge the gap between a model trained on high-quality inputs at training time and another tested on low-quality inputs at inference time.
First, we train an object detection model on dense point clouds, which are generated from multiple frames using extra information only available at training time.
Then, we train the model's identical counterpart on sparse single-frame point clouds with consistency regularization on features from both models.
- Score: 36.238956089801825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A common dilemma in 3D object detection for autonomous driving is that
high-quality, dense point clouds are only available during training, but not
testing. We use knowledge distillation to bridge the gap between a model
trained on high-quality inputs at training time and another tested on
low-quality inputs at inference time. In particular, we design a two-stage
training pipeline for point cloud object detection. First, we train an object
detection model on dense point clouds, which are generated from multiple frames
using extra information only available at training time. Then, we train the
model's identical counterpart on sparse single-frame point clouds with
consistency regularization on features from both models. We show that this
procedure improves performance on low-quality data during testing, without
additional overhead.
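The second training stage described above can be sketched as a combined objective: the student's usual detection loss plus a consistency term that pulls its features toward those of the frozen dense-input model. The sketch below is only an illustration under stated assumptions. The abstract does not specify the form of the consistency term, so the mean squared difference used here, the `weight` parameter, and the function names `feature_consistency` and `student_loss` are all hypothetical.

```python
from typing import Sequence


def feature_consistency(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Mean squared difference between two equally sized feature vectors.

    Assumption: an L2-style penalty; the abstract only says
    "consistency regularization on features from both models".
    """
    if len(student) != len(teacher):
        raise ValueError("feature vectors must have the same length")
    if len(student) == 0:
        raise ValueError("feature vectors must be non-empty")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def student_loss(
    detection_loss: float,
    student_feats: Sequence[float],
    teacher_feats: Sequence[float],
    weight: float = 1.0,  # hypothetical trade-off coefficient
) -> float:
    """Detection loss on sparse single-frame input plus weighted consistency.

    teacher_feats come from the model trained on dense multi-frame clouds
    and are treated as fixed targets here.
    """
    return detection_loss + weight * feature_consistency(student_feats, teacher_feats)


if __name__ == "__main__":
    # Toy numbers only; real features would be tensors from a detector backbone.
    print(student_loss(0.5, [1.0, 2.0], [1.0, 4.0], weight=0.5))
```

Because the teacher is needed only to produce target features during training, the student runs alone at inference, which is consistent with the abstract's claim of no additional overhead.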
Related papers
- Cross-Modal Self-Supervised Learning with Effective Contrastive Units for LiDAR Point Clouds [34.99995524090838]
3D perception in LiDAR point clouds is crucial for a self-driving vehicle to act properly in a 3D environment.
There has been a growing interest in self-supervised pre-training of 3D perception models.
We propose the instance-aware and similarity-balanced contrastive units that are tailored for self-driving point clouds.
arXiv Detail & Related papers (2024-09-10T19:11:45Z)
- Adapt PointFormer: 3D Point Cloud Analysis via Adapting 2D Visual Transformers [38.08724410736292]
This paper attempts to leverage pre-trained models with 2D prior knowledge to accomplish the tasks for 3D point cloud analysis.
We propose the Adaptive PointFormer (APF), which fine-tunes pre-trained 2D models with only a modest number of parameters to directly process point clouds.
arXiv Detail & Related papers (2024-07-18T06:32:45Z)
- PatchContrast: Self-Supervised Pre-training for 3D Object Detection [14.603858163158625]
We introduce PatchContrast, a novel self-supervised point cloud pre-training framework for 3D object detection.
We show that our method outperforms existing state-of-the-art models on three commonly-used 3D detection datasets.
arXiv Detail & Related papers (2023-08-14T07:45:54Z)
- Weakly Supervised Monocular 3D Object Detection using Multi-View Projection and Direction Consistency [78.76508318592552]
Monocular 3D object detection has become a mainstream approach in autonomous driving because it is easy to apply.
Most current methods still rely on 3D point cloud data for labeling the ground truths used in the training phase.
We propose a new weakly supervised monocular 3D object detection method, which can train the model with only 2D labels marked on images.
arXiv Detail & Related papers (2023-03-15T15:14:00Z)
- Generalized Few-Shot 3D Object Detection of LiDAR Point Cloud for Autonomous Driving [91.39625612027386]
We propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few samples for rare (novel) classes.
Specifically, we analyze in-depth differences between images and point clouds, and then present a practical principle for the few-shot setting in the 3D LiDAR dataset.
To solve this task, we propose an incremental fine-tuning method to extend existing 3D detection models to recognize both common and rare objects.
arXiv Detail & Related papers (2023-02-08T07:11:36Z)
- 3D Point Cloud Pre-training with Knowledge Distillation from 2D Images [128.40422211090078]
We propose a knowledge distillation method for 3D point cloud pre-trained models to acquire knowledge directly from the 2D representation learning model.
Specifically, we introduce a cross-attention mechanism to extract concept features from 3D point cloud and compare them with the semantic information from 2D images.
In this scheme, the point cloud pre-trained models learn directly from rich information contained in 2D teacher models.
arXiv Detail & Related papers (2022-12-17T23:21:04Z)
- MATE: Masked Autoencoders are Online 3D Test-Time Learners [63.3907730920114]
MATE is the first Test-Time-Training (TTT) method designed for 3D data.
It makes deep networks trained for point cloud classification robust to distribution shifts occurring in test data.
arXiv Detail & Related papers (2022-11-21T13:19:08Z)
- P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting [94.11915008006483]
We propose a novel Point-to-Pixel prompting for point cloud analysis.
Our method attains 89.3% accuracy on the hardest setting of ScanObjectNN.
Our framework also exhibits very competitive performance on ModelNet classification and ShapeNet part segmentation.
arXiv Detail & Related papers (2022-08-04T17:59:03Z)
- Boosting Single-Frame 3D Object Detection by Simulating Multi-Frame Point Clouds [47.488158093929904]
We present a new approach to train a detector to simulate features and responses following a detector trained on multi-frame point clouds.
Our approach needs multi-frame point clouds only when training the single-frame detector, and once trained, it can detect objects with only single-frame point clouds as inputs during the inference.
arXiv Detail & Related papers (2022-07-03T12:59:50Z)
- Self-Supervised Pretraining of 3D Features on any Point-Cloud [40.26575888582241]
We present a simple self-supervised pretraining method that can work with any 3D data without 3D registration.
We evaluate our models on 9 benchmarks for object detection, semantic segmentation, and object classification, where they achieve state-of-the-art results and can outperform supervised pretraining.
arXiv Detail & Related papers (2021-01-07T18:55:21Z)
- D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features [51.04841465193678]
We leverage a 3D fully convolutional network for 3D point clouds.
We propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point.
Our method achieves state-of-the-art results in both indoor and outdoor scenarios.
arXiv Detail & Related papers (2020-03-06T12:51:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.