LabelFormer: Object Trajectory Refinement for Offboard Perception from
LiDAR Point Clouds
- URL: http://arxiv.org/abs/2311.01444v1
- Date: Thu, 2 Nov 2023 17:56:06 GMT
- Title: LabelFormer: Object Trajectory Refinement for Offboard Perception from
LiDAR Point Clouds
- Authors: Anqi Joyce Yang, Sergio Casas, Nikita Dvornik, Sean Segal, Yuwen
Xiong, Jordan Sir Kwang Hu, Carter Fang, Raquel Urtasun
- Abstract summary: "Auto-labelling" offboard perception models are trained to automatically generate annotations from raw LiDAR point clouds.
We propose LabelFormer, a simple, efficient, and effective trajectory-level refinement approach.
Our approach first encodes each frame's observations separately, then exploits self-attention to reason about the trajectory with full temporal context.
- Score: 37.87496475959941
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A major bottleneck to scaling up the training of self-driving perception systems
is the human annotation required for supervision. A promising alternative is
to leverage "auto-labelling" offboard perception models that are trained to
automatically generate annotations from raw LiDAR point clouds at a fraction of
the cost. Auto-labels are most commonly generated via a two-stage approach --
first objects are detected and tracked over time, and then each object
trajectory is passed to a learned refinement model to improve accuracy. Since
existing refinement models are overly complex and lack advanced temporal
reasoning capabilities, in this work we propose LabelFormer, a simple,
efficient, and effective trajectory-level refinement approach. Our approach
first encodes each frame's observations separately, then exploits
self-attention to reason about the trajectory with full temporal context, and
finally decodes the refined object size and per-frame poses. Evaluation on both
urban and highway datasets demonstrates that LabelFormer outperforms existing
works by a large margin. Finally, we show that training on a dataset augmented
with auto-labels generated by our method leads to improved downstream detection
performance compared to existing methods. Please visit the project website for
details https://waabi.ai/labelformer
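To make the encode-attend-decode design described in the abstract more concrete, below is a minimal sketch of a trajectory-level refiner in that spirit, assuming PyTorch. The per-frame feature pooling, layer sizes, and output parameterization (one shared size residual per object plus a per-frame pose residual) are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a trajectory-level refinement model that encodes each
# frame separately, applies self-attention over the whole trajectory, and decodes
# a refined object size plus per-frame poses. All names and dimensions are assumptions.
import torch
import torch.nn as nn


class TrajectoryRefiner(nn.Module):
    def __init__(self, point_feat_dim=4, d_model=256, num_layers=4, num_heads=8):
        super().__init__()
        # Per-frame encoder: embeds each frame's pooled point feature together
        # with its initial box parameters into one token.
        self.frame_encoder = nn.Sequential(
            nn.Linear(point_feat_dim + 7, d_model),  # 7 = x, y, z, l, w, h, yaw
            nn.ReLU(),
            nn.Linear(d_model, d_model),
        )
        # Self-attention over all frame tokens gives each frame full temporal context.
        layer = nn.TransformerEncoderLayer(d_model, num_heads, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers)
        # Decoders: one shared size residual per object, one pose residual per frame.
        self.size_head = nn.Linear(d_model, 3)   # dl, dw, dh
        self.pose_head = nn.Linear(d_model, 4)   # dx, dy, dz, dyaw

    def forward(self, frame_feats, init_boxes):
        # frame_feats: (B, T, point_feat_dim) pooled per-frame point features
        # init_boxes:  (B, T, 7) initial detector/tracker boxes per frame
        tokens = self.frame_encoder(torch.cat([frame_feats, init_boxes], dim=-1))
        ctx = self.temporal(tokens)                      # (B, T, d_model)
        size_residual = self.size_head(ctx.mean(dim=1))  # (B, 3), one size per object
        pose_residual = self.pose_head(ctx)              # (B, T, 4), one pose per frame
        return size_residual, pose_residual


if __name__ == "__main__":
    model = TrajectoryRefiner()
    feats = torch.randn(2, 30, 4)   # 2 trajectories, 30 frames each
    boxes = torch.randn(2, 30, 7)
    size_res, pose_res = model(feats, boxes)
    print(size_res.shape, pose_res.shape)  # torch.Size([2, 3]) torch.Size([2, 30, 4])
```

In a full pipeline, the per-frame inputs would come from an encoder over each frame's cropped object points, and the predicted residuals would be applied to the initial tracker boxes before the refined trajectory is emitted as an auto-label; those details are omitted in this sketch.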
Related papers
- TrajSSL: Trajectory-Enhanced Semi-Supervised 3D Object Detection [59.498894868956306]
Pseudo-labeling approaches to semi-supervised learning adopt a teacher-student framework.
We leverage pre-trained motion-forecasting models to generate object trajectories on pseudo-labeled data.
Our approach improves pseudo-label quality in two distinct ways.
arXiv Detail & Related papers (2024-09-17T05:35:00Z)
- Boosting Gesture Recognition with an Automatic Gesture Annotation Framework [10.158684480548242]
We propose a framework that can automatically annotate gesture classes and identify their temporal ranges.
Our framework consists of two key components: (1) a novel annotation model that leverages the Connectionist Temporal Classification (CTC) loss, and (2) a semi-supervised learning pipeline.
These high-quality pseudo labels can also be used to enhance the accuracy of other downstream gesture recognition models.
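For readers unfamiliar with the CTC loss named above, the snippet below shows its generic PyTorch form; it only illustrates the loss itself, with made-up shapes and class counts, and is not the annotation model from that paper.

```python
# Generic illustration of the Connectionist Temporal Classification (CTC) loss.
# CTC marginalizes over all frame-to-label alignments, so a sequence model can be
# trained on unsegmented labels without per-frame annotations.
import torch
import torch.nn as nn

T, N, C, S = 50, 4, 11, 8  # time steps, batch size, classes (incl. blank 0), max target length

# Per-frame class scores from some sequence model, normalized to log-probabilities.
log_probs = torch.randn(T, N, C).log_softmax(dim=2)

# Unaligned target label sequences (labels 1..C-1; index 0 is reserved for blank).
targets = torch.randint(1, C, (N, S), dtype=torch.long)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.randint(1, S + 1, (N,), dtype=torch.long)

criterion = nn.CTCLoss(blank=0)
loss = criterion(log_probs, targets, input_lengths, target_lengths)
print(loss.item())
```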
arXiv Detail & Related papers (2024-01-20T07:11:03Z)
- Once Detected, Never Lost: Surpassing Human Performance in Offline LiDAR based 3D Object Detection [50.959453059206446]
This paper aims for high-performance offline LiDAR-based 3D object detection.
We first observe that experienced human annotators annotate objects from a track-centric perspective.
We propose a high-performance offline detector designed from a track-centric perspective instead of the conventional object-centric perspective.
arXiv Detail & Related papers (2023-04-24T17:59:05Z)
- LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds [62.49198183539889]
We propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds.
Our method co-designs an efficient labeling process with semi/weakly supervised learning.
Our proposed method is highly competitive even with its fully supervised counterpart trained on 100% of the labels.
arXiv Detail & Related papers (2022-10-14T19:13:36Z)
- MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition [160.49403075559158]
We propose a Masked Pseudo-Labeling autoEncoder (MAPLE) framework for point cloud action recognition.
In particular, we design a novel and efficient Decoupled spatial-temporal TransFormer (DestFormer) as the backbone of MAPLE.
MAPLE achieves superior results on three public benchmarks and outperforms the state-of-the-art method by 8.08% accuracy on the MSR-Action3D dataset.
arXiv Detail & Related papers (2022-09-01T12:32:40Z)
- Weakly-Supervised Salient Object Detection Using Point Supervision [17.88596733603456]
Current state-of-the-art saliency detection models rely heavily on large datasets of accurate pixel-wise annotations.
We propose a novel weakly-supervised salient object detection method using point supervision.
Our method outperforms previous state-of-the-art methods trained with stronger supervision.
arXiv Detail & Related papers (2022-03-22T12:16:05Z)
- Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets [90.61266099147053]
We investigate efficient annotation strategies for collecting multi-class classification labels for a large collection of images.
We propose modifications and best practices aimed at minimizing human labeling effort.
Simulated experiments on a 125k-image subset of ImageNet100 show that it can be annotated to 80% top-1 accuracy with 0.35 annotations per image on average.
arXiv Detail & Related papers (2021-04-26T16:29:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.