3D Object Tracking with Transformer
- URL: http://arxiv.org/abs/2110.14921v1
- Date: Thu, 28 Oct 2021 07:03:19 GMT
- Title: 3D Object Tracking with Transformer
- Authors: Yubo Cui, Zheng Fang, Jiayao Shan, Zuoxu Gu, Sifan Zhou
- Abstract summary: Feature fusion can make similarity computation more efficient by including target object information.
Most existing LiDAR-based approaches directly use the extracted point cloud features to compute similarity.
In this paper, we propose a feature fusion network based on the transformer architecture.
- Score: 6.848996369226086
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Feature fusion and similarity computation are two core problems in 3D object
tracking, especially for object tracking using sparse and disordered point
clouds. Feature fusion can make similarity computation more efficient by
including target object information. However, most existing LiDAR-based
approaches directly use the extracted point cloud features to compute
similarity, ignoring how the attention over object regions changes during
tracking. In this paper, we propose a feature fusion network based on the
transformer architecture. Benefiting from the self-attention mechanism, the
transformer encoder captures the inter- and intra-relations among different
regions of the point cloud. Using cross-attention, the transformer decoder
fuses features and injects additional target cues into the current point cloud
feature used to compute the region attention, which makes the similarity
computation more efficient. Based on this
feature fusion network, we propose an end-to-end point cloud object tracking
framework, a simple yet effective method for 3D object tracking using point
clouds. Comprehensive experimental results on the KITTI dataset show that our
method achieves new state-of-the-art performance. Code is available at:
https://github.com/3bobo/lttr.
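The linked repository contains the authors' implementation. Purely as an illustration of the kind of encoder/decoder fusion the abstract describes, the PyTorch sketch below runs a self-attention encoder over point-cloud region features and a cross-attention decoder that injects template (target) cues into the search-area feature. Every module name, shape, and hyper-parameter here is an assumption for illustration, not the LTTR code.

```python
# Minimal sketch of transformer-based feature fusion for point-cloud tracking.
# NOT the LTTR implementation from the linked repository; names, shapes, and
# hyper-parameters are illustrative assumptions only.
import torch
import torch.nn as nn


class FusionSketch(nn.Module):
    def __init__(self, d_model: int = 128, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        # Encoder: self-attention over regions of one point-cloud feature set,
        # intended to capture inter-/intra-region relations.
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                               dim_feedforward=256,
                                               batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_layers)
        # Decoder: cross-attention that fuses template (target) cues into the
        # current search-area feature before similarity computation.
        dec_layer = nn.TransformerDecoderLayer(d_model, n_heads,
                                               dim_feedforward=256,
                                               batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, n_layers)

    def forward(self, search_feat: torch.Tensor,
                template_feat: torch.Tensor) -> torch.Tensor:
        # search_feat:   (B, N_search, d_model) current-frame search-area features
        # template_feat: (B, N_tmpl,   d_model) tracked-target template features
        search_enc = self.encoder(search_feat)
        template_enc = self.encoder(template_feat)
        # Queries come from the search area; keys/values come from the template,
        # so the output is a search-area feature enriched with target cues.
        return self.decoder(tgt=search_enc, memory=template_enc)


if __name__ == "__main__":
    fusion = FusionSketch()
    search = torch.randn(2, 256, 128)    # e.g. 256 sampled regions per search area
    template = torch.randn(2, 64, 128)   # e.g. 64 sampled regions per template
    print(fusion(search, template).shape)  # torch.Size([2, 256, 128])
```

In a full tracker, the fused output would then feed a similarity/localization head; that part is omitted here.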
Related papers
- PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection [66.94819989912823]
We propose a point-trajectory transformer with long short-term memory for efficient temporal 3D object detection.
We use point clouds of current-frame objects and their historical trajectories as input to minimize the memory bank storage requirement.
We conduct extensive experiments on a large-scale dataset to demonstrate that our approach performs well against state-of-the-art methods.
arXiv Detail & Related papers (2023-12-13T18:59:13Z)
- STTracker: Spatio-Temporal Tracker for 3D Single Object Tracking [11.901758708579642]
3D single object tracking with point clouds is a critical task in 3D computer vision.
Previous methods usually take only the last two frames as input, using the template point cloud from the previous frame and the search-area point cloud from the current frame.
arXiv Detail & Related papers (2023-06-30T07:25:11Z)
- Point Cloud Classification Using Content-based Transformer via Clustering in Feature Space [25.57569871876213]
We propose a point content-based Transformer architecture, called PointConT for short.
It exploits the locality of points in the feature space (content-based), clustering sampled points with similar features into the same class and computing self-attention within each class.
We also introduce an Inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately.
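As a toy illustration of the content-based idea summarized above (grouping sampled points by feature similarity and running self-attention within each group), the sketch below uses a naive k-means assignment in feature space. It is an assumption-laden approximation, not the PointConT implementation; the cluster count, dimensions, and helper names are made up.

```python
# Toy sketch of content-based (feature-space) attention: group points by feature
# similarity, then run self-attention within each group. Not the PointConT code;
# the k-means step and all shapes/sizes are illustrative assumptions.
import torch
import torch.nn as nn


def kmeans_assign(feats: torch.Tensor, k: int, iters: int = 5) -> torch.Tensor:
    # feats: (N, C). Returns one cluster id per point, shape (N,).
    centers = feats[torch.randperm(feats.size(0))[:k]]
    for _ in range(iters):
        assign = torch.cdist(feats, centers).argmin(dim=1)  # (N,)
        for c in range(k):
            mask = assign == c
            if mask.any():
                centers[c] = feats[mask].mean(dim=0)
    return assign


class ContentAttention(nn.Module):
    def __init__(self, dim: int = 64, heads: int = 4, clusters: int = 8):
        super().__init__()
        self.clusters = clusters
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (N, C) features of sampled points (single cloud for simplicity).
        out = feats.clone()
        assign = kmeans_assign(feats.detach(), self.clusters)
        for c in range(self.clusters):
            idx = (assign == c).nonzero(as_tuple=True)[0]
            if idx.numel() == 0:
                continue
            group = feats[idx].unsqueeze(0)           # (1, n_c, C)
            attended, _ = self.attn(group, group, group)
            out[idx] = attended.squeeze(0)
        return out


if __name__ == "__main__":
    x = torch.randn(512, 64)
    print(ContentAttention()(x).shape)  # torch.Size([512, 64])
```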
arXiv Detail & Related papers (2023-03-08T14:11:05Z)
- CXTrack: Improving 3D Point Cloud Tracking with Contextual Information [59.55870742072618]
3D single object tracking plays an essential role in many applications, such as autonomous driving.
We propose CXTrack, a novel transformer-based network for 3D object tracking.
We show that CXTrack achieves state-of-the-art tracking performance while running at 29 FPS.
arXiv Detail & Related papers (2022-11-12T11:29:01Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information by employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- Exploiting More Information in Sparse Point Cloud for 3D Single Object Tracking [9.693724357115762]
3D single object tracking is a key task in 3D computer vision.
The sparsity of point clouds makes it difficult to compute the similarity and locate the object.
We propose a sparse-to-dense and transformer-based framework for 3D single object tracking.
arXiv Detail & Related papers (2022-10-02T13:38:30Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- TransPillars: Coarse-to-Fine Aggregation for Multi-Frame 3D Object Detection [47.941714033657675]
3D object detection using point clouds has attracted increasing attention due to its wide applications in autonomous driving and robotics.
We design TransPillars, a novel transformer-based feature aggregation technique that exploits temporal features of consecutive point cloud frames.
Our proposed TransPillars achieves state-of-the-art performance compared to existing multi-frame detection approaches.
arXiv Detail & Related papers (2022-08-04T15:41:43Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
In this work, we employ transformers and incorporate them into a hierarchical framework for shape classification as well as part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
- 3D-SiamRPN: An End-to-End Learning Method for Real-Time 3D Single Object Tracking Using Raw Point Cloud [9.513194898261787]
We propose a 3D tracking method called 3D-SiamRPN Network to track a single target object by using raw 3D point cloud data.
Experimental results on KITTI dataset show that our method has a competitive performance in both Success and Precision.
arXiv Detail & Related papers (2021-08-12T09:52:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.