MBPTrack: Improving 3D Point Cloud Tracking with Memory Networks and Box
Priors
- URL: http://arxiv.org/abs/2303.05071v1
- Date: Thu, 9 Mar 2023 07:07:39 GMT
- Title: MBPTrack: Improving 3D Point Cloud Tracking with Memory Networks and Box
Priors
- Authors: Tian-Xing Xu, Yuan-Chen Guo, Yu-Kun Lai, Song-Hai Zhang
- Abstract summary: 3D single object tracking has been a crucial problem for decades with numerous applications such as autonomous driving.
We present MBPTrack, which adopts a Memory mechanism to utilize past information and formulates localization in a coarse-to-fine scheme.
- Score: 59.55870742072618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D single object tracking has been a crucial problem for decades with
numerous applications such as autonomous driving. Despite its wide-ranging use,
this task remains challenging due to the significant appearance variation
caused by occlusion and size differences among tracked targets. To address
these issues, we present MBPTrack, which adopts a Memory mechanism to utilize
past information and formulates localization in a coarse-to-fine scheme using
Box Priors given in the first frame. Specifically, past frames with targetness
masks serve as an external memory, and a transformer-based module propagates
tracked target cues from the memory to the current frame. To precisely localize
objects of all sizes, MBPTrack first predicts the target center via Hough
voting. By leveraging box priors given in the first frame, we adaptively sample
reference points around the target center that roughly cover the target of
different sizes. Then, we obtain dense feature maps by aggregating point
features into the reference points, where localization can be performed more
effectively. Extensive experiments demonstrate that MBPTrack achieves
state-of-the-art performance on KITTI, nuScenes and Waymo Open Dataset, while
running at 50 FPS on a single RTX3090 GPU.
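The coarse-to-fine localization idea — scale a set of reference points around the predicted target center by the first-frame box size, so the points roughly cover targets of different sizes — can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation; the function name, grid layout, and parameters are assumptions.

```python
import numpy as np

def sample_reference_points(center, box_size, grid=4):
    """Sample a regular grid of reference points around a predicted target center.

    Hypothetical sketch of the coarse-to-fine scheme: the first-frame box
    prior (w, l, h) scales a normalized grid, so reference-point coverage
    adapts to the target's size. The grid layout is illustrative only.
    """
    center = np.asarray(center, dtype=np.float64)      # (3,) target center x, y, z
    box_size = np.asarray(box_size, dtype=np.float64)  # (3,) box prior w, l, h
    # Normalized offsets in [-0.5, 0.5] along each axis.
    lin = np.linspace(-0.5, 0.5, grid)
    offsets = np.stack(np.meshgrid(lin, lin, lin, indexing="ij"), axis=-1).reshape(-1, 3)
    # Scale the offsets by the box prior so larger targets get wider coverage.
    return center + offsets * box_size  # (grid**3, 3) reference points

pts = sample_reference_points(center=[1.0, 2.0, 0.5], box_size=[4.0, 2.0, 1.5])
print(pts.shape)  # (64, 3)
```

Point features would then be aggregated into these reference points to form the dense feature maps on which the fine localization stage operates.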
Related papers
- BEVTrack: A Simple and Strong Baseline for 3D Single Object Tracking in Bird's-Eye View [56.77287041917277]
3D Single Object Tracking (SOT) is a fundamental task of computer vision, proving essential for applications like autonomous driving.
In this paper, we propose BEVTrack, a simple yet effective baseline method.
By estimating the target motion in Bird's-Eye View (BEV) to perform tracking, BEVTrack demonstrates surprising simplicity in various aspects, i.e., network design, training objectives, and tracking pipeline, while achieving superior performance.
arXiv Detail & Related papers (2023-09-05T12:42:26Z)
- Motion-to-Matching: A Mixed Paradigm for 3D Single Object Tracking [27.805298263103495]
We propose MTM-Tracker, which combines motion modeling with feature matching into a single network.
In the first stage, we exploit the continuous historical boxes as motion prior and propose an encoder-decoder structure to locate target coarsely.
In the second stage, we introduce a feature interaction module to extract motion-aware features from consecutive point clouds and match them to refine target movement.
arXiv Detail & Related papers (2023-08-23T02:40:51Z)
- STTracker: Spatio-Temporal Tracker for 3D Single Object Tracking [11.901758708579642]
3D single object tracking with point clouds is a critical task in 3D computer vision.
Previous methods usually take the last two frames as input, using the template point cloud from the previous frame and the search-area point cloud from the current frame.
arXiv Detail & Related papers (2023-06-30T07:25:11Z)
- CXTrack: Improving 3D Point Cloud Tracking with Contextual Information [59.55870742072618]
3D single object tracking plays an essential role in many applications, such as autonomous driving.
We propose CXTrack, a novel transformer-based network for 3D object tracking.
We show that CXTrack achieves state-of-the-art tracking performance while running at 29 FPS.
arXiv Detail & Related papers (2022-11-12T11:29:01Z)
- Learning Dynamic Compact Memory Embedding for Deformable Visual Object Tracking [82.34356879078955]
We propose a compact memory embedding to enhance the discrimination of the segmentation-based deformable visual tracking method.
Our method outperforms excellent segmentation-based trackers, i.e., D3S and SiamMask, on the DAVIS 2017 benchmark.
arXiv Detail & Related papers (2021-11-23T03:07:12Z)
- Single Object Tracking through a Fast and Effective Single-Multiple Model Convolutional Neural Network [0.0]
Recent state-of-the-art (SOTA) approaches rely on a matching network with a heavy structure to distinguish the target from other objects in the area.
In this article, a special architecture is proposed that, in contrast to previous approaches, makes it possible to identify the object location in a single shot.
The presented tracker performs comparably to the SOTA in challenging situations while running far faster (up to 120 FPS on a 1080 Ti).
arXiv Detail & Related papers (2021-03-28T11:02:14Z)
- Learning Spatio-Appearance Memory Network for High-Performance Visual Tracking [79.80401607146987]
Existing object trackers usually learn a bounding-box-based template to match visual targets across frames, which cannot accurately capture a pixel-wise representation.
This paper presents a novel segmentation-based tracking architecture, which is equipped with a spatio-appearance memory network to learn accurate spatio-temporal correspondence.
arXiv Detail & Related papers (2020-09-21T08:12:02Z)
- Towards Accurate Pixel-wise Object Tracking by Attention Retrieval [50.06436600343181]
We propose an attention retrieval network (ARN) to perform soft spatial constraints on backbone features.
We set a new state-of-the-art on recent pixel-wise object tracking benchmark VOT 2020 while running at 40 fps.
arXiv Detail & Related papers (2020-08-06T16:25:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.