Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking
- URL: http://arxiv.org/abs/2403.15831v1
- Date: Sat, 23 Mar 2024 13:15:44 GMT
- Title: Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking
- Authors: Shaoyu Sun, Chunyang Wang, Xuelian Liu, Chunhao Shi, Yueyang Ding, Guan Xi
- Abstract summary: 3D single object tracking within LiDAR point clouds is a pivotal task in computer vision.
Existing methods, which depend solely on appearance matching via Siamese networks or utilize motion information from successive frames, encounter significant challenges.
We design an innovative spatio-temporal bi-directional cross-frame distractor filtering tracker, named STMD-Tracker, to mitigate these challenges.
- Score: 2.487142846438629
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D single object tracking within LiDAR point clouds is a pivotal task in computer vision, with profound implications for autonomous driving and robotics. However, existing methods, which depend solely on appearance matching via Siamese networks or utilize motion information from successive frames, encounter significant challenges. Issues such as similar objects nearby or occlusions can result in tracker drift. To mitigate these challenges, we design an innovative spatio-temporal bi-directional cross-frame distractor filtering tracker, named STMD-Tracker. Our first step involves the creation of a 4D multi-frame spatio-temporal graph convolution backbone. This design separates KNN graph spatial embedding and incorporates 1D temporal convolution, effectively capturing temporal fluctuations and spatio-temporal information. Subsequently, we devise a novel bi-directional cross-frame memory procedure. This integrates future and synthetic past frame memory to enhance the current memory, thereby improving the accuracy of iteration-based tracking. This iterative memory update mechanism allows our tracker to dynamically compensate for information in the current frame, effectively reducing tracker drift. Lastly, we construct spatially reliable Gaussian masks on the fused features to eliminate distractor points. This is further supplemented by an object-aware sampling strategy, which bolsters the efficiency and precision of object localization, thereby reducing tracking errors caused by distractors. Our extensive experiments on the KITTI, nuScenes and Waymo datasets demonstrate that our approach significantly surpasses the current state-of-the-art methods.
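The three stages the abstract describes (per-frame KNN graph spatial embedding, 1D temporal convolution across frames, and Gaussian masking of distractor points) can be illustrated with a minimal NumPy sketch. All function names, the EdgeConv-style max-pool aggregation, the "same"-padded convolution, and the mask form are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def knn_graph_embedding(points, feats, k=4):
    """Per-frame spatial embedding: for every point, max-pool EdgeConv-style
    edge features over its k nearest neighbours (assumed aggregation)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]      # k nearest, excluding self
    centre = feats[:, None, :]                    # (n, 1, c)
    edge = np.concatenate(
        [np.broadcast_to(centre, (len(feats), k, feats.shape[1])),
         feats[idx] - centre], axis=-1)           # (n, k, 2c)
    return edge.max(axis=1)                       # (n, 2c)

def temporal_conv1d(seq, kernel):
    """1D convolution over the frame (time) axis of a (T, c) feature
    sequence, with 'same' zero padding so T is preserved."""
    pad = len(kernel) // 2
    padded = np.pad(seq, ((pad, pad), (0, 0)))
    return np.stack([
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(seq.shape[0])])

def gaussian_mask(points, centre, sigma=1.0):
    """Spatial Gaussian weights that down-weight points far from the
    predicted object centre (distractor suppression)."""
    d2 = ((points - centre) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))
```

A per-frame descriptor could then be obtained by pooling the point embeddings, stacking T frames, and applying `temporal_conv1d`; the Gaussian weights would multiply the fused point features before localization.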
Related papers
- Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences [25.74000325019015]
We introduce a novel LiDAR 3D object detection framework, namely LiSTM, to facilitate spatial-temporal feature learning with cross-frame motion forecasting information.
We have conducted experiments on the Waymo and nuScenes datasets to demonstrate that the proposed framework achieves superior 3D detection performance.
arXiv Detail & Related papers (2024-09-06T16:29:04Z) - TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation [80.13343299606146]
We propose a Temporal LiDAR Aggregation and Distillation (TLAD) algorithm, which leverages historical priors to assign different aggregation steps for different classes.
To make full use of temporal images, we design a Temporal Image Aggregation and Fusion (TIAF) module, which can greatly expand the camera FOV.
We also develop a Static-Moving Switch Augmentation (SMSA) algorithm, which utilizes sufficient temporal information to enable objects to switch their motion states freely.
arXiv Detail & Related papers (2024-07-13T03:00:16Z) - PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection [66.94819989912823]
We propose a point-trajectory transformer with long short-term memory for efficient temporal 3D object detection.
We use point clouds of current-frame objects and their historical trajectories as input to minimize the memory bank storage requirement.
We conduct extensive experiments on the large-scale dataset to demonstrate that our approach performs well against state-of-the-art methods.
arXiv Detail & Related papers (2023-12-13T18:59:13Z) - TrackAgent: 6D Object Tracking via Reinforcement Learning [24.621588217873395]
We propose to simplify object tracking to a reinforced point cloud (depth only) alignment task.
This allows us to train a streamlined approach from scratch with limited amounts of sparse 3D point clouds.
We also show that the RL agent's uncertainty and a rendering-based mask propagation are effective reinitialization triggers.
arXiv Detail & Related papers (2023-07-28T17:03:00Z) - STTracker: Spatio-Temporal Tracker for 3D Single Object Tracking [11.901758708579642]
3D single object tracking with point clouds is a critical task in 3D computer vision.
Previous methods usually take the last two frames as input, using the template point cloud from the previous frame and the search-area point cloud from the current frame.
arXiv Detail & Related papers (2023-06-30T07:25:11Z) - An Effective Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds [50.19288542498838]
3D single object tracking in LiDAR point clouds (LiDAR SOT) plays a crucial role in autonomous driving.
Current approaches all follow the Siamese paradigm based on appearance matching.
We introduce a motion-centric paradigm to handle LiDAR SOT from a new perspective.
arXiv Detail & Related papers (2023-03-21T17:28:44Z) - Real-time Multi-Object Tracking Based on Bi-directional Matching [0.0]
This study offers a bi-directional matching algorithm for multi-object tracking.
A stranded area is used in the matching algorithm to temporarily store the objects that fail to be tracked.
In the MOT17 challenge, the proposed algorithm achieves 63.4% MOTA, 55.3% IDF1, and 20.1 FPS tracking speed.
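The "stranded area" bookkeeping described above can be sketched as a two-pass association loop: detections are matched against active tracks first, tracks that fail matching are parked, and leftover detections are re-matched against the parked tracks before any new track ID is spawned. The class name, greedy IoU matching on 2D boxes, and the TTL parameter are illustrative assumptions, not the paper's actual algorithm.

```python
from dataclasses import dataclass

@dataclass
class Track:
    tid: int
    box: tuple   # (x, y, w, h)
    missed: int = 0

def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

class BiDirectionalMatcher:
    """Two-pass association with a 'stranded area': tracks that fail
    matching are parked for up to `ttl` frames and re-matched against
    leftover detections before a new track ID is created."""

    def __init__(self, iou_thr=0.3, ttl=5):
        self.iou_thr, self.ttl = iou_thr, ttl
        self.active, self.stranded = [], []
        self._next_id = 0

    def _greedy(self, dets, tracks):
        pairs, used_d, used_t = [], set(), set()
        for s, i, j in sorted(((iou(d, t.box), i, j)
                               for i, d in enumerate(dets)
                               for j, t in enumerate(tracks)), reverse=True):
            if s >= self.iou_thr and i not in used_d and j not in used_t:
                pairs.append((i, j)); used_d.add(i); used_t.add(j)
        return pairs, used_d, used_t

    def update(self, dets):
        # pass 1: detections vs. active tracks
        pairs, used_d, used_t = self._greedy(dets, self.active)
        for i, j in pairs:
            self.active[j].box, self.active[j].missed = dets[i], 0
        self.stranded += [t for j, t in enumerate(self.active)
                          if j not in used_t]          # park lost tracks
        self.active = [t for j, t in enumerate(self.active) if j in used_t]
        # pass 2: leftover detections vs. stranded tracks
        left = [i for i in range(len(dets)) if i not in used_d]
        pairs2, used_d2, used_t2 = self._greedy(
            [dets[i] for i in left], self.stranded)
        for a, j in pairs2:
            t = self.stranded[j]
            t.box, t.missed = dets[left[a]], 0
            self.active.append(t)                      # recover identity
        self.stranded = [t for j, t in enumerate(self.stranded)
                         if j not in used_t2]
        # age and expire stranded tracks; spawn IDs for unmatched detections
        for t in self.stranded:
            t.missed += 1
        self.stranded = [t for t in self.stranded if t.missed <= self.ttl]
        for a, i in enumerate(left):
            if a not in used_d2:
                self.active.append(Track(self._next_id, dets[i]))
                self._next_id += 1
        return {t.tid: t.box for t in self.active}
```

An object that disappears for a frame and reappears nearby keeps its original ID, which is the identity-preservation effect the stranded area is meant to provide.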
arXiv Detail & Related papers (2023-03-15T08:38:08Z) - Modeling Continuous Motion for 3D Point Cloud Object Tracking [54.48716096286417]
This paper presents a novel approach that views each tracklet as a continuous stream.
At each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank.
To enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed.
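The streaming idea above (only the current frame is fed in, then fused with historical features held in a memory bank) can be sketched with a rolling buffer and dot-product attention. The bank capacity, the softmax attention, and the 50/50 blend are illustrative assumptions rather than the paper's actual fusion rule.

```python
import numpy as np
from collections import deque

class StreamMemory:
    """Rolling memory bank: at each timestamp only the current frame's
    feature vector is fed in, fused with stored history via softmax
    attention, and the fused result is written back."""

    def __init__(self, capacity=8):
        self.bank = deque(maxlen=capacity)   # oldest entries drop out

    def step(self, feat):
        feat = np.asarray(feat, dtype=float)
        if self.bank:
            hist = np.stack(self.bank)                  # (m, c) history
            logits = hist @ feat / np.sqrt(feat.size)   # scaled similarity
            w = np.exp(logits - logits.max())
            w /= w.sum()                                # softmax weights
            feat = 0.5 * feat + 0.5 * (w @ hist)        # blend with history
        self.bank.append(feat)
        return feat
```

Because the bank has a fixed capacity, per-frame cost stays constant no matter how long the tracklet stream runs.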
arXiv Detail & Related papers (2023-03-14T02:58:27Z) - CXTrack: Improving 3D Point Cloud Tracking with Contextual Information [59.55870742072618]
3D single object tracking plays an essential role in many applications, such as autonomous driving.
We propose CXTrack, a novel transformer-based network for 3D object tracking.
We show that CXTrack achieves state-of-the-art tracking performance while running at 29 FPS.
arXiv Detail & Related papers (2022-11-12T11:29:01Z) - Continuity-Discrimination Convolutional Neural Network for Visual Object Tracking [150.51667609413312]
This paper proposes a novel model, named Continuity-Discrimination Convolutional Neural Network (CD-CNN) for visual object tracking.
CD-CNN models temporal appearance continuity based on the idea of temporal slowness.
In order to alleviate inaccurate target localization and drifting, we propose a novel notion, object-centroid.
arXiv Detail & Related papers (2021-04-18T06:35:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.