Point Cloud Registration-Driven Robust Feature Matching for 3D Siamese
Object Tracking
- URL: http://arxiv.org/abs/2209.06395v1
- Date: Wed, 14 Sep 2022 03:25:04 GMT
- Title: Point Cloud Registration-Driven Robust Feature Matching for 3D Siamese
Object Tracking
- Authors: Haobo Jiang, Kaihao Lan, Le Hui, Guangyu Li, Jin Xie, and Jian Yang
- Abstract summary: Learning robust feature matching between the template and search area is crucial for 3D Siamese tracking.
We propose a novel point cloud registration-driven Siamese tracking framework, with the intuition that spatially aligned corresponding points tend to achieve consistent feature representations.
Our method consists of two modules, including a tracking-specific nonlocal registration module and a registration-aided Sinkhorn template-feature aggregation module.
- Score: 24.97192595209272
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning robust feature matching between the template and search area is
crucial for 3D Siamese tracking. The core of Siamese feature matching is how to
assign high feature similarity on the corresponding points between the template
and search area for precise object localization. In this paper, we propose a
novel point cloud registration-driven Siamese tracking framework, with the
intuition that spatially aligned corresponding points (via 3D registration)
tend to achieve consistent feature representations. Specifically, our method
consists of two modules, including a tracking-specific nonlocal registration
module and a registration-aided Sinkhorn template-feature aggregation module.
The registration module targets precise spatial alignment between the
template and search area. The tracking-specific spatial distance constraint is
proposed to refine the cross-attention weights in the nonlocal module for
discriminative feature learning. Then, we use the weighted SVD to compute the
rigid transformation between the template and search area, and align them to
achieve the desired spatially aligned corresponding points. For the feature
aggregation module, we formulate the feature matching between the transformed
template and search area as an optimal transport problem and utilize the
Sinkhorn optimization to search for the outlier-robust matching solution. Also,
a registration-aided spatial distance map is built to improve the matching
robustness in indistinguishable regions (e.g., smooth surface). Finally, guided
by the obtained feature matching map, we aggregate the target information from
the template into the search area to construct the target-specific feature,
which is then fed into a CenterPoint-like detection head for object
localization. Extensive experiments on KITTI, NuScenes and Waymo datasets
verify the effectiveness of our proposed method.
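The two numerical steps named in the abstract, a weighted SVD solve for the rigid transform between corresponded points and Sinkhorn optimization over a matching cost matrix, are standard operations that can be illustrated independently of the tracker. Below is a minimal NumPy sketch of both, not the authors' implementation: the correspondence weights and the cost matrix, which the paper derives from refined cross-attention and learned features plus a registration-aided distance map, are replaced here by synthetic stand-ins.

```python
import numpy as np


def weighted_svd_transform(src, tgt, weights):
    """Weighted Kabsch/SVD solver: rigid (R, t) aligning src[i] -> tgt[i].

    src, tgt: (N, 3) corresponding points; weights: (N,) non-negative soft
    correspondence confidences (attention-derived in the paper).
    """
    w = weights / (weights.sum() + 1e-12)
    src_mean = (w[:, None] * src).sum(axis=0)
    tgt_mean = (w[:, None] * tgt).sum(axis=0)
    src_c, tgt_c = src - src_mean, tgt - tgt_mean
    H = (w[:, None] * src_c).T @ tgt_c              # 3x3 weighted covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))          # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t


def sinkhorn_matching(cost, eps=0.1, n_iters=50):
    """Entropy-regularized optimal transport on a cost matrix, in log space.

    Alternately normalizes rows and columns of exp(-cost / eps) so that each
    approximately sums to one, yielding an outlier-robust soft matching map.
    """
    def logsumexp(x, axis):
        m = x.max(axis=axis, keepdims=True)
        return (m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))).squeeze(axis)

    log_K = -cost / eps
    log_u = np.zeros(cost.shape[0])
    log_v = np.zeros(cost.shape[1])
    for _ in range(n_iters):
        log_u = -logsumexp(log_K + log_v[None, :], axis=1)
        log_v = -logsumexp(log_K + log_u[:, None], axis=0)
    return np.exp(log_u[:, None] + log_K + log_v[None, :])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    template = rng.normal(size=(64, 3))
    # Synthetic search-area points: a rotated and translated copy of the template.
    angle = np.pi / 8
    R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                       [np.sin(angle),  np.cos(angle), 0.0],
                       [0.0, 0.0, 1.0]])
    search = template @ R_true.T + np.array([0.5, -0.2, 0.1])

    # Align the template, then match on pairwise distances as a stand-in for
    # the feature cost plus registration-aided distance map.
    R, t = weighted_svd_transform(template, search, np.ones(len(template)))
    aligned = template @ R.T + t
    cost = np.linalg.norm(aligned[:, None, :] - search[None, :, :], axis=-1)
    match = sinkhorn_matching(cost)
    print("matching accuracy:", (match.argmax(axis=1) == np.arange(len(template))).mean())
```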
Related papers
- CL3D: Unsupervised Domain Adaptation for Cross-LiDAR 3D Detection [16.021932740447966]
Domain adaptation for Cross-LiDAR 3D detection is challenging due to the large gap in raw data representation.
We present an unsupervised domain adaptation method that overcomes the above difficulties.
arXiv Detail & Related papers (2022-12-01T03:22:55Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- OST: Efficient One-stream Network for 3D Single Object Tracking in Point Clouds [6.661881950861012]
We propose a novel one-stream network with the strength of instance-level encoding, which avoids the correlation operations used in previous Siamese networks.
The proposed method has achieved considerable performance not only for class-specific tracking but also for class-agnostic tracking with less computation and higher efficiency.
arXiv Detail & Related papers (2022-10-16T12:31:59Z)
- 3D Siamese Transformer Network for Single Object Tracking on Point Clouds [22.48888264770609]
Siamese network based trackers formulate 3D single object tracking as cross-correlation learning between point features of a template and a search area.
We explicitly use a Transformer to form a 3D Siamese Transformer network for learning robust cross-correlation between the template and the search area.
Our method achieves state-of-the-art performance on the 3D single object tracking task.
arXiv Detail & Related papers (2022-07-25T09:08:30Z)
- AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection [46.03951171790736]
We propose AutoAlign, an automatic feature fusion strategy for 3D object detection.
We show that our approach can lead to 2.3 mAP and 7.0 mAP improvements on the KITTI and nuScenes datasets.
arXiv Detail & Related papers (2022-01-17T16:08:57Z)
- SASA: Semantics-Augmented Set Abstraction for Point-based 3D Object Detection [78.90102636266276]
We propose a novel set abstraction method named Semantics-Augmented Set Abstraction (SASA).
Based on the estimated point-wise foreground scores, we then propose a semantics-guided point sampling algorithm to help retain more important foreground points during down-sampling.
In practice, SASA proves effective in identifying valuable points related to foreground objects and in improving feature learning for point-based 3D detection.
arXiv Detail & Related papers (2022-01-06T08:54:47Z)
- Learning Dynamic Compact Memory Embedding for Deformable Visual Object Tracking [82.34356879078955]
We propose a compact memory embedding to enhance the discrimination of the segmentation-based deformable visual tracking method.
Our method outperforms excellent segmentation-based trackers, i.e., D3S and SiamMask, on the DAVIS 2017 benchmark.
arXiv Detail & Related papers (2021-11-23T03:07:12Z)
- DFC: Deep Feature Consistency for Robust Point Cloud Registration [0.4724825031148411]
We present a novel learning-based alignment network for complex alignment scenes.
We validate our approach on the 3DMatch dataset and the KITTI odometry dataset.
arXiv Detail & Related papers (2021-11-15T08:27:21Z)
- Deep Hough Voting for Robust Global Registration [52.40611370293272]
We present an efficient framework for pairwise registration of real-world 3D scans, leveraging Hough voting in the 6D transformation parameter space.
Our method outperforms state-of-the-art methods on the 3DMatch and 3DLoMatch benchmarks while achieving comparable performance on the KITTI odometry dataset.
arXiv Detail & Related papers (2021-09-09T14:38:06Z)
- Graph Attention Tracking [76.19829750144564]
We propose a simple target-aware Siamese graph attention network for general object tracking.
Experiments on challenging benchmarks including GOT-10k, UAV123, OTB-100 and LaSOT demonstrate that the proposed SiamGAT outperforms many state-of-the-art trackers.
arXiv Detail & Related papers (2020-11-23T04:26:45Z)
- Learning Spatio-Appearance Memory Network for High-Performance Visual Tracking [79.80401607146987]
Existing object trackers usually learn a bounding-box-based template to match visual targets across frames, which cannot accurately learn a pixel-wise representation.
This paper presents a novel segmentation-based tracking architecture, which is equipped with a spatio-appearance memory network to learn accurate spatio-temporal correspondence.
arXiv Detail & Related papers (2020-09-21T08:12:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.