Delving into Motion-Aware Matching for Monocular 3D Object Tracking
- URL: http://arxiv.org/abs/2308.11607v1
- Date: Tue, 22 Aug 2023 17:53:58 GMT
- Title: Delving into Motion-Aware Matching for Monocular 3D Object Tracking
- Authors: Kuan-Chih Huang, Ming-Hsuan Yang, Yi-Hsuan Tsai
- Abstract summary: We find that the motion cue of objects across different time frames is critical in 3D multi-object tracking.
- We propose MoMA-M3T, a framework that mainly consists of three motion-aware components.
- We conduct extensive experiments on the nuScenes and KITTI datasets to demonstrate that our MoMA-M3T achieves competitive performance against state-of-the-art methods.
- Score: 81.68608983602581
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in monocular 3D object detection facilitate the 3D multi-object tracking task based on low-cost camera sensors. In this paper, we find that the motion cue of objects across different time frames is critical in 3D multi-object tracking, yet it is less explored in existing monocular-based approaches. We therefore propose a motion-aware framework for monocular 3D MOT. To this end, we propose MoMA-M3T, a framework that mainly consists of three motion-aware components. First, we represent the possible movement of an object relative to all object tracklets in the feature space as its motion features. Then, we model each historical object tracklet along the time axis from a spatial-temporal perspective via a motion transformer. Finally, we propose a motion-aware matching module that associates historical object tracklets with current observations to produce the final tracking results. We conduct extensive experiments on the nuScenes and KITTI datasets to demonstrate that our MoMA-M3T achieves competitive performance against state-of-the-art methods. Moreover, the proposed tracker is flexible and can be easily plugged into existing image-based 3D object detectors without re-training. Code and models are available at https://github.com/kuanchihhuang/MoMA-M3T.
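The abstract describes a three-stage pipeline: motion features relative to every tracklet, a transformer over tracklet history, and a matching step. Below is a minimal, hypothetical sketch of how such a pipeline could fit together in PyTorch. The module names, dimensions, cosine-similarity affinity, and Hungarian assignment are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of a MoMA-M3T-style motion-aware tracker.
# All module names and design choices here are assumptions for
# illustration; see the official repository for the real model.
import torch
import torch.nn as nn
from scipy.optimize import linear_sum_assignment


class MotionAwareMatcher(nn.Module):
    def __init__(self, state_dim: int = 3, embed_dim: int = 64):
        super().__init__()
        # Embeds displacements into "motion features" in feature space.
        self.motion_mlp = nn.Sequential(
            nn.Linear(state_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )
        # Encodes each tracklet's history of per-frame displacements
        # along the time axis (the "motion transformer" role).
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=4, batch_first=True)
        self.history_encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, detections, tracklets):
        # detections: (N, state_dim) current 3D box centers
        # tracklets:  (M, T, state_dim) past centers per tracklet
        # Displacement of each detection relative to each tracklet's
        # most recent state -> (N, M, embed_dim) motion features.
        rel = detections[:, None, :] - tracklets[:, -1, :][None, :, :]
        motion_feat = self.motion_mlp(rel)
        # Per-step displacements summarize how each tracklet has moved.
        steps = tracklets[:, 1:, :] - tracklets[:, :-1, :]   # (M, T-1, state_dim)
        hist = self.history_encoder(self.motion_mlp(steps))  # (M, T-1, embed_dim)
        track_feat = hist.mean(dim=1)                        # (M, embed_dim)
        # Cosine similarity between pairwise motion features and
        # tracklet history embeddings -> (N, M) affinity matrix.
        return torch.einsum(
            "nmd,md->nm",
            nn.functional.normalize(motion_feat, dim=-1),
            nn.functional.normalize(track_feat, dim=-1),
        )


def associate(sim, min_sim=0.2):
    # Hungarian assignment on the negated affinity; pairs below the
    # threshold stay unmatched (new tracks / terminated tracklets).
    rows, cols = linear_sum_assignment(-sim.detach().numpy())
    return [(r, c) for r, c in zip(rows, cols) if sim[r, c] >= min_sim]


matcher = MotionAwareMatcher()
sim = matcher(torch.randn(5, 3), torch.randn(4, 7, 3))  # 5 detections, 4 tracklets
print(associate(sim))
```

Because the matcher consumes only box states rather than image features, a sketch like this also suggests why such a tracker can be plugged onto any existing image-based 3D detector without re-training the detector itself.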
Related papers
- TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses [51.60422927416087]
3D multi-object tracking (MOT) is vital for many applications including autonomous driving vehicles and service robots.
We present TrajectoryFormer, a novel point-cloud-based 3D MOT framework.
arXiv Detail & Related papers (2023-06-09T13:31:50Z)
- TripletTrack: 3D Object Tracking using Triplet Embeddings and LSTM [0.0]
3D object tracking is a critical task in autonomous driving systems.
In this paper, we investigate the use of triplet embeddings in combination with motion representations for 3D object tracking (a generic sketch of the triplet recipe follows this entry).
arXiv Detail & Related papers (2022-10-28T15:23:50Z)
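As a generic illustration of the triplet-embedding recipe named above (not TripletTrack's actual architecture), a minimal training step might look like this; the embedding network and crop sizes are placeholder assumptions.

```python
# Generic triplet-margin training step for appearance embeddings, as one
# might use for re-identification in 3D tracking. The network and data
# are placeholders, not TripletTrack's model.
import torch
import torch.nn as nn

embed = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128))
loss_fn = nn.TripletMarginLoss(margin=0.2)

# anchor/positive: two crops of the same object; negative: a different object.
anchor = embed(torch.randn(8, 3, 64, 64))
positive = embed(torch.randn(8, 3, 64, 64))
negative = embed(torch.randn(8, 3, 64, 64))
loss = loss_fn(anchor, positive, negative)
loss.backward()  # pulls same-object embeddings together, pushes others apart
```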
- A Simple Baseline for Multi-Camera 3D Object Detection [94.63944826540491]
3D object detection with surrounding cameras has been a promising direction for autonomous driving.
We present SimMOD, a Simple baseline for Multi-camera Object Detection.
We conduct extensive experiments on the 3D object detection benchmark of nuScenes to demonstrate the effectiveness of SimMOD.
arXiv Detail & Related papers (2022-08-22T03:38:01Z)
- Time3D: End-to-End Joint Monocular 3D Object Detection and Tracking for Autonomous Driving [3.8073142980733]
We propose jointly training 3D detection and 3D tracking from only monocular videos in an end-to-end manner.
Time3D achieves 21.4% AMOTA, 13.6% AMOTP on the nuScenes 3D tracking benchmark, surpassing all published competitors.
arXiv Detail & Related papers (2022-05-30T06:41:10Z)
- Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding [61.57847727651068]
Temporal sentence grounding aims to localize a target segment in an untrimmed video semantically according to a given sentence query.
Most previous works focus on learning frame-level features of each whole frame in the entire video and directly matching them with the textual information.
We propose a novel Motion- and Appearance-guided 3D Semantic Reasoning Network (MA3SRN), which incorporates optical-flow-guided motion-aware, detection-based appearance-aware, and 3D-aware object-level features.
arXiv Detail & Related papers (2022-03-06T13:57:09Z)
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Tracking from Patterns: Learning Corresponding Patterns in Point Clouds for 3D Object Tracking [34.40019455462043]
We propose to learn 3D object correspondences from temporal point cloud data and infer the motion information from correspondence patterns.
Our method outperforms the existing 3D tracking methods on both the KITTI and larger-scale nuScenes datasets.
arXiv Detail & Related papers (2020-10-20T06:07:20Z)
- Kinematic 3D Object Detection in Monocular Video [123.7119180923524]
We propose a novel method for monocular video-based 3D object detection that carefully leverages kinematic motion to improve the precision of 3D localization (a generic kinematic-filter sketch follows this entry).
We achieve state-of-the-art performance on monocular 3D object detection and the Bird's Eye View tasks within the KITTI self-driving dataset.
arXiv Detail & Related papers (2020-07-19T01:15:12Z)
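The summary above does not detail how kinematic motion is used; one common pattern for leveraging it in monocular 3D tracking is a constant-velocity Kalman filter over the 3D box center. The sketch below is a generic stand-in under that assumption, not the paper's method; the noise parameters and frame interval are made up.

```python
# Generic constant-velocity Kalman filter over a 3D object center, a
# standard way kinematic motion smooths noisy per-frame monocular 3D
# estimates. Illustrative only; not the paper's actual model.
import numpy as np

dt = 0.1                                       # frame interval (s), assumed
F = np.eye(6); F[:3, 3:] = dt * np.eye(3)      # state: [x, y, z, vx, vy, vz]
H = np.hstack([np.eye(3), np.zeros((3, 3))])   # we observe position only
Q = 0.01 * np.eye(6)                           # process noise (hand-tuned)
R = 0.5 * np.eye(3)                            # monocular depth is noisy

x, P = np.zeros(6), np.eye(6)
for z in [np.array([1.0, 0.0, 10.0]), np.array([1.1, 0.0, 10.4])]:
    # Predict with the kinematic model, then correct with the detection.
    x, P = F @ x, F @ P @ F.T + Q
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    x = x + K @ (z - H @ x)
    P = (np.eye(6) - K @ H) @ P
print(x[:3], x[3:])  # smoothed position and implied velocity
```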