SiamMOT: Siamese Multi-Object Tracking
- URL: http://arxiv.org/abs/2105.11595v1
- Date: Tue, 25 May 2021 01:09:26 GMT
- Title: SiamMOT: Siamese Multi-Object Tracking
- Authors: Bing Shuai, Andrew Berneshawi, Xinyu Li, Davide Modolo, Joseph Tighe
- Abstract summary: We introduce a region-based Siamese Multi-Object Tracking network, which we name SiamMOT.
SiamMOT includes a motion model that estimates the instance's movement between two frames such that detected instances are associated.
SiamMOT is efficient, and it runs at 17 FPS for 720P videos on a single modern GPU.
- Score: 28.97401838563374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we focus on improving online multi-object tracking (MOT). In
particular, we introduce a region-based Siamese Multi-Object Tracking network,
which we name SiamMOT. SiamMOT includes a motion model that estimates the
instance's movement between two frames such that detected instances are
associated. To explore how the motion modelling affects its tracking
capability, we present two variants of the Siamese tracker, one that implicitly
models motion and one that models it explicitly. We carry out extensive
quantitative experiments on three different MOT datasets: MOT17, TAO-person and
Caltech Roadside Pedestrians, showing the importance of motion modelling for
MOT and the ability of SiamMOT to substantially outperform the
state-of-the-art. Finally, SiamMOT also outperforms the winners of the ACM MM'20
HiEve Grand Challenge on the HiEve dataset. Moreover, SiamMOT is efficient: it
runs at 17 FPS for 720P videos on a single modern GPU. Code is available at
\url{https://github.com/amazon-research/siam-mot}.
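To make the abstract's association step concrete, here is a minimal sketch: a motion model predicts where each tracked instance moves between two frames, and the predictions are matched to detections. The constant-velocity `predict_motion` stand-in, the greedy IoU matcher, and the threshold are illustrative assumptions, not SiamMOT's learned Siamese motion model.

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes in [x1, y1, x2, y2] format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def predict_motion(track):
    # Constant-velocity stand-in for the learned Siamese motion model:
    # shift the last box by the last observed displacement.
    return track["box"] + track["velocity"]

def associate(tracks, detections, iou_thresh=0.3):
    """Greedily match motion-predicted track boxes to next-frame detections."""
    matches, unmatched = [], set(range(len(detections)))
    for t_idx, track in enumerate(tracks):
        pred = predict_motion(track)
        best_j, best_score = None, iou_thresh
        for j in sorted(unmatched):
            score = iou(pred, detections[j])
            if score > best_score:
                best_j, best_score = j, score
        if best_j is not None:
            matches.append((t_idx, best_j))
            unmatched.discard(best_j)
    return matches, unmatched

tracks = [{"box": np.array([10., 10., 50., 80.]),
           "velocity": np.array([5., 0., 5., 0.])}]
detections = [np.array([16., 11., 56., 79.])]
print(associate(tracks, detections))  # ([(0, 0)], set())
```

In SiamMOT itself, the prediction comes from a Siamese network conditioned on the instance's appearance in the previous frame, either implicitly or explicitly as the abstract describes; the closed-form prediction above only stands in for that component.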
Related papers
- MV-MOS: Multi-View Feature Fusion for 3D Moving Object Segmentation [4.386035726986601]
How to effectively utilize motion and semantic features and avoid information loss during 3D-to-2D projection is still a key challenge.
We propose a novel multi-view MOS model (MV-MOS) by fusing motion-semantic features from different 2D representations of point clouds.
We validated the effectiveness of the proposed multi-branch fusion MOS framework via comprehensive experiments.
arXiv Detail & Related papers (2024-08-20T07:30:00Z)
- SiamMo: Siamese Motion-Centric 3D Object Tracking [12.68616041331354]
We introduce SiamMo, a novel and simple motion-centric tracking approach.
Unlike traditional single-stream architectures, SiamMo uses Siamese feature extraction for motion-centric tracking.
SiamMo sets a new record on the KITTI tracking benchmark with 90.1% precision while maintaining a high inference speed of 108 FPS.
arXiv Detail & Related papers (2024-08-03T07:02:01Z)
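A minimal sketch of the Siamese idea in the entry above: features from both frames pass through the same shared-weight branch before comparison. The toy linear `embed` projection is a hypothetical stand-in for SiamMo's actual backbone.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))  # shared weights: a toy stand-in for a deep backbone

def embed(x):
    # Both inputs pass through the SAME parameters; this weight sharing
    # is what makes the two branches "Siamese".
    return np.tanh(x @ W)

def siamese_similarity(feat_prev, feat_curr):
    """Cosine similarity between the two branch embeddings."""
    a, b = embed(feat_prev), embed(feat_curr)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

prev = rng.normal(size=16)
same = prev + 0.05 * rng.normal(size=16)   # near-identical target region
other = rng.normal(size=16)                # unrelated region
print(siamese_similarity(prev, same))      # close to 1.0
print(siamese_similarity(prev, other))     # noticeably lower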
- Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach [17.286142856787222]
We contribute a large-scale Visible-Thermal video benchmark for Multiple Object Tracking (MOT) called VT-MOT.
VT-MOT includes 582 video sequence pairs and 401k frame pairs from surveillance, drone, and handheld platforms.
Comprehensive experiments conducted on VT-MOT demonstrate the superiority and effectiveness of the proposed method.
arXiv Detail & Related papers (2024-08-02T01:29:43Z)
- Delving into Motion-Aware Matching for Monocular 3D Object Tracking [81.68608983602581]
We find that the motion cues of objects across different time frames are critical in 3D multi-object tracking.
We propose MoMA-M3T, a framework that mainly consists of three motion-aware components.
We conduct extensive experiments on the nuScenes and KITTI datasets to demonstrate that MoMA-M3T achieves competitive performance against state-of-the-art methods.
arXiv Detail & Related papers (2023-08-22T17:53:58Z)
- Tracking Anything in High Quality [63.63653185865726]
HQTrack is a framework for high-quality tracking of anything in videos.
It consists of a video multi-object segmenter (VMOS) and a mask refiner (MR).
arXiv Detail & Related papers (2023-07-26T06:19:46Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as DanceTrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
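As a toy illustration of the learnable motion predictor idea above, the sketch below fits a linear model that maps the last K per-frame displacements to the next one. MotionTrack's actual predictor is a neural network, so the least-squares fit here is purely an assumption for exposition.

```python
import numpy as np

# Toy trajectory: box centers moving with slight acceleration.
centers = np.array([[10. + 2 * t + 0.1 * t**2, 20. + t] for t in range(12)])
offsets = np.diff(centers, axis=0)           # per-frame displacements

K = 3                                        # history length fed to the predictor
X = np.stack([offsets[i:i + K].ravel() for i in range(len(offsets) - K)])
Y = offsets[K:]                              # next displacement to predict

# Fit a linear motion predictor by least squares (a stand-in for a
# learned neural predictor).
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

history = offsets[-K:].ravel()
pred_offset = history @ W
print("predicted next center:", centers[-1] + pred_offset)
```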
- SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes [44.46768991505495]
We present a new large-scale multi-object tracking dataset in diverse sports scenes, coined SportsMOT.
It consists of 240 video sequences, over 150K frames, and over 1.6M bounding boxes collected from 3 sports categories: basketball, volleyball, and football.
We propose a new multi-object tracking framework, termed MixSort, introducing a MixFormer-like structure as an auxiliary association model to prevailing tracking-by-detection trackers.
arXiv Detail & Related papers (2023-04-11T12:07:31Z)
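To illustrate how an auxiliary appearance-association model can complement a tracking-by-detection matcher, here is a hypothetical cost-fusion sketch; the weighting scheme and the 0.6 blend factor are assumptions, not MixSort's actual formulation.

```python
import numpy as np

def fused_cost(iou_matrix, app_sim_matrix, alpha=0.6):
    """Blend a motion/IoU cost with an appearance-similarity cost, in the
    spirit of adding an auxiliary association model to a
    tracking-by-detection matcher (weights here are illustrative)."""
    motion_cost = 1.0 - iou_matrix
    appearance_cost = 1.0 - app_sim_matrix
    return alpha * motion_cost + (1.0 - alpha) * appearance_cost

iou_m = np.array([[0.7, 0.1], [0.2, 0.6]])
app_m = np.array([[0.9, 0.3], [0.1, 0.8]])
cost = fused_cost(iou_m, app_m)
# Greedy row-wise assignment on the fused cost (ignoring conflicts for brevity):
print(cost.argmin(axis=1))  # -> [0 1]
```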
- An Effective Motion-Centric Paradigm for 3D Single Object Tracking in Point Clouds [50.19288542498838]
3D single object tracking in LiDAR point clouds (LiDAR SOT) plays a crucial role in autonomous driving.
Current approaches all follow the Siamese paradigm based on appearance matching.
We introduce a motion-centric paradigm to handle LiDAR SOT from a new perspective.
arXiv Detail & Related papers (2023-03-21T17:28:44Z)
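A minimal sketch of the motion-centric idea above: estimate the target's inter-frame translation directly from the two point sets rather than matching appearance templates. The closed-form centroid shift below is a classical stand-in for the paper's learned motion regression.

```python
import numpy as np

def estimate_translation(points_prev, points_curr):
    # Motion-centric stand-in: recover the target's inter-frame
    # translation from the point sets themselves (centroid shift).
    # A learned network would replace this closed form.
    return points_curr.mean(axis=0) - points_prev.mean(axis=0)

rng = np.random.default_rng(1)
prev = rng.normal(size=(128, 3))             # target points at frame t
true_motion = np.array([0.8, 0.1, 0.0])
curr = prev + true_motion + 0.01 * rng.normal(size=(128, 3))

print(estimate_translation(prev, curr))      # close to [0.8, 0.1, 0.0]
```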
- SMILEtrack: SiMIlarity LEarning for Occlusion-Aware Multiple Object Tracking [20.286114226299237]
This paper introduces SMILEtrack, an innovative object tracker with a Siamese network-based Similarity Learning Module (SLM).
The SLM calculates the appearance similarity between two objects, overcoming the limitations of feature descriptors in Separate Detection and Embedding models.
We further develop a Similarity Matching Cascade (SMC) module with a novel GATE function for robust object matching across consecutive video frames.
arXiv Detail & Related papers (2022-11-16T10:49:48Z)
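The sketch below illustrates the general pattern in the entry above: a similarity module scores candidate pairs, and a gate rejects weak matches before assignment. The gate value and helper names are illustrative assumptions, not SMILEtrack's implementation.

```python
import numpy as np

def appearance_similarity(feat_a, feat_b):
    """Cosine similarity between two appearance embeddings, the kind of
    score a similarity-learning module produces."""
    return float(feat_a @ feat_b /
                 (np.linalg.norm(feat_a) * np.linalg.norm(feat_b) + 1e-9))

def gated_match(track_feat, det_feats, gate=0.5):
    # Cascade-style matching step: the gate rejects pairs whose
    # similarity is too low before any assignment is made (the gate
    # value is illustrative, not the paper's).
    scores = [appearance_similarity(track_feat, d) for d in det_feats]
    best = int(np.argmax(scores))
    return best if scores[best] >= gate else None

rng = np.random.default_rng(2)
track = rng.normal(size=32)
dets = [rng.normal(size=32), track + 0.1 * rng.normal(size=32)]
print(gated_match(track, dets))  # -> 1 (the near-duplicate embedding)
```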
- Know Your Surroundings: Panoramic Multi-Object Tracking by Multimodality Collaboration [56.01625477187448]
We propose a MultiModality PAnoramic multi-object Tracking framework (MMPAT).
It takes both 2D panorama images and 3D point clouds as input and then infers target trajectories using the multimodality data.
We evaluate the proposed method on the JRDB dataset, where the MMPAT achieves the top performance in both the detection and tracking tasks.
arXiv Detail & Related papers (2021-05-31T03:16:38Z)
- Simultaneous Detection and Tracking with Motion Modelling for Multiple Object Tracking [94.24393546459424]
We introduce Deep Motion Modeling Network (DMM-Net) that can estimate multiple objects' motion parameters to perform joint detection and association.
DMM-Net achieves a PR-MOTA score of 12.80 at over 120 FPS on the popular UA-DETRAC challenge, delivering better performance while being orders of magnitude faster.
We also contribute a synthetic large-scale public dataset Omni-MOT for vehicle tracking that provides precise ground-truth annotations.
arXiv Detail & Related papers (2020-08-20T08:05:33Z)
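As a rough illustration of per-object motion-parameter estimation in the entry above, the sketch below fits constant-velocity parameters to an object's boxes over a short window by least squares; DMM-Net predicts such parameters with a network, so this closed form is only an assumption for exposition.

```python
import numpy as np

def fit_motion_params(boxes):
    """Fit linear (constant-velocity) motion parameters to one object's
    boxes over a short window: for each coordinate, solve
    box(t) ~= p0 + v * t by least squares."""
    T = len(boxes)
    A = np.stack([np.ones(T), np.arange(T)], axis=1)   # [1, t] design matrix
    params, *_ = np.linalg.lstsq(A, np.asarray(boxes), rcond=None)
    p0, v = params                                     # intercept and velocity per coordinate
    return p0, v

window = [np.array([10., 10., 50., 80.]) + t * np.array([3., 0., 3., 0.])
          for t in range(8)]
p0, v = fit_motion_params(window)
print("velocity per frame:", v)          # ~ [3, 0, 3, 0]
print("predicted box at t=8:", p0 + v * 8)
```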