TOPIC: A Parallel Association Paradigm for Multi-Object Tracking under
Complex Motions and Diverse Scenes
- URL: http://arxiv.org/abs/2308.11157v1
- Date: Tue, 22 Aug 2023 03:30:22 GMT
- Title: TOPIC: A Parallel Association Paradigm for Multi-Object Tracking under
Complex Motions and Diverse Scenes
- Authors: Xiaoyan Cao, Yiyao Zheng, Yao Yao, Huapeng Qin, Xiaoyu Cao, Shihui Guo
- Abstract summary: We introduce a new dataset called BEE23 to highlight complex motions.
We propose a parallel paradigm and present the Two rOund Parallel matchIng meChanism (TOPIC) to implement it.
Our approach achieves state-of-the-art performance on four public datasets and BEE23.
- Score: 17.913501787851356
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video data and algorithms have been driving advances in multi-object tracking
(MOT). While existing MOT datasets focus on occlusion and appearance
similarity, complex motion patterns are widespread yet overlooked. To address
this issue, we introduce a new dataset called BEE23 to highlight complex
motions. Identity association algorithms have long been the focus of MOT
research. Existing trackers can be categorized into two association paradigms:
single-feature paradigm (based on either motion or appearance feature) and
serial paradigm (one feature serves as secondary while the other is primary).
However, these paradigms are incapable of fully utilizing different features.
In this paper, we propose a parallel paradigm and present the Two rOund
Parallel matchIng meChanism (TOPIC) to implement it. TOPIC leverages both
motion and appearance features and can adaptively select the preferable one as
the assignment metric based on motion level. Moreover, we provide an
Attention-based Appearance Reconstruct Module (AARM) to reconstruct appearance
feature embeddings, thus enhancing the representation of appearance features.
Comprehensive experiments show that our approach achieves state-of-the-art
performance on four public datasets and BEE23. Notably, our proposed parallel
paradigm surpasses the performance of existing association paradigms by a large
margin, e.g., reducing false negatives by 12% to 51% compared to the
single-feature association paradigm. The dataset and association paradigm
introduced in this work offer a fresh perspective for advancing the MOT field.
The source code and dataset are available at
https://github.com/holmescao/TOPICTrack.
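The abstract describes a two-round parallel association: motion and appearance matching run independently, and the preferable metric is chosen per target based on motion level. A minimal illustrative sketch of that idea follows; it is not the authors' implementation, and the greedy matcher, thresholds, and data layout are all assumptions for exposition.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def cosine(u, v):
    """Cosine similarity between two appearance embeddings."""
    num = sum(x * y for x, y in zip(u, v))
    den = (sum(x * x for x in u) ** 0.5) * (sum(y * y for y in v) ** 0.5)
    return num / den if den else 0.0

def greedy_match(score, thresh):
    """Greedy bipartite matching on a similarity matrix (rows=tracks, cols=dets).
    A stand-in for the paper's assignment step, used here for simplicity."""
    if not score or not score[0]:
        return []
    pairs, used_r, used_c = [], set(), set()
    cands = sorted(
        ((score[r][c], r, c)
         for r in range(len(score)) for c in range(len(score[0]))),
        reverse=True,
    )
    for s, r, c in cands:
        if s >= thresh and r not in used_r and c not in used_c:
            pairs.append((r, c))
            used_r.add(r)
            used_c.add(c)
    return pairs

def topic_match(tracks, dets, motion_level, iou_thresh=0.3, app_thresh=0.5):
    """Parallel two-round association sketch: match by motion (IoU) and by
    appearance (cosine) independently, then resolve each track's assignment
    by its motion level (high motion -> trust appearance, low -> motion).
    Thresholds and the 0.5 motion cutoff are illustrative assumptions."""
    m_score = [[iou(t["box"], d["box"]) for d in dets] for t in tracks]
    a_score = [[cosine(t["emb"], d["emb"]) for d in dets] for t in tracks]
    m_pairs = dict(greedy_match(m_score, iou_thresh))
    a_pairs = dict(greedy_match(a_score, app_thresh))
    used, result = set(), []
    for r in sorted(set(m_pairs) | set(a_pairs)):
        prefer_app = motion_level[r] > 0.5
        choices = ([a_pairs.get(r), m_pairs.get(r)] if prefer_app
                   else [m_pairs.get(r), a_pairs.get(r)])
        for c in choices:
            if c is not None and c not in used:
                result.append((r, c))
                used.add(c)
                break
    return result
```

In this toy form, both metrics are always computed in parallel rather than one serving as a fallback for the other, which is the distinction the abstract draws against single-feature and serial paradigms.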
Related papers
- UnsMOT: Unified Framework for Unsupervised Multi-Object Tracking with
Geometric Topology Guidance [6.577227592760559]
UnsMOT is a novel framework that combines appearance and motion features of objects with geometric information to provide more accurate tracking.
Experimental results show remarkable performance in terms of HOTA, IDF1, and MOTA metrics in comparison with state-of-the-art methods.
arXiv Detail & Related papers (2023-09-03T04:58:12Z)
- Motion-to-Matching: A Mixed Paradigm for 3D Single Object Tracking [27.805298263103495]
We propose MTM-Tracker, which combines motion modeling with feature matching into a single network.
In the first stage, we exploit the continuous historical boxes as motion prior and propose an encoder-decoder structure to locate target coarsely.
In the second stage, we introduce a feature interaction module to extract motion-aware features from consecutive point clouds and match them to refine target movement.
arXiv Detail & Related papers (2023-08-23T02:40:51Z)
- Unified Visual Relationship Detection with Vision and Language Models [89.77838890788638]
This work focuses on training a single visual relationship detector predicting over the union of label spaces from multiple datasets.
We propose UniVRD, a novel bottom-up method for Unified Visual Relationship Detection by leveraging vision and language models.
Empirical results on both human-object interaction detection and scene-graph generation demonstrate the competitive performance of our model.
arXiv Detail & Related papers (2023-03-16T00:06:28Z)
- SMILEtrack: SiMIlarity LEarning for Occlusion-Aware Multiple Object Tracking [20.286114226299237]
This paper introduces SMILEtrack, an innovative object tracker with a Siamese network-based Similarity Learning Module (SLM).
The SLM calculates the appearance similarity between two objects, overcoming the limitations of feature descriptors in Separate Detection and Embedding models.
The authors also develop a Similarity Matching Cascade (SMC) module with a novel GATE function for robust object matching across consecutive video frames.
arXiv Detail & Related papers (2022-11-16T10:49:48Z)
- Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects.
The proposed method is evaluated on multiple public benchmarks including MOT16, MOT17, and MOT20, and it achieves a clear performance improvement in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z)
- Exploring Motion and Appearance Information for Temporal Sentence Grounding [52.01687915910648]
We propose a Motion-Appearance Reasoning Network (MARN) to solve temporal sentence grounding.
We develop separate motion and appearance branches to learn motion-guided and appearance-guided object relations.
Our proposed MARN significantly outperforms previous state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2022-01-03T02:44:18Z)
- Online Multiple Object Tracking with Cross-Task Synergy [120.70085565030628]
We propose a novel unified model with synergy between position prediction and embedding association.
The two tasks are linked by temporal-aware target attention and distractor attention, as well as identity-aware memory aggregation model.
arXiv Detail & Related papers (2021-04-01T10:19:40Z)
- Dense Scene Multiple Object Tracking with Box-Plane Matching [73.54369833671772]
Multiple Object Tracking (MOT) is an important task in computer vision.
We propose the Box-Plane Matching (BPM) method to improve MOT performance in dense scenes.
With the effectiveness of the three modules, our team achieves the 1st place on the Track-1 leaderboard in the ACM MM Grand Challenge HiEve 2020.
arXiv Detail & Related papers (2020-07-30T16:39:22Z)
- Segment as Points for Efficient Online Multi-Object Tracking and Segmentation [66.03023110058464]
We propose a highly effective method for learning instance embeddings based on segments by converting the compact image representation to un-ordered 2D point cloud representation.
Our method generates a new tracking-by-points paradigm where discriminative instance embeddings are learned from randomly selected points rather than images.
The resulting online MOTS framework, named PointTrack, surpasses all the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2020-07-03T08:29:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.