Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking
- URL: http://arxiv.org/abs/2311.10382v1
- Date: Fri, 17 Nov 2023 08:17:49 GMT
- Title: Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking
- Authors: Yizhe Li, Sanping Zhou, Zheng Qin, Le Wang, Jinjun Wang, Nanning Zheng
- Abstract summary: We propose a simple yet effective two-stage feature learning paradigm to jointly learn single-shot and multi-shot features for different targets.
Our method has achieved significant improvements on MOT17 and MOT20 datasets while reaching state-of-the-art performance on DanceTrack dataset.
- Score: 55.13878429987136
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-Object Tracking (MOT) remains a vital component of intelligent video
analysis, which aims to locate targets and maintain a consistent identity for
each target throughout a video sequence. Existing works usually learn a
discriminative feature representation, such as motion and appearance, to
associate the detections across frames, which are easily affected by mutual
occlusion and background clutter in practice. In this paper, we propose a
simple yet effective two-stage feature learning paradigm to jointly learn
single-shot and multi-shot features for different targets, so as to achieve
robust data association in the tracking process. For detections that have not
yet been associated, we design a novel single-shot feature learning module to
extract discriminative features of each detection, which can efficiently
associate targets between adjacent frames. For tracklets that have been lost
for several frames, we design a novel multi-shot feature learning module to
extract discriminative features of each tracklet, which can accurately re-find
these lost targets after a long period. Once equipped with a simple data association
logic, the resulting VisualTracker can perform robust MOT based on the
single-shot and multi-shot feature representations. Extensive experimental
results demonstrate that our method has achieved significant improvements on
MOT17 and MOT20 datasets while reaching state-of-the-art performance on
DanceTrack dataset.
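The two-stage idea above can be illustrated with a minimal sketch. Everything here is an illustrative assumption, not the paper's actual VisualTracker implementation: single-shot features of unassociated detections are matched greedily against active tracks by cosine similarity, while each lost tracklet is summarized by the mean of its stored features for long-range re-finding.

```python
import numpy as np

def cosine_sim(a, b):
    # Pairwise cosine similarity between rows of a (M, D) and b (N, D).
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def greedy_associate(track_feats, det_feats, thresh=0.5):
    # Stage 1 (single-shot): match active tracks to new detections greedily,
    # most-similar pair first, rejecting pairs below the similarity threshold.
    sim = cosine_sim(track_feats, det_feats)
    pairs = sorted(np.ndindex(*sim.shape), key=lambda td: -sim[td])
    matches, used_t, used_d = [], set(), set()
    for t, d in pairs:
        if sim[t, d] < thresh:
            break
        if t not in used_t and d not in used_d:
            matches.append((t, d))
            used_t.add(t)
            used_d.add(d)
    return matches

def refind_lost(lost_tracklets, det_feats, thresh=0.5):
    # Stage 2 (multi-shot): summarize each lost tracklet (a (T, D) array of
    # per-frame features) by its mean, then match the remaining detections.
    multi_shot = np.stack([np.mean(f, axis=0) for f in lost_tracklets])
    return greedy_associate(multi_shot, det_feats, thresh)
```

The real modules learn these single-shot and multi-shot representations; plain averaging here only stands in for the learned aggregation.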
Related papers
- FeatureSORT: Essential Features for Effective Tracking [0.0]
We introduce a novel tracker designed for online multiple object tracking that is simple yet effective.
By integrating distinct appearance features, including clothing color, style, and target direction, our tracker significantly enhances online tracking accuracy.
arXiv Detail & Related papers (2024-07-05T04:37:39Z)
- MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking [56.92165669843006]
We propose MotionTrack, which learns robust short-term and long-term motions in a unified framework to associate trajectories from a short to long range.
For dense crowds, we design a novel Interaction Module to learn interaction-aware motions from short-term trajectories, which can estimate the complex movement of each target.
For extreme occlusions, we build a novel Refind Module to learn reliable long-term motions from the target's history trajectory, which can link the interrupted trajectory with its corresponding detection.
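MotionTrack's Refind Module learns long-term motions from a target's history; the simplest stand-in for that idea is constant-velocity extrapolation from past positions (purely illustrative, not the paper's learned module):

```python
import numpy as np

def extrapolate_center(history, gap):
    # history: (T, 2) array of past box centers for a lost track.
    # Estimate a constant velocity from the history, then predict where the
    # target should be `gap` frames after it was last seen.
    history = np.asarray(history, dtype=float)
    velocity = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + velocity * gap
```

The predicted position can then be gated against candidate detections to link an interrupted trajectory back to its target.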
arXiv Detail & Related papers (2023-03-18T12:38:33Z)
- Unifying Tracking and Image-Video Object Detection [54.91658924277527]
TrIVD (Tracking and Image-Video Detection) is the first framework that unifies image OD, video OD, and MOT within one end-to-end model.
To handle the discrepancies and semantic overlaps of category labels, TrIVD formulates detection/tracking as grounding and reasons about object categories.
arXiv Detail & Related papers (2022-11-20T20:30:28Z)
- QDTrack: Quasi-Dense Similarity Learning for Appearance-Only Multiple Object Tracking [73.52284039530261]
We present Quasi-Dense Similarity Learning, which densely samples hundreds of object regions on a pair of images for contrastive learning.
We find that the resulting distinctive feature space admits a simple nearest neighbor search at inference time for object association.
We show that our similarity learning scheme is not limited to video data, but can learn effective instance similarity even from static input.
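The quasi-dense scheme reduces inference to a nearest-neighbor lookup in the learned embedding space; a sketch of that lookup follows (the array names and the cosine metric are assumptions, not QDTrack's exact procedure):

```python
import numpy as np

def nearest_track(det_embs, track_embs):
    # Normalize rows so the dot product equals cosine similarity.
    d = det_embs / np.linalg.norm(det_embs, axis=1, keepdims=True)
    t = track_embs / np.linalg.norm(track_embs, axis=1, keepdims=True)
    sim = d @ t.T              # (num_dets, num_tracks) similarity matrix
    return sim.argmax(axis=1)  # index of the closest track per detection
```

Because the contrastive training makes the embedding space distinctive, this single argmax per detection suffices for association at inference time.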
arXiv Detail & Related papers (2022-10-12T15:47:36Z)
- Correlation-Aware Deep Tracking [83.51092789908677]
We propose a novel target-dependent feature network inspired by the self-/cross-attention scheme.
Our network deeply embeds cross-image feature correlation in multiple layers of the feature network.
Our model can be flexibly pre-trained on abundant unpaired images, leading to notably faster convergence than the existing methods.
arXiv Detail & Related papers (2022-03-03T11:53:54Z)
- Discriminative Appearance Modeling with Multi-track Pooling for Real-time Multi-object Tracking [20.66906781151]
In multi-object tracking, the tracker maintains in its memory the appearance and motion information for each object in the scene.
Many approaches model each target in isolation and lack the ability to use all the targets in the scene to jointly update the memory.
We propose a training strategy adapted to multi-track pooling which generates hard tracking episodes online.
arXiv Detail & Related papers (2021-01-28T18:12:39Z)
- Multi-object Tracking with a Hierarchical Single-branch Network [31.680667324595557]
We propose an online multi-object tracking framework based on a hierarchical single-branch network.
Our novel iHOIM loss function unifies the objectives of the two sub-tasks and encourages better detection performance.
Experimental results on MOT16 and MOT20 datasets show that we can achieve state-of-the-art tracking performance.
arXiv Detail & Related papers (2021-01-06T12:14:58Z)
- End-to-End Multi-Object Tracking with Global Response Map [23.755882375664875]
We present a completely end-to-end approach that takes image-sequence/video as input and outputs directly the located and tracked objects of learned types.
Specifically, with our introduced multi-object representation strategy, a global response map can be accurately generated over frames.
Experimental results based on the MOT16 and MOT17 benchmarks show that our proposed on-line tracker achieved state-of-the-art performance on several tracking metrics.
arXiv Detail & Related papers (2020-07-13T12:30:49Z)
- A Unified Object Motion and Affinity Model for Online Multi-Object Tracking [127.5229859255719]
We propose a novel MOT framework that unifies object motion and affinity model into a single network, named UMA.
UMA integrates single object tracking and metric learning into a unified triplet network by means of multi-task learning.
We equip our model with a task-specific attention module, which is used to boost task-aware feature learning.
arXiv Detail & Related papers (2020-03-25T09:36:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.