AttMOT: Improving Multiple-Object Tracking by Introducing Auxiliary
Pedestrian Attributes
- URL: http://arxiv.org/abs/2308.07537v1
- Date: Tue, 15 Aug 2023 02:39:39 GMT
- Title: AttMOT: Improving Multiple-Object Tracking by Introducing Auxiliary
Pedestrian Attributes
- Authors: Yunhao Li, Zhen Xiao, Lin Yang, Dan Meng, Xin Zhou, Heng Fan, Libo
Zhang
- Abstract summary: We propose a simple, effective, and generic method to predict pedestrian attributes to support general Re-ID embedding.
We first introduce AttMOT, a large, highly enriched synthetic dataset for pedestrian tracking.
We then explore different approaches to fuse Re-ID embedding and pedestrian attributes, including attention mechanisms.
- Score: 33.25021763110573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-object tracking (MOT) is a fundamental problem in computer vision with
numerous applications, such as intelligent surveillance and automated driving.
Despite the significant progress made in MOT, pedestrian attributes, such as
gender, hairstyle, body shape, and clothing features, which contain rich and
high-level information, have been less explored. To address this gap, we
propose a simple, effective, and generic method to predict pedestrian
attributes to support general Re-ID embedding. We first introduce AttMOT, a
large, highly enriched synthetic dataset for pedestrian tracking, containing
over 80k frames and 6 million pedestrian IDs spanning different times of day, weather
conditions, and scenarios. To the best of our knowledge, AttMOT is the first
MOT dataset with semantic attributes. Subsequently, we explore different
approaches to fuse Re-ID embedding and pedestrian attributes, including
attention mechanisms, which we hope will stimulate the development of
attribute-assisted MOT. The proposed method AAM demonstrates its effectiveness
and generality on several representative pedestrian multi-object tracking
benchmarks, including MOT17 and MOT20, through experiments on the AttMOT
dataset. When applied to state-of-the-art trackers, AAM achieves consistent
improvements in MOTA, HOTA, AssA, IDs, and IDF1 scores. For instance, on MOT17,
the proposed method yields a +1.1 MOTA, +1.7 HOTA, and +1.8 IDF1 improvement
when used with FairMOT. To encourage further research on attribute-assisted
MOT, we will release the AttMOT dataset.
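The abstract describes fusing Re-ID embeddings with predicted pedestrian attributes, including via attention mechanisms. The following PyTorch sketch is a hypothetical illustration of that idea under assumed dimensions and a simple gated-attention design; the class name AttributeFusion, the layer layout, and all hyperparameters are illustrative assumptions, not the paper's actual AAM architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttributeFusion(nn.Module):
    """Minimal sketch of attribute-aware Re-ID fusion (assumed design, not the paper's AAM)."""

    def __init__(self, reid_dim: int = 128, num_attrs: int = 32, hidden: int = 128):
        super().__init__()
        # Predict attribute logits (e.g. gender, hairstyle, clothing) from the Re-ID feature.
        self.attr_head = nn.Linear(reid_dim, num_attrs)
        # Project attribute probabilities back into the embedding space.
        self.attr_proj = nn.Linear(num_attrs, hidden)
        # A simple gated-attention mechanism decides how much attribute evidence to mix in.
        self.gate = nn.Sequential(
            nn.Linear(reid_dim + hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, reid_dim),
            nn.Sigmoid(),
        )
        self.out = nn.Linear(reid_dim + hidden, reid_dim)

    def forward(self, reid_feat: torch.Tensor):
        # reid_feat: (N, reid_dim) per-detection Re-ID embeddings.
        attr_logits = self.attr_head(reid_feat)            # (N, num_attrs)
        attr_feat = self.attr_proj(attr_logits.sigmoid())  # (N, hidden)
        joint = torch.cat([reid_feat, attr_feat], dim=-1)  # (N, reid_dim + hidden)
        gate = self.gate(joint)                            # (N, reid_dim) attention weights
        fused = gate * reid_feat + (1.0 - gate) * self.out(joint)
        return F.normalize(fused, dim=-1), attr_logits


# Usage sketch: the attribute logits could be supervised with AttMOT's semantic
# attribute labels (multi-label BCE), while the fused embedding replaces the plain
# Re-ID feature used for association in a JDE-style tracker such as FairMOT.
fusion = AttributeFusion()
reid_features = torch.randn(8, 128)  # 8 detections in one frame
fused_emb, attr_logits = fusion(reid_features)
```

In this sketch the attribute branch only modulates the Re-ID embedding; how strongly attributes should influence association, and whether a richer cross-attention is preferable, are design choices the paper explores on the AttMOT dataset.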
Related papers
- Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking [52.04679257903805]
Joint Detection and Embedding (JDE) trackers have demonstrated excellent performance in Multi-Object Tracking (MOT) tasks.
Our tracker, named TCBTrack, achieves state-of-the-art performance on multiple public benchmarks.
arXiv Detail & Related papers (2024-07-19T07:48:45Z)
- SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection [59.868772767818975]
We propose a simple yet effective Semi-supervised Oriented Object Detection method termed SOOD++.
Specifically, we observe that objects in aerial images usually exhibit arbitrary orientations, small scales, and dense aggregation.
Extensive experiments conducted on various multi-oriented object datasets under various labeled-data settings demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-07-01T07:03:51Z)
- UTOPIA: Unconstrained Tracking Objects without Preliminary Examination via Cross-Domain Adaptation [26.293108793029297]
Multiple Object Tracking (MOT) aims to find bounding boxes and identities of targeted objects in consecutive video frames.
Fully-supervised MOT methods have achieved high accuracy on existing datasets but cannot generalize well to a newly obtained dataset or a new, unseen domain.
In this work, we first address the MOT problem from the cross-domain point of view, imitating the process of new data acquisition in practice.
A new cross-domain MOT adaptation approach built on existing datasets is proposed that requires no pre-defined human knowledge for understanding and modeling objects.
arXiv Detail & Related papers (2023-06-16T04:06:15Z)
- Rt-Track: Robust Tricks for Multi-Pedestrian Tracking [4.271127739716044]
We propose a novel direction consistency method for smooth trajectory prediction (STP-DC) to increase the modeling of motion information.
We also propose a hyper-grain feature embedding network (HG-FEN) to enhance the modeling of appearance models.
To achieve state-of-the-art performance in MOT, we propose a robust tracker named Rt-track, incorporating various tricks and techniques.
arXiv Detail & Related papers (2023-03-16T22:08:29Z)
- Multi-Stage Based Feature Fusion of Multi-Modal Data for Human Activity Recognition [6.0306313759213275]
We propose a multi-modal framework that learns to effectively combine features from RGB Video and IMU sensors.
Our model is trained in two stages; in the first stage, each input encoder learns to extract features effectively.
We show significant improvements of 22% and 11% compared to video only, and of 20% and 12% on the MMAct dataset.
arXiv Detail & Related papers (2022-11-08T15:48:44Z)
- SOMPT22: A Surveillance Oriented Multi-Pedestrian Tracking Dataset [5.962184741057505]
We introduce the SOMPT22 dataset, a new set for multi-person tracking with annotated short videos captured from static cameras mounted on 6-8 meter poles positioned for city surveillance.
We analyze MOT trackers classified as one-shot and two-stage with respect to how they use detection and re-ID networks on this new dataset.
The experimental results on our new dataset indicate that state-of-the-art trackers are still far from high efficiency, and that single-shot trackers are good candidates for unifying fast execution and accuracy with competitive performance.
arXiv Detail & Related papers (2022-08-04T11:09:19Z)
- SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous Driving [94.11868795445798]
We release a Large-Scale Object Detection benchmark for Autonomous driving, named SODA10M, containing 10 million unlabeled images and 20K images labeled with 6 representative object categories.
To improve diversity, the images are collected at a rate of one frame every ten seconds across 32 different cities under different weather conditions, periods, and location scenes.
We provide extensive experiments and deep analyses of existing supervised state-of-the-art detection models, popular self-supervised and semi-supervised approaches, and some insights about how to develop future models.
arXiv Detail & Related papers (2021-06-21T13:55:57Z)
- Detecting 32 Pedestrian Attributes for Autonomous Vehicles [103.87351701138554]
In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes.
We introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way.
We show competitive detection and attribute recognition results, as well as a more stable MTL training.
arXiv Detail & Related papers (2020-12-04T15:10:12Z)
- MOTChallenge: A Benchmark for Single-Camera Multiple Target Tracking [72.76685780516371]
We present MOTChallenge, a benchmark for single-camera Multiple Object Tracking (MOT).
The benchmark is focused on multiple people tracking, since pedestrians are by far the most studied object in the tracking community.
We provide a categorization of state-of-the-art trackers and a broad error analysis.
arXiv Detail & Related papers (2020-10-15T06:52:16Z)