STURE: Spatial-Temporal Mutual Representation Learning for Robust Data
Association in Online Multi-Object Tracking
- URL: http://arxiv.org/abs/2201.06824v2
- Date: Wed, 19 Jan 2022 02:49:30 GMT
- Title: STURE: Spatial-Temporal Mutual Representation Learning for Robust Data
Association in Online Multi-Object Tracking
- Authors: Haidong Wang, Zhiyong Li, Yaping Li, Ke Nai, Ming Wen
- Abstract summary: The proposed approach is capable of extracting more distinguishing detection and sequence representations.
It is applied to the public MOT challenge benchmarks and performs well compared with various state-of-the-art online MOT trackers.
- Score: 7.562844934117318
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online multi-object tracking (MOT) is a longstanding task in computer vision
and intelligent vehicle platforms. At present, the main paradigm is
tracking-by-detection, and the main difficulty of this paradigm is how to
associate the current candidate detection with the historical tracklets.
However, in the MOT scenarios, each historical tracklet is composed of an
object sequence, while each candidate detection is just a flat image, which
lacks the temporal features of the object sequence. The feature difference
between current candidate detection and historical tracklets makes the object
association much harder. Therefore, we propose a Spatial-Temporal Mutual
Representation Learning (STURE) approach which learns spatial-temporal
representations between current candidate detection and historical sequence in
a mutual representation space. For the historical tracklets, the detection
learning network is forced to match the representations of sequence learning
network in a mutual representation space. The proposed approach is capable of
extracting more distinguishing detection and sequence representations by using
various designed losses in object association. As a result, spatial-temporal
feature is learned mutually to reinforce the current detection features, and
the feature difference can be relieved. To demonstrate the robustness of STURE,
it is applied to the public MOT challenge benchmarks and performs well compared
with various state-of-the-art online MOT trackers based on identity-preserving
metrics.
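
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network architectures or the exact losses, so the temporal average pooling, the squared-distance loss, the greedy matcher, and every function name and threshold below are assumptions chosen only to show how a single-image detection embedding and a tracklet (sequence) embedding might be compared in one shared space.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def sequence_embedding(frame_feats):
    """Assumed stand-in for a sequence network: average (T, D) frame features into (D,)."""
    return frame_feats.mean(axis=0)


def mutual_representation_loss(det_emb, seq_emb):
    """Assumed loss: mean squared distance between normalized embeddings.

    Minimizing it pulls the detection embedding toward the sequence embedding.
    """
    diff = l2_normalize(det_emb) - l2_normalize(seq_emb)
    return float(np.mean(np.sum(diff * diff, axis=-1)))


def associate(det_embs, track_embs, max_cost=0.5):
    """Greedy association on cosine distance (hypothetical matcher and threshold).

    Returns a list of (detection_index, tracklet_index) pairs.
    """
    cost = 1.0 - l2_normalize(det_embs) @ l2_normalize(track_embs).T
    matches, used_dets, used_tracks = [], set(), set()
    # Visit candidate pairs from the cheapest to the most expensive.
    for flat_index in np.argsort(cost, axis=None):
        d, t = (int(i) for i in np.unravel_index(flat_index, cost.shape))
        if cost[d, t] > max_cost:
            break  # every remaining pair is at least this dissimilar
        if d in used_dets or t in used_tracks:
            continue
        matches.append((d, t))
        used_dets.add(d)
        used_tracks.add(t)
    return matches


# Two toy tracklets, each with two frames of 2-D features.
tracklets = [
    np.array([[1.0, 0.0], [0.9, 0.1]]),
    np.array([[0.0, 1.0], [0.1, 0.9]]),
]
track_embs = np.stack([sequence_embedding(t) for t in tracklets])

# Two current-frame detections, listed in the opposite order to the tracklets.
det_embs = np.array([[0.0, 1.0], [1.0, 0.0]])
matches = associate(det_embs, track_embs)
```

In this toy case each detection is paired with the tracklet whose pooled embedding points in the same direction. A common alternative to the greedy loop in tracking-by-detection pipelines is an optimal assignment solver such as the Hungarian algorithm.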
Related papers
- STCMOT: Spatio-Temporal Cohesion Learning for UAV-Based Multiple Object Tracking [13.269416985959404]
Multiple object tracking (MOT) in Unmanned Aerial Vehicle (UAV) videos is important for diverse applications in computer vision.
We propose a novel Spatio-Temporal Cohesion Multiple Object Tracking framework (STCMOT).
We use historical embedding features to model the representation of ReID and detection features in a sequential order.
Our framework sets a new state-of-the-art performance in MOTA and IDF1 metrics.
arXiv Detail & Related papers (2024-09-17T14:34:18Z) - Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking [52.04679257903805]
Joint Detection and Embedding (JDE) trackers have demonstrated excellent performance in Multi-Object Tracking (MOT) tasks.
Our tracker, named TCBTrack, achieves state-of-the-art performance on multiple public benchmarks.
arXiv Detail & Related papers (2024-07-19T07:48:45Z) - Lost and Found: Overcoming Detector Failures in Online Multi-Object Tracking [15.533652456081374]
Multi-object tracking (MOT) endeavors to precisely estimate identities and positions of multiple objects over time.
Modern detectors may occasionally miss some objects in certain frames, causing trackers to cease tracking prematurely.
We propose BUSCA, meaning 'to search', a versatile framework compatible with any online TbD system.
arXiv Detail & Related papers (2024-07-14T10:45:12Z) - Spatial-Temporal Graph Enhanced DETR Towards Multi-Frame 3D Object Detection [54.041049052843604]
We present STEMD, a novel end-to-end framework that enhances the DETR-like paradigm for multi-frame 3D object detection.
First, to model the inter-object spatial interaction and complex temporal dependencies, we introduce the spatial-temporal graph attention network.
Finally, it poses a challenge for the network to distinguish between the positive query and other highly similar queries that are not the best match.
arXiv Detail & Related papers (2023-07-01T13:53:14Z) - Tracking Objects and Activities with Attention for Temporal Sentence Grounding [51.416914256782505]
Temporal sentence grounding (TSG) aims to localize the temporal segment which is semantically aligned with a natural language query in an untrimmed segment.
We propose a novel Temporal Sentence Tracking Network (TSTNet), which contains (A) a Cross-modal Targets Generator to generate multi-modal and search space, and (B) a Temporal Sentence Tracker to track multi-modal targets' behavior and to predict query-related segment.
arXiv Detail & Related papers (2023-02-21T16:42:52Z) - Spatio-Temporal Point Process for Multiple Object Tracking [30.041104276095624]
Multiple Object Tracking (MOT) focuses on modeling the relationship of detected objects among consecutive frames and merging them into different trajectories.
We propose a novel framework that can effectively predict and mask-out noisy and confusing detection results before associating objects into trajectories.
arXiv Detail & Related papers (2023-02-05T18:14:08Z) - End-to-end Tracking with a Multi-query Transformer [96.13468602635082]
Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time.
Our aim in this paper is to move beyond tracking-by-detection approaches, to class-agnostic tracking that performs well also for unknown object classes.
arXiv Detail & Related papers (2022-10-26T10:19:37Z) - Multi-Object Tracking and Segmentation with a Space-Time Memory Network [12.043574473965318]
We propose a method for multi-object tracking and segmentation based on a novel memory-based mechanism to associate tracklets.
The proposed tracker, MeNToS, addresses particularly the long-term data association problem.
arXiv Detail & Related papers (2021-10-21T17:13:17Z) - Learning to Track with Object Permanence [61.36492084090744]
We introduce an end-to-end trainable approach for joint object detection and tracking.
Our model, trained jointly on synthetic and real data, outperforms the state of the art on the KITTI and MOT17 datasets.
arXiv Detail & Related papers (2021-03-26T04:43:04Z) - Learning to associate detections for real-time multiple object tracking [0.0]
This study investigates the use of artificial neural networks to learn a similarity function that can be used among detections.
The proposed tracker matches the results obtained by state-of-the-art methods and runs 58% faster than a recent, similar method used as the baseline.
arXiv Detail & Related papers (2020-07-12T17:08:41Z) - Stance Detection Benchmark: How Robust Is Your Stance Detection? [65.91772010586605]
Stance Detection (StD) aims to detect an author's stance towards a certain topic or claim.
We introduce a StD benchmark that learns from ten StD datasets of various domains in a multi-dataset learning setting.
Within this benchmark setup, we are able to present new state-of-the-art results on five of the datasets.
arXiv Detail & Related papers (2020-01-06T13:37:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.