Real-Time Siamese Multiple Object Tracker with Enhanced Proposals
- URL: http://arxiv.org/abs/2202.04966v1
- Date: Thu, 10 Feb 2022 11:41:27 GMT
- Title: Real-Time Siamese Multiple Object Tracker with Enhanced Proposals
- Authors: Lorenzo Vaquero, V\'ictor M. Brea, Manuel Mucientes
- Abstract summary: SiamMOTION allows the tracking of dozens of arbitrary objects in real-time.
It produces quality features through an attention mechanism and a region-of-interest extractor.
SiamMOTION has been validated on five public benchmarks.
- Score: 2.4923006485141284
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Maintaining the identity of multiple objects in real-time video is a
challenging task, as it is not always possible to run a detector on every
frame. Thus, motion estimation systems are often employed, which either do not
scale well with the number of targets or produce features with limited semantic
information. To solve the aforementioned problems and allow the tracking of
dozens of arbitrary objects in real-time, we propose SiamMOTION. SiamMOTION
includes a novel proposal engine that produces quality features through an
attention mechanism and a region-of-interest extractor fed by an inertia module
and powered by a feature pyramid network. Finally, the extracted tensors enter
a comparison head that efficiently matches pairs of exemplars and search areas,
generating quality predictions via a pairwise depthwise region proposal network
and a multi-object penalization module. SiamMOTION has been validated on five
public benchmarks, achieving leading performance against current
state-of-the-art trackers.
Related papers
- STCMOT: Spatio-Temporal Cohesion Learning for UAV-Based Multiple Object Tracking [13.269416985959404]
Multiple object tracking (MOT) in Unmanned Aerial Vehicle (UAV) videos is important for diverse applications in computer vision.
We propose a novel Spatio-Temporal Cohesion Multiple Object Tracking framework (STCMOT)
We use historical embedding features to model the representation of ReID and detection features in a sequential order.
Our framework sets a new state-of-the-art performance in MOTA and IDF1 metrics.
arXiv Detail & Related papers (2024-09-17T14:34:18Z) - CORT: Class-Oriented Real-time Tracking for Embedded Systems [46.3107850275261]
This work proposes a new approach to multi-class object tracking.
It allows achieving smaller and more predictable execution times, without penalizing the tracking performance.
The proposed solution was evaluated in complex urban scenarios with several objects of different types.
arXiv Detail & Related papers (2024-07-20T09:12:17Z) - Transformer Network for Multi-Person Tracking and Re-Identification in
Unconstrained Environment [0.6798775532273751]
Multi-object tracking (MOT) has profound applications in a variety of fields, including surveillance, sports analytics, self-driving, and cooperative robotics.
We put forward an integrated MOT method that marries object detection and identity linkage within a singular, end-to-end trainable framework.
Our system leverages a robust memory-temporal memory module that retains extensive historical observations and effectively encodes them using an attention-based aggregator.
arXiv Detail & Related papers (2023-12-19T08:15:22Z) - Appearance-Based Refinement for Object-Centric Motion Segmentation [85.2426540999329]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals.
Our approach involves a sequence-level selection mechanism that identifies accurate flow-predicted masks as exemplars.
Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTube, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z) - SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparsetemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z) - Occlusion-Aware Detection and Re-ID Calibrated Network for Multi-Object
Tracking [38.36872739816151]
Occlusion-Aware Attention (OAA) module in the detector highlights the object features while suppressing the occluded background regions.
OAA can serve as a modulator that enhances the detector for some potentially occluded objects.
We design a Re-ID embedding matching block based on the optimal transport problem.
arXiv Detail & Related papers (2023-08-30T06:56:53Z) - Tracking Objects and Activities with Attention for Temporal Sentence
Grounding [51.416914256782505]
Temporal sentence (TSG) aims to localize the temporal segment which is semantically aligned with a natural language query in an untrimmed segment.
We propose a novel Temporal Sentence Tracking Network (TSTNet), which contains (A) a Cross-modal Targets Generator to generate multi-modal and search space, and (B) a Temporal Sentence Tracker to track multi-modal targets' behavior and to predict query-related segment.
arXiv Detail & Related papers (2023-02-21T16:42:52Z) - Scalable Video Object Segmentation with Identification Mechanism [125.4229430216776]
This paper explores the challenges of achieving scalable and effective multi-object modeling for semi-supervised Video Object (VOS)
We present two innovative approaches, Associating Objects with Transformers (AOT) and Associating Objects with Scalable Transformers (AOST)
Our approaches surpass the state-of-the-art competitors and display exceptional efficiency and scalability consistently across all six benchmarks.
arXiv Detail & Related papers (2022-03-22T03:33:27Z) - Looking Beyond Two Frames: End-to-End Multi-Object Tracking Using
Spatial and Temporal Transformers [20.806348407522083]
MO3TR is an end-to-end online multi-object tracking framework.
It encodes object interactions into long-term temporal embeddings.
It tracks initiation and termination without the need for an explicit data association module.
arXiv Detail & Related papers (2021-03-27T07:23:38Z) - Visual Tracking by TridentAlign and Context Embedding [71.60159881028432]
We propose novel TridentAlign and context embedding modules for Siamese network-based visual tracking methods.
The performance of the proposed tracker is comparable to that of state-of-the-art trackers, while the proposed tracker runs at real-time speed.
arXiv Detail & Related papers (2020-07-14T08:00:26Z) - A Unified Object Motion and Affinity Model for Online Multi-Object
Tracking [127.5229859255719]
We propose a novel MOT framework that unifies object motion and affinity model into a single network, named UMA.
UMA integrates single object tracking and metric learning into a unified triplet network by means of multi-task learning.
We equip our model with a task-specific attention module, which is used to boost task-aware feature learning.
arXiv Detail & Related papers (2020-03-25T09:36:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.