TRAT: Tracking by Attention Using Spatio-Temporal Features
- URL: http://arxiv.org/abs/2011.09524v1
- Date: Wed, 18 Nov 2020 20:11:12 GMT
- Title: TRAT: Tracking by Attention Using Spatio-Temporal Features
- Authors: Hasan Saribas, Hakan Cevikalp, Okan Köpüklü, Bedirhan Uzun
- Abstract summary: We propose a two-stream deep neural network tracker that uses both spatial and temporal features.
Our architecture is built on the ATOM tracker and contains two backbones: (i) a 2D-CNN to capture appearance features and (ii) a 3D-CNN to capture motion features.
- Score: 14.520067060603209
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust object tracking requires knowledge of tracked objects' appearance,
motion and their evolution over time. Although motion provides distinctive and
complementary information especially for fast moving objects, most of the
recent tracking architectures primarily focus on the objects' appearance
information. In this paper, we propose a two-stream deep neural network tracker
that uses both spatial and temporal features. Our architecture is developed
over ATOM tracker and contains two backbones: (i) 2D-CNN network to capture
appearance features and (ii) 3D-CNN network to capture motion features. The
features returned by the two networks are then fused with attention based
Feature Aggregation Module (FAM). Since the whole architecture is unified, it
can be trained end-to-end. The experimental results show that the proposed
tracker TRAT (TRacking by ATtention) achieves state-of-the-art performance on
most of the benchmarks and it significantly outperforms the baseline ATOM
tracker.
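The abstract describes the FAM only at a high level (an attention mechanism that fuses the 2D-CNN appearance features with the 3D-CNN motion features), without giving its equations. As a rough, hypothetical sketch of stream-level attention fusion under an assumed parameterization (the pooling step, the scoring weights `w`, and the scalar-per-stream weighting are all illustrative choices, not the paper's actual module):

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(appearance, motion, w):
    """Fuse two feature maps with per-stream attention weights.

    appearance, motion: (C, H, W) feature maps, standing in for the
        outputs of the 2D-CNN and 3D-CNN backbones.
    w: (2, C) scoring weights (hypothetical learnable parameters) that
        score each stream from its globally pooled descriptor.
    Returns a (C, H, W) attention-weighted combination of the streams.
    """
    streams = np.stack([appearance, motion])     # (2, C, H, W)
    pooled = streams.mean(axis=(2, 3))           # (2, C) global average pool
    scores = (w * pooled).sum(axis=1)            # (2,) one score per stream
    alpha = softmax(scores)                      # attention weights, sum to 1
    return np.tensordot(alpha, streams, axes=1)  # weighted sum over streams
```

With zero scoring weights both streams receive equal attention (alpha = [0.5, 0.5]), so the output is the plain average of the two feature maps; training would push `alpha` toward the more informative stream per frame.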
Related papers
- SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparse spatio-temporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as DanceTrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
- AttTrack: Online Deep Attention Transfer for Multi-object Tracking [4.5116674432168615]
Multi-object tracking (MOT) is a vital component of intelligent video analytics applications such as surveillance and autonomous driving.
In this paper, we aim to accelerate MOT by transferring the knowledge from high-level features of a complex network (teacher) to a lightweight network (student) at both training and inference times.
The proposed AttTrack framework has three key components: 1) cross-model feature learning to align intermediate representations from the teacher and student models, 2) interleaved execution of the two models at inference time, and 3) incorporation of the updated predictions from the teacher model as prior knowledge to assist the student model.
arXiv Detail & Related papers (2022-10-16T22:15:31Z)
- Minkowski Tracker: A Sparse Spatio-Temporal R-CNN for Joint Object Detection and Tracking [53.64390261936975]
We present Minkowski Tracker, a sparse spatio-temporal R-CNN that jointly solves the object detection and tracking problems.
Inspired by region-based CNN (R-CNN), we propose to track motion as a second stage of the object detector R-CNN.
We show in large-scale experiments that the overall performance gain of our method is due to four factors.
arXiv Detail & Related papers (2022-08-22T04:47:40Z)
- 3D-FCT: Simultaneous 3D Object Detection and Tracking Using Feature Correlation [0.0]
3D-FCT is a Siamese network architecture that utilizes temporal information to simultaneously perform the related tasks of 3D object detection and tracking.
Our proposed method is evaluated on the KITTI tracking dataset where it is shown to provide an improvement of 5.57% mAP over a state-of-the-art approach.
arXiv Detail & Related papers (2021-10-06T06:36:29Z)
- Track to Detect and Segment: An Online Multi-Object Tracker [81.15608245513208]
TraDeS is an online joint detection and tracking model, exploiting tracking clues to assist detection end-to-end.
TraDeS infers object tracking offsets from a cost volume, which are used to propagate previous object features.
arXiv Detail & Related papers (2021-03-16T02:34:06Z)
- DS-Net: Dynamic Spatiotemporal Network for Video Salient Object Detection [78.04869214450963]
We propose a novel dynamic spatiotemporal network (DS-Net) for more effective fusion of spatial and temporal information.
We show that the proposed method achieves superior performance compared to state-of-the-art algorithms.
arXiv Detail & Related papers (2020-12-09T06:42:30Z)
- Robust Visual Object Tracking with Two-Stream Residual Convolutional Networks [62.836429958476735]
We propose a Two-Stream Residual Convolutional Network (TS-RCN) for visual tracking.
Our TS-RCN can be integrated with existing deep learning based visual trackers.
To further improve the tracking performance, we adopt the "wider" residual network ResNeXt as the feature extraction backbone.
arXiv Detail & Related papers (2020-05-13T19:05:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.