Is First Person Vision Challenging for Object Tracking?
- URL: http://arxiv.org/abs/2108.13665v1
- Date: Tue, 31 Aug 2021 08:06:01 GMT
- Title: Is First Person Vision Challenging for Object Tracking?
- Authors: Matteo Dunnhofer, Antonino Furnari, Giovanni Maria Farinella,
Christian Micheloni
- Abstract summary: We present the first systematic study of object tracking in First Person Vision (FPV).
Our study extensively analyses the performance of recent visual trackers and baseline FPV trackers across different aspects, also considering a new performance measure.
Our results show that object tracking in FPV is challenging, which suggests that more research efforts should be devoted to this problem.
- Score: 32.64792520537041
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding human-object interactions is fundamental in First Person Vision
(FPV). Tracking algorithms which follow the objects manipulated by the camera
wearer can provide useful cues to effectively model such interactions. Visual
tracking solutions available in the computer vision literature have
significantly improved their performance in recent years for a large variety
of target objects and tracking scenarios. However, despite a few previous
attempts to exploit trackers in FPV applications, a methodical analysis of the
performance of state-of-the-art trackers in this domain is still missing. In
this paper, we fill the gap by presenting the first systematic study of object
tracking in FPV. Our study extensively analyses the performance of recent
visual trackers and baseline FPV trackers with respect to different aspects and
considering a new performance measure. This is achieved through TREK-150, a
novel benchmark dataset composed of 150 densely annotated video sequences. Our
results show that object tracking in FPV is challenging, which suggests that
more research efforts should be devoted to this problem so that tracking could
benefit FPV tasks.
Related papers
- SFTrack: A Robust Scale and Motion Adaptive Algorithm for Tracking Small and Fast Moving Objects [2.9803250365852443]
This paper addresses the problem of multi-object tracking in Unmanned Aerial Vehicle (UAV) footage.
It plays a critical role in various UAV applications, including traffic monitoring systems and real-time suspect tracking by the police.
We propose a new tracking strategy, which initiates the tracking of target objects from low-confidence detections.
arXiv Detail & Related papers (2024-10-26T05:09:20Z)
- Tracking Reflected Objects: A Benchmark [12.770787846444406]
We introduce TRO, a benchmark specifically for Tracking Reflected Objects.
TRO includes 200 sequences with around 70,000 frames, each carefully annotated with bounding boxes.
To provide a stronger baseline, we propose a new tracker, HiP-HaTrack, which uses hierarchical features to improve performance.
arXiv Detail & Related papers (2024-07-07T02:22:45Z)
- Tracking with Human-Intent Reasoning [64.69229729784008]
This work proposes a new tracking task -- Instruction Tracking.
It involves providing implicit tracking instructions that require trackers to identify the target and follow it automatically across video frames.
TrackGPT is capable of performing complex reasoning-based tracking.
arXiv Detail & Related papers (2023-12-29T03:22:18Z)
- BEVTrack: A Simple and Strong Baseline for 3D Single Object Tracking in Bird's-Eye View [56.77287041917277]
3D Single Object Tracking (SOT) is a fundamental task of computer vision, proving essential for applications like autonomous driving.
In this paper, we propose BEVTrack, a simple yet effective baseline method.
By estimating the target motion in Bird's-Eye View (BEV) to perform tracking, BEVTrack is surprisingly simple in its network design, training objectives, and tracking pipeline, while achieving superior performance.
arXiv Detail & Related papers (2023-09-05T12:42:26Z)
- Tracking Anything in High Quality [63.63653185865726]
HQTrack is a framework for high-quality tracking of anything in videos.
It consists of a video multi-object segmenter (VMOS) and a mask refiner (MR).
arXiv Detail & Related papers (2023-07-26T06:19:46Z)
- Visual Object Tracking in First Person Vision [33.62651949312872]
The study is made possible through the introduction of TREK-150, a novel benchmark dataset composed of 150 densely annotated video sequences.
Our results show that object tracking in FPV poses new challenges to current visual trackers.
arXiv Detail & Related papers (2022-09-27T16:18:47Z)
- AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility [125.77396380698639]
AVisT is a benchmark for visual tracking in diverse scenarios with adverse visibility.
AVisT comprises 120 challenging sequences with 80k annotated frames, spanning 18 diverse scenarios.
We benchmark 17 popular and recent trackers on AVisT with detailed analysis of their tracking performance across attributes.
arXiv Detail & Related papers (2022-08-14T17:49:37Z)
- Is First Person Vision Challenging for Object Tracking? [33.62651949312872]
This paper provides a recap of the first systematic study of object tracking in First Person Vision (FPV).
Our work extensively analyses the performance of recent visual trackers and baseline FPV trackers with respect to different aspects.
The results suggest that more research efforts should be devoted to this problem so that tracking could benefit FPV tasks.
arXiv Detail & Related papers (2020-11-24T18:18:15Z)
- TAO: A Large-Scale Benchmark for Tracking Any Object [95.87310116010185]
The Tracking Any Object (TAO) dataset consists of 2,907 high-resolution videos, captured in diverse environments, which are half a minute long on average.
We ask annotators to label objects that move at any point in the video, and give names to them post factum.
Our vocabulary is both significantly larger and qualitatively different from existing tracking datasets.
arXiv Detail & Related papers (2020-05-20T21:07:28Z)
- Robust Visual Object Tracking with Two-Stream Residual Convolutional Networks [62.836429958476735]
We propose a Two-Stream Residual Convolutional Network (TS-RCN) for visual tracking.
Our TS-RCN can be integrated with existing deep learning based visual trackers.
To further improve the tracking performance, we adopt the "wider" residual network ResNeXt as its feature extraction backbone.
arXiv Detail & Related papers (2020-05-13T19:05:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.