Visual Object Tracking in First Person Vision
- URL: http://arxiv.org/abs/2209.13502v1
- Date: Tue, 27 Sep 2022 16:18:47 GMT
- Title: Visual Object Tracking in First Person Vision
- Authors: Matteo Dunnhofer, Antonino Furnari, Giovanni Maria Farinella, Christian Micheloni
- Abstract summary: The study is made possible through the introduction of TREK-150, a novel benchmark dataset composed of 150 densely annotated video sequences.
Our results show that object tracking in FPV poses new challenges to current visual trackers.
- Score: 33.62651949312872
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The understanding of human-object interactions is fundamental in First Person
Vision (FPV). Visual tracking algorithms which follow the objects manipulated
by the camera wearer can provide useful information to effectively model such
interactions. In recent years, the computer vision community has
significantly improved the performance of tracking algorithms for a large
variety of target objects and scenarios. Despite a few previous attempts to
exploit trackers in the FPV domain, a methodical analysis of the performance of
state-of-the-art trackers is still missing. This research gap raises the
question of whether current solutions can be used "off-the-shelf" or whether more
domain-specific investigations should be carried out. This paper aims to
provide answers to such questions. We present the first systematic
investigation of single object tracking in FPV. Our study extensively analyses
the performance of 42 algorithms including generic object trackers and baseline
FPV-specific trackers. The analysis is carried out by focusing on different
aspects of the FPV setting, introducing new performance measures, and in
relation to FPV-specific tasks. The study is made possible through the
introduction of TREK-150, a novel benchmark dataset composed of 150 densely
annotated video sequences. Our results show that object tracking in FPV poses
new challenges to current visual trackers. We highlight the factors causing
such behavior and point out possible research directions. Despite their
difficulties, we prove that trackers bring benefits to FPV downstream tasks
requiring short-term object tracking. We expect that generic object tracking
will gain popularity in FPV as new and FPV-specific methodologies are
investigated.
Related papers
- SFTrack: A Robust Scale and Motion Adaptive Algorithm for Tracking Small and Fast Moving Objects [2.9803250365852443]
This paper addresses the problem of multi-object tracking in Unmanned Aerial Vehicle (UAV) footage.
It plays a critical role in various UAV applications, including traffic monitoring systems and real-time suspect tracking by the police.
We propose a new tracking strategy, which initiates the tracking of target objects from low-confidence detections.
(2024-10-26T05:09:20Z)
- Tracking Reflected Objects: A Benchmark [12.770787846444406]
We introduce TRO, a benchmark specifically for Tracking Reflected Objects.
TRO includes 200 sequences with around 70,000 frames, each carefully annotated with bounding boxes.
To provide a stronger baseline, we propose a new tracker, HiP-HaTrack, which uses hierarchical features to improve performance.
(2024-07-07T02:22:45Z)
- Tracking with Human-Intent Reasoning [64.69229729784008]
This work proposes a new tracking task -- Instruction Tracking.
It involves providing implicit tracking instructions that require the trackers to perform tracking automatically in video frames.
TrackGPT is capable of performing complex reasoning-based tracking.
(2023-12-29T03:22:18Z)
- BEVTrack: A Simple and Strong Baseline for 3D Single Object Tracking in Bird's-Eye View [56.77287041917277]
3D Single Object Tracking (SOT) is a fundamental task of computer vision, proving essential for applications like autonomous driving.
In this paper, we propose BEVTrack, a simple yet effective baseline method.
By estimating the target motion in Bird's-Eye View (BEV) to perform tracking, BEVTrack demonstrates surprising simplicity from various aspects, i.e., network designs, training objectives, and tracking pipeline, while achieving superior performance.
(2023-09-05T12:42:26Z)
- Tracking Anything in High Quality [63.63653185865726]
HQTrack is a framework for High Quality Tracking anything in videos.
It consists of a video multi-object segmenter (VMOS) and a mask refiner (MR).
(2023-07-26T06:19:46Z)
- AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility [125.77396380698639]
AVisT is a benchmark for visual tracking in diverse scenarios with adverse visibility.
AVisT comprises 120 challenging sequences with 80k annotated frames, spanning 18 diverse scenarios.
We benchmark 17 popular and recent trackers on AVisT with detailed analysis of their tracking performance across attributes.
(2022-08-14T17:49:37Z)
- Is First Person Vision Challenging for Object Tracking? [32.64792520537041]
We present the first systematic study of object tracking in First Person Vision (FPV).
Our study extensively analyses the performance of recent visual trackers and baseline FPV trackers with respect to different aspects and considering a new performance measure.
Our results show that object tracking in FPV is challenging, which suggests that more research efforts should be devoted to this problem.
(2021-08-31T08:06:01Z)
- Detecting Invisible People [58.49425715635312]
We re-purpose tracking benchmarks and propose new metrics for the task of detecting invisible objects.
We demonstrate that current detection and tracking systems perform dramatically worse on this task.
We also build dynamic models that explicitly reason in 3D, making use of observations produced by state-of-the-art monocular depth estimation networks.
(2020-12-15T16:54:45Z)
- Is First Person Vision Challenging for Object Tracking? [33.62651949312872]
This paper provides a recap of the first systematic study of object tracking in First Person Vision (FPV).
Our work extensively analyses the performance of recent and baseline FPV trackers with respect to different aspects.
The results suggest that more research efforts should be devoted to this problem so that tracking could benefit FPV tasks.
(2020-11-24T18:18:15Z)
- In the Eye of the Beholder: Gaze and Actions in First Person Video [30.54510882243602]
We address the task of jointly determining what a person is doing and where they are looking based on the analysis of video captured by a headworn camera.
Our dataset comes with videos, gaze tracking data, hand masks and action annotations.
We propose a novel deep model for joint gaze estimation and action recognition in First Person Vision.
(2020-05-31T22:06:06Z)
- Robust Visual Object Tracking with Two-Stream Residual Convolutional Networks [62.836429958476735]
We propose a Two-Stream Residual Convolutional Network (TS-RCN) for visual tracking.
Our TS-RCN can be integrated with existing deep learning based visual trackers.
To further improve the tracking performance, we adopt a "wider" residual network ResNeXt as its feature extraction backbone.
(2020-05-13T19:05:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.