mvHOTA: A multi-view higher order tracking accuracy metric to measure
spatial and temporal associations in multi-point detection
- URL: http://arxiv.org/abs/2206.09372v1
- Date: Sun, 19 Jun 2022 10:31:53 GMT
- Title: mvHOTA: A multi-view higher order tracking accuracy metric to measure
spatial and temporal associations in multi-point detection
- Authors: Lalith Sharan, Halvar Kelm, Gabriele Romano, Matthias Karck, Raffaele
De Simone, Sandy Engelhardt
- Abstract summary: Multi-object tracking (MOT) is a challenging task that involves detecting objects in the scene and tracking them across a sequence of frames.
The main evaluation metric to benchmark MOT methods on datasets such as KITTI has recently become the higher order tracking accuracy (HOTA) metric.
We propose a multi-view higher order tracking metric (mvHOTA) to determine the accuracy of multi-point (multi-instance and multi-class) detection.
- Score: 1.039718070553655
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multi-object tracking (MOT) is a challenging task that involves detecting
objects in the scene and tracking them across a sequence of frames. Evaluating
this task is difficult due to temporal occlusions and varying trajectories
across a sequence of images. The main evaluation metric to benchmark MOT
methods on datasets such as KITTI has recently become the higher order tracking
accuracy (HOTA) metric, which is capable of providing a better description of
the performance over metrics such as MOTA, DetA, and IDF1. Point detection and
tracking is a closely related task, which could be regarded as a special case
of object detection. However, there are differences in evaluating the detection
task itself (point distances vs. bounding box overlap). When including the
temporal dimension and multi-view scenarios, the evaluation task becomes even
more complex. In this work, we propose a multi-view higher order tracking
metric (mvHOTA) to determine the accuracy of multi-point (multi-instance and
multi-class) detection, while taking into account temporal and spatial
associations. mvHOTA can be interpreted as the geometric mean of the detection,
association, and correspondence accuracies, thereby providing equal weighting
to each of the factors. We demonstrate a use-case through a publicly available
endoscopic point detection dataset from a previously organised medical
challenge. Furthermore, we compare with other adjusted MOT metrics for this
use-case, discuss the properties of mvHOTA, and show how the proposed
correspondence accuracy and the Occlusion index facilitate analysis of methods
with respect to handling of occlusions. The code will be made publicly
available.
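The abstract notes that point detection is evaluated with point distances rather than bounding-box overlap. As an illustration only (not code from the paper), here is a minimal sketch of distance-based point matching, assuming Hungarian assignment via SciPy and a hypothetical pixel threshold `tau`:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_points(gt_points, pred_points, tau=10.0):
    """Match ground-truth and predicted 2D points by Euclidean distance.

    Unlike box-based MOT evaluation, which thresholds bounding-box IoU,
    point detections are matched when their distance is within a
    threshold `tau` (hypothetical value, in pixels).
    Returns the number of true-positive matches.
    """
    if len(gt_points) == 0 or len(pred_points) == 0:
        return 0
    gt = np.asarray(gt_points, dtype=float)
    pred = np.asarray(pred_points, dtype=float)
    # Pairwise Euclidean distances, shape (num_gt, num_pred).
    dists = np.linalg.norm(gt[:, None, :] - pred[None, :, :], axis=-1)
    # Hungarian assignment minimises the total matching distance.
    rows, cols = linear_sum_assignment(dists)
    # Only assignments within the threshold count as true positives.
    return int(np.sum(dists[rows, cols] <= tau))
```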
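The abstract further states that mvHOTA is the geometric mean of the detection, association, and correspondence accuracies. A sketch of that combination, taking only this statement from the abstract (the function and argument names are hypothetical, and the paper's per-threshold averaging is omitted):

```python
def mvhota_score(det_acc, ass_acc, corr_acc):
    """Combine detection, association, and correspondence accuracies
    with a geometric mean, so each factor is weighted equally."""
    return (det_acc * ass_acc * corr_acc) ** (1.0 / 3.0)

# Example: a method strong at detection but weaker at association.
print(mvhota_score(0.8, 0.7, 0.9))  # ~0.80
```

Because the three factors multiply, a weakness in any one of them (e.g., poor occlusion handling lowering correspondence accuracy) pulls the overall score down; no single strong factor can compensate.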
Related papers
- STCMOT: Spatio-Temporal Cohesion Learning for UAV-Based Multiple Object Tracking [13.269416985959404]
Multiple object tracking (MOT) in Unmanned Aerial Vehicle (UAV) videos is important for diverse applications in computer vision.
We propose a novel Spatio-Temporal Cohesion Multiple Object Tracking framework (STCMOT).
We use historical embedding features to model the representation of ReID and detection features in a sequential order.
Our framework sets new state-of-the-art performance on the MOTA and IDF1 metrics.
arXiv Detail & Related papers (2024-09-17T14:34:18Z)
- SparseTrack: Multi-Object Tracking by Performing Scene Decomposition based on Pseudo-Depth [84.64121608109087]
First, we propose a pseudo-depth estimation method for obtaining the relative depth of targets from 2D images.
Second, we design a depth cascading matching (DCM) algorithm, which can use the obtained depth information to convert a dense target set into multiple sparse target subsets.
By integrating the pseudo-depth method and the DCM strategy into the data association process, we propose a new tracker, called SparseTrack.
arXiv Detail & Related papers (2023-06-08T14:36:10Z)
- Joint Counting, Detection and Re-Identification for Multi-Object Tracking [8.89262850257871]
In crowded scenes, joint detection and tracking usually fail to find accurate object associations due to missed or false detections.
We jointly model counting, detection and re-identification in an end-to-end framework, named CountingMOT, tailored for crowded scenes.
The proposed MOT tracker can perform online and real-time tracking, and achieves state-of-the-art results on the public benchmarks MOT16 (MOTA of 79.7%), MOT17 (MOTA of 81.3%), and MOT20 (MOTA of 78.9%).
arXiv Detail & Related papers (2022-12-12T12:53:58Z)
- End-to-end Tracking with a Multi-query Transformer [96.13468602635082]
Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time.
Our aim in this paper is to move beyond tracking-by-detection approaches to class-agnostic tracking that also performs well for unknown object classes.
arXiv Detail & Related papers (2022-10-26T10:19:37Z)
- Tracking Every Thing in the Wild [61.917043381836656]
We introduce a new metric, Track Every Thing Accuracy (TETA), breaking tracking measurement into three sub-factors: localization, association, and classification.
Our experiments show that TETA evaluates trackers more comprehensively, and TETer achieves significant improvements on the challenging large-scale datasets BDD100K and TAO.
arXiv Detail & Related papers (2022-07-26T15:37:19Z)
- Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline [80.13652104204691]
In this paper, we construct a large-scale benchmark with high diversity for visible-thermal UAV tracking (VTUAV).
We provide a coarse-to-fine attribute annotation, where frame-level attributes are included to exploit the potential of challenge-specific trackers.
In addition, we design a new RGB-T baseline, named Hierarchical Multi-modal Fusion Tracker (HMFT), which fuses RGB-T data in various levels.
arXiv Detail & Related papers (2022-04-08T15:22:33Z)
- STURE: Spatial-Temporal Mutual Representation Learning for Robust Data Association in Online Multi-Object Tracking [7.562844934117318]
The proposed approach is capable of extracting more discriminative detection and sequence representations.
It is applied to the public MOT challenge benchmarks and performs well compared with various state-of-the-art online MOT trackers.
arXiv Detail & Related papers (2022-01-18T08:52:40Z)
- Multi-Object Tracking and Segmentation with a Space-Time Memory Network [12.043574473965318]
We propose a method for multi-object tracking and segmentation based on a novel memory-based mechanism to associate tracklets.
The proposed tracker, MeNToS, addresses particularly the long-term data association problem.
arXiv Detail & Related papers (2021-10-21T17:13:17Z)
- HOTA: A Higher Order Metric for Evaluating Multi-Object Tracking [48.497889944886516]
Multi-Object Tracking (MOT) has been notoriously difficult to evaluate.
Previous metrics overemphasize the importance of either detection or association.
We present a novel MOT evaluation metric, HOTA, which balances the effect of performing accurate detection, association, and localization (a short sketch of this balancing follows after the list).
arXiv Detail & Related papers (2020-09-16T15:11:30Z)
- Tracking-by-Counting: Using Network Flows on Crowd Density Maps for Tracking Multiple Targets [96.98888948518815]
State-of-the-art multi-object tracking (MOT) methods follow the tracking-by-detection paradigm.
We propose a new MOT paradigm, tracking-by-counting, tailored for crowded scenes.
arXiv Detail & Related papers (2020-07-18T19:51:53Z)
- End-to-End Multi-Object Tracking with Global Response Map [23.755882375664875]
We present a completely end-to-end approach that takes an image sequence or video as input and directly outputs the located and tracked objects of learned types.
Specifically, with our introduced multi-object representation strategy, a global response map can be accurately generated over frames.
Experimental results on the MOT16 and MOT17 benchmarks show that our proposed online tracker achieves state-of-the-art performance on several tracking metrics.
arXiv Detail & Related papers (2020-07-13T12:30:49Z)
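As referenced in the HOTA entry above, the HOTA paper combines detection accuracy (DetA) and association accuracy (AssA) through a geometric mean at each localization threshold and then averages over thresholds. A compact sketch of that published formula (variable names are mine, not the paper's):

```python
import numpy as np

def hota_score(det_acc_per_alpha, ass_acc_per_alpha):
    """HOTA at each localization threshold alpha is the geometric mean
    of DetA and AssA; the final score averages over the alpha range
    (the paper uses alpha = 0.05, 0.10, ..., 0.95)."""
    det = np.asarray(det_acc_per_alpha, dtype=float)
    ass = np.asarray(ass_acc_per_alpha, dtype=float)
    return float(np.sqrt(det * ass).mean())
```

mvHOTA extends this geometric-mean construction with a third factor, the correspondence accuracy, to cover the multi-view setting.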