MVTD: A Benchmark Dataset for Maritime Visual Object Tracking
- URL: http://arxiv.org/abs/2506.02866v1
- Date: Tue, 03 Jun 2025 13:30:11 GMT
- Title: MVTD: A Benchmark Dataset for Maritime Visual Object Tracking
- Authors: Ahsan Baidar Bakht, Muhayy Ud Din, Sajid Javed, Irfan Hussain
- Abstract summary: The Maritime Visual Tracking Dataset (MVTD) comprises 182 high-resolution video sequences, totaling approximately 150,000 frames. MVTD captures a diverse range of operational conditions and maritime scenarios, reflecting the real-world complexities of maritime environments. We evaluated 14 recent SOTA tracking algorithms on the MVTD benchmark and observed substantial performance degradation compared to their performance on general-purpose datasets.
- Score: 4.956066467858057
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual Object Tracking (VOT) is a fundamental task with widespread applications in autonomous navigation, surveillance, and maritime robotics. Despite significant advances in generic object tracking, maritime environments continue to present unique challenges, including specular water reflections, low-contrast targets, dynamically changing backgrounds, and frequent occlusions. These complexities significantly degrade the performance of state-of-the-art tracking algorithms, highlighting the need for domain-specific datasets. To address this gap, we introduce the Maritime Visual Tracking Dataset (MVTD), a comprehensive and publicly available benchmark specifically designed for maritime VOT. MVTD comprises 182 high-resolution video sequences, totaling approximately 150,000 frames, and includes four representative object classes: boat, ship, sailboat, and unmanned surface vehicle (USV). The dataset captures a diverse range of operational conditions and maritime scenarios, reflecting the real-world complexities of maritime environments. We evaluated 14 recent SOTA tracking algorithms on the MVTD benchmark and observed substantial performance degradation compared to their performance on general-purpose datasets. However, when fine-tuned on MVTD, these models demonstrate significant performance gains, underscoring the effectiveness of domain adaptation and the importance of transfer learning in specialized tracking contexts. The MVTD dataset fills a critical gap in the visual tracking community by providing a realistic and challenging benchmark for maritime scenarios. Dataset and Source Code can be accessed here "https://github.com/AhsanBaidar/MVTD".
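The performance degradation and fine-tuning gains described in the abstract are conventionally measured with the success-plot AUC used by one-pass tracking evaluations. As a minimal illustrative sketch (not the paper's actual evaluation code), assuming axis-aligned (x, y, w, h) boxes and the 21 uniform IoU thresholds common to OTB-style benchmarks:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def success_auc(pred_boxes, gt_boxes, steps=21):
    """Area under the success curve: mean fraction of frames whose
    IoU exceeds each of `steps` uniform thresholds in [0, 1]."""
    overlaps = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    thresholds = [i / (steps - 1) for i in range(steps)]
    success = [sum(o > t for o in overlaps) / len(overlaps)
               for t in thresholds]
    return sum(success) / steps
```

A perfect tracker scores just under 1.0 (the IoU of 1.0 does not strictly exceed the final threshold), while a tracker that never overlaps the ground truth scores 0.0.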
Related papers
- Benchmarking Vision-Based Object Tracking for USVs in Complex Maritime Environments [0.8796261172196743]
Vision-based target tracking is crucial for unmanned surface vehicles. Real-time tracking in maritime environments is challenging due to dynamic camera movement, low visibility, and scale variation. This study proposes a vision-guided object-tracking framework for USVs.
arXiv Detail & Related papers (2024-12-10T10:35:17Z)
- MID: A Comprehensive Shore-Based Dataset for Multi-Scale Dense Ship Occlusion and Interaction Scenarios [10.748210940033484]
The Maritime Ship Navigation Behavior dataset (MID) is designed to address challenges in ship detection within complex maritime environments. MID contains 5,673 images with 135,884 finely annotated target instances, supporting both supervised and semi-supervised learning. MID's images are sourced from high-definition video clips of real-world navigation across 43 water areas, with varied weather and lighting conditions.
arXiv Detail & Related papers (2024-12-08T09:34:23Z)
- Underwater Camouflaged Object Tracking Meets Vision-Language SAM2 [60.47622353256502]
We propose the first large-scale multi-modal underwater camouflaged object tracking dataset, namely UW-COT220. Based on the proposed dataset, this work first evaluates current advanced visual object tracking methods, including SAM- and SAM2-based trackers, in challenging underwater environments. Our findings highlight the improvements of SAM2 over SAM, demonstrating its enhanced ability to handle the complexities of underwater camouflaged objects.
arXiv Detail & Related papers (2024-09-25T13:10:03Z)
- Introducing VaDA: Novel Image Segmentation Model for Maritime Object Segmentation Using New Dataset [3.468621550644668]
The maritime shipping industry is undergoing rapid evolution driven by advances in computer vision and artificial intelligence (AI).
However, object recognition in maritime environments faces challenges such as light reflection, interference, intense lighting, and various weather conditions.
Existing AI recognition models and datasets have limited suitability for composing autonomous navigation systems.
arXiv Detail & Related papers (2024-07-12T05:48:53Z)
- Amirkabir campus dataset: Real-world challenges and scenarios of Visual Inertial Odometry (VIO) for visually impaired people [3.7998592843098336]
We introduce the Amirkabir campus dataset (AUT-VI) to address the mentioned problem and improve the navigation systems.
AUT-VI is a novel and super-challenging dataset with 126 diverse sequences in 17 different locations.
In support of ongoing development efforts, we have released the Android application for data capture to the public.
arXiv Detail & Related papers (2024-01-07T23:13:51Z)
- Improving Underwater Visual Tracking With a Large Scale Dataset and Image Enhancement [70.2429155741593]
This paper presents a new dataset and a general tracker enhancement method for Underwater Visual Object Tracking (UVOT).
It poses distinct challenges; the underwater environment exhibits non-uniform lighting conditions, low visibility, lack of sharpness, low contrast, camouflage, and reflections from suspended particles.
We propose a novel underwater image enhancement algorithm designed specifically to boost tracking quality.
The method yields a significant performance improvement of up to 5.0% AUC for state-of-the-art (SOTA) visual trackers.
arXiv Detail & Related papers (2023-08-30T07:41:26Z)
- Vision-Based Autonomous Navigation for Unmanned Surface Vessel in Extreme Marine Conditions [2.8983738640808645]
This paper presents an autonomous vision-based navigation framework for tracking target objects in extreme marine conditions.
The proposed framework has been thoroughly tested in simulation under extremely reduced visibility due to sandstorms and fog.
The results are compared with state-of-the-art de-hazing methods across the benchmarked MBZIRC simulation dataset.
arXiv Detail & Related papers (2023-08-08T14:25:13Z)
- End-to-end Tracking with a Multi-query Transformer [96.13468602635082]
Multiple-object tracking (MOT) is a challenging task that requires simultaneous reasoning about location, appearance, and identity of the objects in the scene over time.
Our aim in this paper is to move beyond tracking-by-detection approaches, to class-agnostic tracking that performs well also for unknown object classes.
arXiv Detail & Related papers (2022-10-26T10:19:37Z)
- AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility [125.77396380698639]
AVisT is a benchmark for visual tracking in diverse scenarios with adverse visibility.
AVisT comprises 120 challenging sequences with 80k annotated frames, spanning 18 diverse scenarios.
We benchmark 17 popular and recent trackers on AVisT with detailed analysis of their tracking performance across attributes.
arXiv Detail & Related papers (2022-08-14T17:49:37Z)
- Visible-Thermal UAV Tracking: A Large-Scale Benchmark and New Baseline [80.13652104204691]
In this paper, we construct a large-scale, highly diverse benchmark for visible-thermal UAV tracking (VTUAV).
We provide a coarse-to-fine attribute annotation, where frame-level attributes are provided to exploit the potential of challenge-specific trackers.
In addition, we design a new RGB-T baseline, named Hierarchical Multi-modal Fusion Tracker (HMFT), which fuses RGB-T data in various levels.
arXiv Detail & Related papers (2022-04-08T15:22:33Z)
- SoDA: Multi-Object Tracking with Soft Data Association [75.39833486073597]
Multi-object tracking (MOT) is a prerequisite for a safe deployment of self-driving cars.
We propose a novel approach to MOT that uses attention to compute track embeddings that encode dependencies between observed objects.
arXiv Detail & Related papers (2020-08-18T03:40:25Z)
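The attention-style soft data association described in the SoDA entry can be illustrated with a minimal sketch. This is a hypothetical simplification, assuming plain dot-product similarity between fixed track and detection embeddings with a softmax normalization; the actual method uses learned attention layers:

```python
import math

def soft_association(track_embs, det_embs, temperature=1.0):
    """For each track embedding, return softmax-normalized assignment
    weights over all detection embeddings (dot-product similarity)."""
    weights = []
    for t in track_embs:
        scores = [sum(ti * di for ti, di in zip(t, d)) / temperature
                  for d in det_embs]
        m = max(scores)                       # subtract max for stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights.append([e / z for e in exps])
    return weights
```

Each row of the result sums to 1, and the detection most similar to a track receives the largest weight; lowering `temperature` sharpens the assignment toward a hard one.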
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.