DMSORT: An efficient parallel maritime multi-object tracking architecture for unmanned vessel platforms
- URL: http://arxiv.org/abs/2511.04128v1
- Date: Thu, 06 Nov 2025 07:20:36 GMT
- Title: DMSORT: An efficient parallel maritime multi-object tracking architecture for unmanned vessel platforms
- Authors: Shengyu Tang, Zeyuan Lu, Jiazhi Dong, Changdong Yu, Xiaoyu Wang, Yaohui Lyu, Weihao Xia,
- Abstract summary: Multi-object tracking (MOT) is essential for ensuring safe vessel navigation and effective maritime surveillance.<n>We propose an efficient Dual-branch Maritime SORT (DMSORT) method for maritime MOT.<n>DMSORT attains the fastest runtime among existing ReID-based MOT frameworks.
- Score: 4.093759316439195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate perception of the marine environment through robust multi-object tracking (MOT) is essential for ensuring safe vessel navigation and effective maritime surveillance. However, the complicated maritime environment often causes camera motion and subsequent visual degradation, posing significant challenges to MOT. To address this challenge, we propose an efficient Dual-branch Maritime SORT (DMSORT) method for maritime MOT. The core of the framework is a parallel tracker with affine compensation, which incorporates an object detection and re-identification (ReID) branch, along with a dedicated branch for dynamic camera motion estimation. Specifically, a Reversible Columnar Detection Network (RCDN) is integrated into the detection module to leverage multi-level visual features for robust object detection. Furthermore, a lightweight Transformer-based appearance extractor (Li-TAE) is designed to capture global contextual information and generate robust appearance features. Another branch decouples platform-induced and target-intrinsic motion by constructing a projective transformation, applying platform-motion compensation within the Kalman filter, and thereby stabilizing true object trajectories. Finally, a clustering-optimized feature fusion module effectively combines motion and appearance cues to ensure identity consistency under noise, occlusion, and drift. Extensive evaluations on the Singapore Maritime Dataset demonstrate that DMSORT achieves state-of-the-art performance. Notably, DMSORT attains the fastest runtime among existing ReID-based MOT frameworks while maintaining high identity consistency and robustness to jitter and occlusion. Code is available at: https://github.com/BiscuitsLzy/DMSORT-An-efficient-parallel-maritime-multi-object-tracking-architect ure-.
Related papers
- IoUCert: Robustness Verification for Anchor-based Object Detectors [58.35703549470485]
We introduce IoUCert, a novel formal verification framework designed specifically to overcome these bottlenecks in anchor-based object detection architectures.<n>We show that our method enables the robustness verification of realistic, anchor-based models including SSD, YOLOv2, and YOLOv3 variants against various input perturbations.
arXiv Detail & Related papers (2026-03-03T14:36:46Z) - HAD: Hierarchical Asymmetric Distillation to Bridge Spatio-Temporal Gaps in Event-Based Object Tracking [80.07224739976911]
Event cameras offer exceptional temporal resolution and a range (modal)<n> RGB cameras excel at capturing rich texture with high resolution, whereas event cameras offer exceptional temporal resolution and a range (modal)
arXiv Detail & Related papers (2025-10-22T13:15:13Z) - Neptune-X: Active X-to-Maritime Generation for Universal Maritime Object Detection [54.1960918379255]
Neptune-X is a data-centric generative-selection framework for maritime object detection.<n>X-to-Maritime is a multi-modality-conditioned generative model that synthesizes diverse and realistic maritime scenes.<n>Our approach sets a new benchmark in maritime scene synthesis, significantly improving detection accuracy.
arXiv Detail & Related papers (2025-09-25T04:59:02Z) - Tracking the Unstable: Appearance-Guided Motion Modeling for Robust Multi-Object Tracking in UAV-Captured Videos [58.156141601478794]
Multi-object tracking (UAVT) aims to track multiple objects while maintaining consistent identities across frames of a given video.<n>Existing methods typically model motion cues and appearance separately, overlooking their interplay and resulting in suboptimal tracking performance.<n>We propose AMOT, which exploits appearance and motion cues through two key components: an Appearance-Motion Consistency (AMC) matrix and a Motion-aware Track Continuation (MTC) module.
arXiv Detail & Related papers (2025-08-03T12:06:47Z) - YOLOv8-SMOT: An Efficient and Robust Framework for Real-Time Small Object Tracking via Slice-Assisted Training and Adaptive Association [5.07987775511372]
This paper details our championship-winning solution in the MVA 2025 "Finding Birds" Small Multi-Object Tracking Challenge (SMOT4SB)<n>It adopts the tracking-by-detection paradigm with targeted innovations at both the detection and association levels.<n>Our method achieves state-of-the-art performance on the SMOT4SB public test set, reaching an SO-HOTA score of textbf55.205.
arXiv Detail & Related papers (2025-07-16T09:51:19Z) - VRS-UIE: Value-Driven Reordering Scanning for Underwater Image Enhancement [104.78586859995333]
State Space Models (SSMs) have emerged as a promising backbone for vision tasks due to their linear complexity and global receptive field.<n>The predominance of large-portion, homogeneous but useless oceanic backgrounds can dilute the feature representation responses of sparse yet valuable targets.<n>We propose a novel Value-Driven Reordering Scanning framework for Underwater Image Enhancement (UIE)<n>Our framework sets a new state-of-the-art, delivering superior enhancement performance (surpassing WMamba by 0.89 dB on average) by effectively suppressing water bias and preserving structural and color fidelity.
arXiv Detail & Related papers (2025-05-02T12:21:44Z) - WS-DETR: Robust Water Surface Object Detection through Vision-Radar Fusion with Detection Transformer [4.768265044725289]
Water surface object detection faces challenges from blurred edges and diverse object scales.<n>Existing approaches suffer from cross-modal feature conflicts, which negatively affect model robustness.<n>We propose a robust vision-radar fusion model WS-DETR, which achieves state-of-the-art (SOTA) performance.
arXiv Detail & Related papers (2025-04-10T04:16:46Z) - Benchmarking Vision-Based Object Tracking for USVs in Complex Maritime Environments [0.8796261172196743]
Vision-based target tracking is crucial for unmanned surface vehicles.<n>Real-time tracking in maritime environments is challenging due to dynamic camera movement, low visibility, and scale variation.<n>This study proposes a vision-guided object-tracking framework for USVs.
arXiv Detail & Related papers (2024-12-10T10:35:17Z) - STCMOT: Spatio-Temporal Cohesion Learning for UAV-Based Multiple Object Tracking [13.269416985959404]
Multiple object tracking (MOT) in Unmanned Aerial Vehicle (UAV) videos is important for diverse applications in computer vision.
We propose a novel Spatio-Temporal Cohesion Multiple Object Tracking framework (STCMOT)
We use historical embedding features to model the representation of ReID and detection features in a sequential order.
Our framework sets a new state-of-the-art performance in MOTA and IDF1 metrics.
arXiv Detail & Related papers (2024-09-17T14:34:18Z) - Ego-Motion Aware Target Prediction Module for Robust Multi-Object Tracking [2.7898966850590625]
We introduce a novel KF-based prediction module called Ego-motion Aware Target Prediction (EMAP)
Our proposed method decouples the impact of camera rotational and translational velocity from the object trajectories by reformulating the Kalman Filter.
EMAP remarkably drops the number of identity switches (IDSW) of OC-SORT and Deep OC-SORT by 73% and 21%, respectively.
arXiv Detail & Related papers (2024-04-03T23:24:25Z) - SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparsetemporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z) - Joint Spatial-Temporal and Appearance Modeling with Transformer for
Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects.
The proposed method is evaluated on multiple public benchmarks including MOT16, MOT17, and MOT20, and it achieves a clear performance improvement in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.