UMDATrack: Unified Multi-Domain Adaptive Tracking Under Adverse Weather Conditions
- URL: http://arxiv.org/abs/2507.00648v1
- Date: Tue, 01 Jul 2025 10:43:57 GMT
- Title: UMDATrack: Unified Multi-Domain Adaptive Tracking Under Adverse Weather Conditions
- Authors: Siyuan Yao, Rui Zhu, Ziqi Wang, Wenqi Ren, Yanyang Yan, Xiaochun Cao
- Abstract summary: We propose UMDATrack, which maintains high-quality target state prediction under various adverse weather conditions. Our code is available at https://github.com/Z-Z188/UMDATrack.
- Score: 73.71632291123008
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual object tracking has made promising progress over the past decades. Most existing approaches focus on learning target representations from well-conditioned daytime data, whereas in unconstrained real-world scenarios with adverse weather, e.g., nighttime or foggy environments, the large domain shift leads to significant performance degradation. In this paper, we propose UMDATrack, which maintains high-quality target state prediction under various adverse weather conditions within a unified domain adaptation framework. Specifically, we first use a controllable scenario generator to synthesize a small amount of unlabeled video (less than 2% of the frames in the source daytime datasets) in multiple weather conditions under the guidance of different text prompts. We then design a simple yet effective domain-customized adapter (DCA), allowing the target objects' representation to rapidly adapt to various weather conditions without redundant model updating. Furthermore, to enhance the localization consistency between the source and target domains, we propose a target-aware confidence alignment module (TCA) based on optimal transport theory. Extensive experiments demonstrate that UMDATrack surpasses existing advanced visual trackers and sets a new state of the art by a significant margin. Our code is available at https://github.com/Z-Z188/UMDATrack.
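The abstract names two components: a domain-customized adapter (DCA) that adapts frozen backbone features to each weather domain, and a target-aware confidence alignment (TCA) loss built on optimal transport. The PyTorch sketch below is a minimal illustration of these two ideas under stated assumptions, not the released UMDATrack implementation: the bottleneck adapter shape, the grid-position ground cost, and the Sinkhorn-based alignment loss are hypothetical stand-ins.

```python
# Illustrative sketch only -- not the released UMDATrack code.
# Assumes a ViT-style tracker backbone producing token features and a
# flattened confidence map; module names and shapes are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DomainCustomizedAdapter(nn.Module):
    """Lightweight bottleneck adapter, one per weather domain.

    Only the adapter is trained when adapting to a new condition
    (night, fog, ...); the backbone stays frozen.
    """
    def __init__(self, dim: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)  # start as identity (zero residual)
        nn.init.zeros_(self.up.bias)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, dim) backbone features; residual adaptation
        return tokens + self.up(F.gelu(self.down(tokens)))


def sinkhorn_confidence_alignment(src_conf: torch.Tensor,
                                  tgt_conf: torch.Tensor,
                                  eps: float = 0.05,
                                  iters: int = 50) -> torch.Tensor:
    """Entropic-OT alignment between source and target confidence maps.

    A generic Sinkhorn loss that illustrates "localization consistency via
    optimal transport"; the exact TCA formulation in the paper may differ.
    src_conf: (N,), tgt_conf: (M,) flattened non-negative confidence scores.
    """
    a = src_conf / src_conf.sum()                 # source marginal
    b = tgt_conf / tgt_conf.sum()                 # target marginal
    x = torch.linspace(0, 1, len(a), device=a.device).unsqueeze(1)
    y = torch.linspace(0, 1, len(b), device=b.device).unsqueeze(1)
    C = torch.cdist(x, y, p=2) ** 2               # ground cost over grid positions
    K = torch.exp(-C / eps)
    u = torch.ones_like(a)
    for _ in range(iters):                        # Sinkhorn-Knopp iterations
        v = b / (K.t() @ u + 1e-9)
        u = a / (K @ v + 1e-9)
    T = u.unsqueeze(1) * K * v.unsqueeze(0)       # transport plan
    return (T * C).sum()                          # OT cost used as alignment loss
```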
Related papers
- Multi-Modality Driven LoRA for Adverse Condition Depth Estimation [61.525312117638116]
We propose Multi-Modality Driven LoRA (MMD-LoRA) for Adverse Condition Depth Estimation. It consists of two core components: Prompt Driven Domain Alignment (PDDA) and Visual-Text Consistent Contrastive Learning (VTCCL). It achieves state-of-the-art performance on the nuScenes and Oxford RobotCar datasets.
arXiv Detail & Related papers (2024-12-28T14:23:58Z) - Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking [53.33637391723555]
We propose a unified multimodal spatial-temporal tracking approach named STTrack. In contrast to previous paradigms, we introduce a temporal state generator (TSG) that continuously generates a sequence of tokens containing multimodal temporal information. These temporal information tokens are used to guide the localization of the target in the next time state, establish long-range contextual relationships between video frames, and capture the temporal trajectory of the target.
arXiv Detail & Related papers (2024-12-20T09:10:17Z) - STCMOT: Spatio-Temporal Cohesion Learning for UAV-Based Multiple Object Tracking [13.269416985959404]
Multiple object tracking (MOT) in Unmanned Aerial Vehicle (UAV) videos is important for diverse applications in computer vision.
We propose a novel Spatio-Temporal Cohesion Multiple Object Tracking framework (STCMOT).
We use historical embedding features to model the representation of ReID and detection features in a sequential order.
Our framework sets a new state-of-the-art performance in MOTA and IDF1 metrics.
arXiv Detail & Related papers (2024-09-17T14:34:18Z) - Cooperative Students: Navigating Unsupervised Domain Adaptation in Nighttime Object Detection [1.6624384368855527]
Unsupervised Domain Adaptation (UDA) has shown significant advancements in object detection under well-lit conditions.
UDA's performance degrades notably in low-visibility scenarios, especially at night.
To address this problem, we propose a Cooperative Students (CoS) framework.
arXiv Detail & Related papers (2024-04-02T14:26:18Z) - ControlUDA: Controllable Diffusion-assisted Unsupervised Domain Adaptation for Cross-Weather Semantic Segmentation [14.407346832155042]
ControlUDA is a diffusion-assisted framework tailored for UDA segmentation under adverse weather conditions.
It first leverages a target prior from a pre-trained segmentor to tune the diffusion model (DM), compensating for the missing target-domain labels.
It also contains UDAControlNet, a condition-fused multi-scale and prompt-enhanced network targeted at high-fidelity data generation in adverse weather.
arXiv Detail & Related papers (2024-02-09T14:48:20Z) - SpikeMOT: Event-based Multi-Object Tracking with Sparse Motion Features [52.213656737672935]
SpikeMOT is an event-based multi-object tracker.
SpikeMOT uses spiking neural networks to extract sparse spatio-temporal features from event streams associated with objects.
arXiv Detail & Related papers (2023-09-29T05:13:43Z) - Towards Real-World Visual Tracking with Temporal Contexts [64.7981374129495]
We propose a two-level framework (TCTrack) that can exploit temporal contexts efficiently.
Based on it, we propose a stronger version for real-world visual tracking, i.e., TCTrack++.
For feature extraction, we propose an attention-based temporally adaptive convolution to enhance the spatial features.
For similarity map refinement, we introduce an adaptive temporal transformer to encode the temporal knowledge efficiently.
arXiv Detail & Related papers (2023-08-20T17:59:40Z) - Domain Adaptive Object Detection for Autonomous Driving under Foggy Weather [25.964194141706923]
This paper proposes a novel domain adaptive object detection framework for autonomous driving under foggy weather.
Our method leverages both image-level and object-level adaptation to diminish the domain discrepancy in image style and object appearance.
Experimental results on public benchmarks show the effectiveness and accuracy of the proposed method.
arXiv Detail & Related papers (2022-10-27T05:09:10Z) - AVisT: A Benchmark for Visual Object Tracking in Adverse Visibility [125.77396380698639]
AVisT is a benchmark for visual tracking in diverse scenarios with adverse visibility.
AVisT comprises 120 challenging sequences with 80k annotated frames, spanning 18 diverse scenarios.
We benchmark 17 popular and recent trackers on AVisT with detailed analysis of their tracking performance across attributes.
arXiv Detail & Related papers (2022-08-14T17:49:37Z) - An Unsupervised Domain Adaptive Approach for Multimodal 2D Object Detection in Adverse Weather Conditions [5.217255784808035]
We propose an unsupervised domain adaptation framework to bridge the domain gap between source and target domains.
We use a data augmentation scheme that simulates weather distortions to add domain confusion and prevent overfitting on the source data.
Experiments performed on the DENSE dataset show that our method can substantially alleviate the domain gap.
arXiv Detail & Related papers (2022-03-07T18:10:40Z)
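The last entry above mentions a data augmentation scheme that simulates weather distortions on labeled source images. As a generic illustration only (the cited paper's exact augmentations are not reproduced here), a fog-like distortion can be synthesized with the standard atmospheric scattering model, using a crude hypothetical depth prior in place of real depth maps:

```python
# Generic fog-style augmentation sketch (not taken from the cited paper):
# blends the image toward an airlight value via a synthetic transmission map,
# following the atmospheric scattering model I = J*t + A*(1 - t).
import numpy as np


def add_synthetic_fog(image: np.ndarray, beta: float = 1.5,
                      airlight: float = 0.9) -> np.ndarray:
    """image: float32 RGB in [0, 1], shape (H, W, 3)."""
    rng = np.random.default_rng()
    h, w, _ = image.shape
    # Fake "depth" that increases toward the top of the frame plus mild noise,
    # a common stand-in when real depth maps are unavailable.
    depth = np.linspace(1.0, 0.2, h)[:, None] * np.ones((h, w), dtype=np.float32)
    depth += 0.05 * rng.standard_normal((h, w))
    t = np.exp(-beta * depth)[..., None]          # transmission map, (H, W, 1)
    fogged = image * t + airlight * (1.0 - t)     # scattering model
    return np.clip(fogged, 0.0, 1.0)
```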