Related papers: MaskAdapt: Unsupervised Geometry-Aware Domain Adaptation Using Multimodal Contextual Learning and RGB-Depth Masking

MaskAdapt: Unsupervised Geometry-Aware Domain Adaptation Using Multimodal Contextual Learning and RGB-Depth Masking

URL: http://arxiv.org/abs/2505.24026v1
Date: Thu, 29 May 2025 21:59:39 GMT
Title: MaskAdapt: Unsupervised Geometry-Aware Domain Adaptation Using Multimodal Contextual Learning and RGB-Depth Masking
Authors: Numair Nadeem, Muhammad Hamza Asad, Saeed Anwar, Abdul Bais,
Abstract summary: MaskAdapt is a novel approach that enhances segmentation accuracy through multimodal contextual learning.<n>By computing depth gradients from depth maps, our method captures spatial transitions that help resolve texture ambiguities.<n> Evaluations on real agricultural datasets demonstrate that MaskAdapt consistently outperforms existing State-of-the-Art (SOTA) methods.
Score: 16.926158907882012
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Semantic segmentation of crops and weeds is crucial for site-specific farm management; however, most existing methods depend on labor intensive pixel-level annotations. A further challenge arises when models trained on one field (source domain) fail to generalize to new fields (target domain) due to domain shifts, such as variations in lighting, camera setups, soil composition, and crop growth stages. Unsupervised Domain Adaptation (UDA) addresses this by enabling adaptation without target-domain labels, but current UDA methods struggle with occlusions and visual blending between crops and weeds, leading to misclassifications in real-world conditions. To overcome these limitations, we introduce MaskAdapt, a novel approach that enhances segmentation accuracy through multimodal contextual learning by integrating RGB images with features derived from depth data. By computing depth gradients from depth maps, our method captures spatial transitions that help resolve texture ambiguities. These gradients, through a cross-attention mechanism, refines RGB feature representations, resulting in sharper boundary delineation. In addition, we propose a geometry-aware masking strategy that applies horizontal, vertical, and stochastic masks during training. This encourages the model to focus on the broader spatial context for robust visual recognition. Evaluations on real agricultural datasets demonstrate that MaskAdapt consistently outperforms existing State-of-the-Art (SOTA) UDA methods, achieving improved segmentation mean Intersection over Union (mIOU) across diverse field conditions.

Related papers

Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation [54.96563068182733]
We propose Modality Adaptation with text-to-image Diffusion Models (MADM) for semantic segmentation task. MADM utilizes text-to-image diffusion models pre-trained on extensive image-text pairs to enhance the model's cross-modality capabilities. We show that MADM achieves state-of-the-art adaptation performance across various modality tasks, including images to depth, infrared, and event modalities.
arXiv Detail & Related papers (2024-10-29T03:49:40Z)
SiamSeg: Self-Training with Contrastive Learning for Unsupervised Domain Adaptation Semantic Segmentation in Remote Sensing [13.549403813487022]
Unsupervised domain adaptation (UDA) enables models to learn from unlabeled target domain data while leveraging labeled source domain data.<n>We propose integrating contrastive learning into UDA, enhancing the model's ability to capture semantic information in the target domain.<n>Our method, SimSeg, outperforms existing approaches, achieving state-of-the-art results.
arXiv Detail & Related papers (2024-10-17T11:59:39Z)
MICDrop: Masking Image and Depth Features via Complementary Dropout for Domain-Adaptive Semantic Segmentation [155.0797148367653]
Unsupervised Domain Adaptation (UDA) is the task of bridging the domain gap between a labeled source domain and an unlabeled target domain. We propose to leverage geometric information, i.e., depth predictions, as depth discontinuities often coincide with segmentation boundaries. We show that our method can be plugged into various recent UDA methods and consistently improve results across standard UDA benchmarks.
arXiv Detail & Related papers (2024-08-29T12:15:10Z)
BLADE: Box-Level Supervised Amodal Segmentation through Directed Expansion [10.57956193654977]
Box-level supervised amodal segmentation addresses this challenge by relying solely on ground truth bounding boxes and instance classes as supervision. We present a novel solution by introducing a directed expansion approach from visible masks to corresponding amodal masks. Our approach involves a hybrid end-to-end network based on the overlapping region - the area where different instances intersect.
arXiv Detail & Related papers (2024-01-03T09:37:03Z)
Box-Adapt: Domain-Adaptive Medical Image Segmentation using Bounding BoxSupervision [52.45336255472669]
We propose a weakly supervised do-main adaptation setting for deep learning. Box-Adapt fully explores the fine-grained segmenta-tion mask in the source domain and the weak bounding box in the target domain. We demonstrate the effectiveness of our method in the liver segmentation task.
arXiv Detail & Related papers (2021-08-19T01:51:04Z)
Context-Aware Mixup for Domain Adaptive Semantic Segmentation [52.1935168534351]
Unsupervised domain adaptation (UDA) aims to adapt a model of the labeled source domain to an unlabeled target domain. We propose end-to-end Context-Aware Mixup (CAMix) for domain adaptive semantic segmentation. Experimental results show that the proposed method outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-08-08T03:00:22Z)
AFAN: Augmented Feature Alignment Network for Cross-Domain Object Detection [90.18752912204778]
Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications. We propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training. Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations.
arXiv Detail & Related papers (2021-06-10T05:01:20Z)
SAND-mask: An Enhanced Gradient Masking Strategy for the Discovery of Invariances in Domain Generalization [7.253255826783766]
We propose a masking strategy, which determines a continuous weight based on the agreement of gradients that flow in each edge of network. SAND-mask is validated over the Domainbed benchmark for domain generalization.
arXiv Detail & Related papers (2021-06-04T05:20:54Z)
Spatial Attention Pyramid Network for Unsupervised Domain Adaptation [66.75008386980869]
Unsupervised domain adaptation is critical in various computer vision tasks. We design a new spatial attention pyramid network for unsupervised domain adaptation. Our method performs favorably against the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-03-29T09:03:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.