Dual-Form Complementary Masking for Domain-Adaptive Image Segmentation
- URL: http://arxiv.org/abs/2507.12008v1
- Date: Wed, 16 Jul 2025 08:05:22 GMT
- Title: Dual-Form Complementary Masking for Domain-Adaptive Image Segmentation
- Authors: Jiawen Wang, Yinda Chen, Xiaoyu Liu, Che Liu, Dong Liu, Jianqing Gao, Zhiwei Xiong
- Abstract summary: We propose MaskTwins, a framework that integrates masked reconstruction directly into the main training pipeline. MaskTwins uncovers intrinsic structural patterns that persist across disparate domains by enforcing consistency between predictions of images masked in complementary ways. These results demonstrate the significant advantages of MaskTwins in extracting domain-invariant features without the need for separate pre-training.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent works have correlated Masked Image Modeling (MIM) with consistency regularization in Unsupervised Domain Adaptation (UDA). However, they merely treat masking as a special form of deformation on the input images and neglect the theoretical analysis, which leads to a superficial understanding of masked reconstruction and insufficient exploitation of its potential in enhancing feature extraction and representation learning. In this paper, we reframe masked reconstruction as a sparse signal reconstruction problem and theoretically prove that the dual form of complementary masks possesses superior capabilities in extracting domain-agnostic image features. Based on this compelling insight, we propose MaskTwins, a simple yet effective UDA framework that integrates masked reconstruction directly into the main training pipeline. MaskTwins uncovers intrinsic structural patterns that persist across disparate domains by enforcing consistency between predictions of images masked in complementary ways, enabling domain generalization in an end-to-end manner. Extensive experiments verify the superiority of MaskTwins over baseline methods in natural and biological image segmentation. These results demonstrate the significant advantages of MaskTwins in extracting domain-invariant features without the need for separate pre-training, offering a new paradigm for domain-adaptive segmentation.
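The core idea of the abstract can be made concrete with a small sketch: generate a random patch mask and its complement, run the model on both masked views of a target-domain image, and penalize disagreement between the two predictions. This is a minimal illustration in PyTorch; the mask ratio, patch size, and choice of symmetric KL as the consistency loss are illustrative assumptions, not the authors' exact MaskTwins configuration.

```python
import torch
import torch.nn.functional as F

def complementary_masks(image, patch=16, ratio=0.5):
    """Build a random patch-wise mask M and its complement 1-M over one batch.

    Returns the two complementary masked views of `image`; summing them
    recovers the original image exactly.
    """
    b, _, h, w = image.shape
    gh, gw = h // patch, w // patch
    # Keep each patch with probability (1 - ratio), then upsample to pixel size.
    keep = (torch.rand(b, 1, gh, gw, device=image.device) > ratio).float()
    mask = F.interpolate(keep, size=(h, w), mode="nearest")
    return image * mask, image * (1.0 - mask)

def consistency_loss(model, target_image):
    """Symmetric-KL consistency between predictions on the two masked views."""
    view_a, view_b = complementary_masks(target_image)
    log_p_a = F.log_softmax(model(view_a), dim=1)
    log_p_b = F.log_softmax(model(view_b), dim=1)
    return 0.5 * (F.kl_div(log_p_a, log_p_b.exp(), reduction="batchmean")
                  + F.kl_div(log_p_b, log_p_a.exp(), reduction="batchmean"))
```

In a UDA training loop, this term would be added to the supervised source-domain segmentation loss, so that the model is pushed to predict the same segmentation regardless of which half of the image's patches it sees.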
Related papers
- AdvMIM: Adversarial Masked Image Modeling for Semi-Supervised Medical Image Segmentation [27.35164449801058]
Vision Transformers have recently gained tremendous popularity in the medical image segmentation task. However, transformers require a large amount of labeled data to be effective. A key challenge in semi-supervised learning with transformers lies in the lack of a sufficient supervision signal.
arXiv Detail & Related papers (2025-06-25T16:00:18Z)
- Masked Image Modeling Boosting Semi-Supervised Semantic Segmentation [38.55611683982936]
We introduce a novel class-wise masked image modeling that independently reconstructs different image regions according to their respective classes.
We develop a feature aggregation strategy that minimizes the distances between features corresponding to the masked and visible parts within the same class.
In semantic space, we explore the application of masked image modeling to enhance regularization.
arXiv Detail & Related papers (2024-11-13T16:42:07Z)
- LADMIM: Logical Anomaly Detection with Masked Image Modeling in Discrete Latent Space [0.0]
Masked image modeling is a self-supervised learning technique that predicts the feature representation of masked regions in an image.
We propose a novel approach that leverages the characteristics of MIM to detect logical anomalies effectively.
We evaluate the proposed method on the MVTecLOCO dataset, achieving an average AUC of 0.867.
arXiv Detail & Related papers (2024-10-14T07:50:56Z)
- A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting [2.7563282688229664]
This work builds upon Stable Diffusion and proposes a latent diffusion approach for panoptic segmentation.
Our training consists of two steps: (1) training a shallow autoencoder to project the segmentation masks to latent space; (2) training a diffusion model to allow image-conditioned sampling in latent space.
arXiv Detail & Related papers (2024-01-18T18:59:19Z)
- Masking Improves Contrastive Self-Supervised Learning for ConvNets, and Saliency Tells You Where [63.61248884015162]
We aim to alleviate the burden of incorporating the masking operation into the contrastive-learning framework for convolutional neural networks.
We propose explicitly taking a saliency constraint into consideration so that the masked regions are more evenly distributed between the foreground and background.
arXiv Detail & Related papers (2023-09-22T09:58:38Z)
- Out-of-domain GAN Inversion via Invertibility Decomposition for Photo-Realistic Human Face Manipulation [22.71398343370642]
We propose a novel framework that enhances the fidelity of human face inversion by designing a new module.
Unlike previous works, our invertibility detector is simultaneously learned with a spatial alignment module.
Our method produces photo-realistic results for real-world human face image inversion and manipulation.
arXiv Detail & Related papers (2022-12-19T06:16:58Z)
- Editing Out-of-domain GAN Inversion via Differential Activations [56.62964029959131]
We propose a novel GAN prior based editing framework to tackle the out-of-domain inversion problem with a composition-decomposition paradigm.
With the aid of the generated Diff-CAM mask, a coarse reconstruction can intuitively be composited from the paired original and edited images.
In the decomposition phase, we further present a GAN prior based deghosting network for separating the final fine edited image from the coarse reconstruction.
arXiv Detail & Related papers (2022-07-17T10:34:58Z)
- Context-Aware Mixup for Domain Adaptive Semantic Segmentation [52.1935168534351]
Unsupervised domain adaptation (UDA) aims to adapt a model of the labeled source domain to an unlabeled target domain.
We propose end-to-end Context-Aware Mixup (CAMix) for domain adaptive semantic segmentation.
Experimental results show that the proposed method outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-08-08T03:00:22Z)
- Image Inpainting with Edge-guided Learnable Bidirectional Attention Maps [85.67745220834718]
We present an edge-guided learnable bidirectional attention map (Edge-LBAM) for improving image inpainting of irregular holes.
Our Edge-LBAM method contains dual procedures, including structure-aware mask-updating guided by predicted edges.
Extensive experiments show that our Edge-LBAM is effective in generating coherent image structures and preventing color discrepancy and blurriness.
arXiv Detail & Related papers (2021-04-25T07:25:16Z)
- MagGAN: High-Resolution Face Attribute Editing with Mask-Guided Generative Adversarial Network [145.4591079418917]
MagGAN learns to only edit the facial parts that are relevant to the desired attribute changes.
A novel mask-guided conditioning strategy is introduced to incorporate the influence region of each attribute change into the generator.
A multi-level patch-wise discriminator structure is proposed to scale our model for high-resolution ($1024 \times 1024$) face editing.
arXiv Detail & Related papers (2020-10-03T20:56:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.