Masked Collaborative Contrast for Weakly Supervised Semantic
Segmentation
- URL: http://arxiv.org/abs/2305.08491v6
- Date: Wed, 8 Nov 2023 09:28:58 GMT
- Title: Masked Collaborative Contrast for Weakly Supervised Semantic
Segmentation
- Authors: Fangwen Wu, Jingxuan He, Yufei Yin, Yanbin Hao, Gang Huang, Lechao
Cheng
- Abstract summary: Masked Collaborative Contrast (MCC) highlights semantic regions in weakly supervised semantic segmentation.
MCC draws on masked image modeling and contrastive learning to devise a novel framework that induces keys to contract toward semantic regions.
- Score: 22.74105261883464
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study introduces an efficacious approach, Masked Collaborative Contrast
(MCC), to highlight semantic regions in weakly supervised semantic
segmentation. MCC adroitly draws inspiration from masked image modeling and
contrastive learning to devise a novel framework that induces keys to contract
toward semantic regions. Unlike prevalent techniques that directly erase
patch regions in the input image when generating masks, we examine the
neighborhood relations of patch tokens by applying masks to the keys of the
affinity matrix. Moreover, we generate positive and negative samples for
contrastive learning by taking the masked local output and contrasting it
with the global output. Extensive experiments on commonly used datasets
show that the proposed MCC mechanism effectively aligns global and local
perspectives within the image, attaining impressive performance. The source
code is available at \url{https://github.com/fwu11/MCC}.
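The core idea described in the abstract, masking entries of the patch-token affinity matrix instead of erasing input patches, then contrasting the resulting masked "local" output against the unmasked "global" output, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that); the function name, the random entry-wise masking, and the InfoNCE-style loss here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def masked_collaborative_contrast(keys, values, mask_ratio=0.3, tau=0.1, rng=None):
    """Toy sketch of the MCC idea (hypothetical, not the paper's code):
    mask the patch-token affinity matrix rather than the input image,
    then contrast the masked local output with the global output."""
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = keys.shape
    # Affinity between patch tokens, computed from the keys.
    affinity = softmax(keys @ keys.T / np.sqrt(d), axis=-1)        # (n, n)
    # Global output: aggregate values with the full affinity matrix.
    global_out = affinity @ values                                  # (n, d)
    # Mask a fraction of affinity entries (not image patches), then renormalize.
    drop = rng.random((n, n)) < mask_ratio
    masked_aff = np.where(drop, 0.0, affinity)
    masked_aff /= masked_aff.sum(axis=-1, keepdims=True) + 1e-8
    local_out = masked_aff @ values                                 # (n, d)
    # InfoNCE-style loss: each token's masked local output is pulled toward
    # its own global output (positive); other tokens act as negatives.
    a = local_out / (np.linalg.norm(local_out, axis=-1, keepdims=True) + 1e-8)
    b = global_out / (np.linalg.norm(global_out, axis=-1, keepdims=True) + 1e-8)
    logits = a @ b.T / tau                                          # (n, n)
    probs = softmax(logits, axis=-1)[np.arange(n), np.arange(n)]
    return -np.mean(np.log(probs + 1e-12))
```

Minimizing this loss pushes each masked local representation toward its global counterpart, which is one plausible reading of how MCC "aligns global and local perspectives within the image."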
Related papers
- Masked Image Modeling Boosting Semi-Supervised Semantic Segmentation [38.55611683982936]
We introduce a novel class-wise masked image modeling that independently reconstructs different image regions according to their respective classes.
We develop a feature aggregation strategy that minimizes the distances between features corresponding to the masked and visible parts within the same class.
In semantic space, we explore the application of masked image modeling to enhance regularization.
arXiv Detail & Related papers (2024-11-13T16:42:07Z) - DIAL: Dense Image-text ALignment for Weakly Supervised Semantic Segmentation [8.422110274212503]
Weakly supervised semantic segmentation approaches typically rely on class activation maps (CAMs) for initial seed generation.
We introduce DALNet, which leverages text embeddings to enhance the comprehensive understanding and precise localization of objects across different levels of granularity.
In particular, our approach allows for a more efficient end-to-end process as a single-stage method.
arXiv Detail & Related papers (2024-09-24T06:51:49Z) - Understanding Masked Autoencoders From a Local Contrastive Perspective [80.57196495601826]
Masked AutoEncoder (MAE) has revolutionized the field of self-supervised learning with its simple yet effective masking and reconstruction strategies.
We introduce a new empirical framework, called Local Contrastive MAE, to analyze both reconstructive and contrastive aspects of MAE.
arXiv Detail & Related papers (2023-10-03T12:08:15Z) - MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner
for Open-World Semantic Segmentation [110.09800389100599]
We propose MixReorg, a novel and straightforward pre-training paradigm for semantic segmentation.
Our approach involves generating fine-grained patch-text pairs data by mixing image patches while preserving the correspondence between patches and text.
With MixReorg as a mask learner, conventional text-supervised semantic segmentation models can achieve highly generalizable pixel-semantic alignment ability.
arXiv Detail & Related papers (2023-08-09T09:35:16Z) - R-MAE: Regions Meet Masked Autoencoders [113.73147144125385]
We explore regions as a potential visual analogue of words for self-supervised image representation learning.
Inspired by Masked Autoencoding (MAE), a generative pre-training baseline, we propose masked region autoencoding to learn from groups of pixels or regions.
arXiv Detail & Related papers (2023-06-08T17:56:46Z) - Zero-shot Referring Image Segmentation with Global-Local Context
Features [8.77461711080319]
Referring image segmentation (RIS) aims to find a segmentation mask given a referring expression grounded to a region of the input image.
We propose a simple yet effective zero-shot referring image segmentation method by leveraging the pre-trained cross-modal knowledge from CLIP.
In our experiments, the proposed method outperforms several zero-shot baselines for the task, and even the weakly supervised referring expression segmentation method, by substantial margins.
arXiv Detail & Related papers (2023-03-31T06:00:50Z) - Towards Effective Image Manipulation Detection with Proposal Contrastive
Learning [61.5469708038966]
We propose Proposal Contrastive Learning (PCL) for effective image manipulation detection.
Our PCL consists of a two-stream architecture by extracting two types of global features from RGB and noise views respectively.
Our PCL can be easily adapted to unlabeled data in practice, which can reduce manual labeling costs and promote more generalizable features.
arXiv Detail & Related papers (2022-10-16T13:30:13Z) - Self-Supervised Visual Representations Learning by Contrastive Mask
Prediction [129.25459808288025]
We propose a novel contrastive mask prediction (CMP) task for visual representation learning.
MaskCo contrasts region-level features instead of view-level features, which makes it possible to identify the positive sample without any assumptions.
We evaluate MaskCo on training datasets beyond ImageNet and compare its performance with MoCo V2.
arXiv Detail & Related papers (2021-08-18T02:50:33Z) - Context-Aware Mixup for Domain Adaptive Semantic Segmentation [52.1935168534351]
Unsupervised domain adaptation (UDA) aims to adapt a model of the labeled source domain to an unlabeled target domain.
We propose end-to-end Context-Aware Mixup (CAMix) for domain adaptive semantic segmentation.
Experimental results show that the proposed method outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-08-08T03:00:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.