Boundary-Refined Prototype Generation: A General End-to-End Paradigm for Semi-Supervised Semantic Segmentation
- URL: http://arxiv.org/abs/2307.10097v2
- Date: Sat, 14 Sep 2024 10:51:41 GMT
- Title: Boundary-Refined Prototype Generation: A General End-to-End Paradigm for Semi-Supervised Semantic Segmentation
- Authors: Junhao Dong, Zhu Meng, Delong Liu, Jiaxuan Liu, Zhicheng Zhao, Fei Su
- Abstract summary: Semi-supervised semantic segmentation has attracted increasing attention in computer vision.
Current approaches isolate prototype generation from the main training framework.
We propose a novel end-to-end boundary-refined prototype generation (BRPG) method.
- Score: 23.00156170789867
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-supervised semantic segmentation has attracted increasing attention in computer vision, aiming to leverage unlabeled data through latent supervision. To achieve this goal, prototype-based classification has been introduced and achieved lots of success. However, the current approaches isolate prototype generation from the main training framework, presenting a non-end-to-end workflow. Furthermore, most methods directly perform the K-Means clustering on features to generate prototypes, resulting in their proximity to category semantic centers, while overlooking the clear delineation of class boundaries. To address the above problems, we propose a novel end-to-end boundary-refined prototype generation (BRPG) method. Specifically, we perform online clustering on sampled features to incorporate the prototype generation into the whole training framework. In addition, to enhance the classification boundaries, we sample and cluster high- and low-confidence features separately based on confidence estimation, facilitating the generation of prototypes closer to the class boundaries. Moreover, an adaptive prototype optimization strategy is proposed to increase the number of prototypes for categories with scattered feature distributions, which further refines the class boundaries. Extensive experiments demonstrate the remarkable robustness and scalability of our method across diverse datasets, segmentation networks, and semi-supervised frameworks, outperforming the state-of-the-art approaches on three benchmark datasets: PASCAL VOC 2012, Cityscapes and MS COCO. The code is available at https://github.com/djh-dzxw/BRPG.
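The confidence-split clustering described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that): the function names `kmeans`, `class_prototypes`, and `classify`, the fixed confidence threshold of 0.8, the per-group cluster counts, and the cosine-similarity classifier are all assumptions made for illustration. The sketch runs offline on pre-extracted features and omits the online training integration and the adaptive prototype optimization strategy.

```python
import numpy as np


def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's K-Means on an (n, d) array; returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    k = min(k, len(features))  # never ask for more clusters than samples
    start = rng.choice(len(features), size=k, replace=False)
    centroids = features[start].astype(float)
    for _ in range(iters):
        # Squared Euclidean distance of every sample to every centroid: (n, k)
        diffs = features[:, None, :] - centroids[None, :, :]
        assign = (diffs ** 2).sum(axis=-1).argmin(axis=1)
        for j in range(k):
            members = features[assign == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids


def class_prototypes(features, confidences, threshold=0.8, k_high=2, k_low=2):
    """Cluster the high- and low-confidence features of ONE class separately.

    The threshold and cluster counts are hypothetical defaults, not values
    taken from the paper.
    """
    high = features[confidences >= threshold]
    low = features[confidences < threshold]
    groups = []
    if len(high) > 0:
        groups.append(kmeans(high, k_high))
    if len(low) > 0:
        groups.append(kmeans(low, k_low))
    if not groups:
        return np.empty((0, features.shape[1]))
    return np.concatenate(groups, axis=0)


def classify(features, prototypes, proto_labels):
    """Label each feature with the class of its most cosine-similar prototype."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return proto_labels[(f @ p.T).argmax(axis=1)]
```

In this sketch the low-confidence group contributes its own prototypes, which is one simple way to place prototypes away from the dense class center; how the paper samples features and estimates confidence may differ.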
Related papers
- Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances [24.142013877384603]
This paper introduces a novel unsupervised multimodal clustering method (UMC), making a pioneering contribution to this field.
UMC introduces a unique approach to constructing augmentation views for multimodal data, which are then used to perform pre-training.
We show remarkable improvements of 2-6% in clustering metrics over state-of-the-art methods, marking the first successful endeavor in this domain.
arXiv Detail & Related papers (2024-05-21T13:24:07Z)
- Beyond Known Clusters: Probe New Prototypes for Efficient Generalized Class Discovery [23.359450657842686]
Generalized Class Discovery (GCD) aims to dynamically assign labels to unlabelled data partially based on knowledge learned from labelled data.
We propose an adaptive probing mechanism that introduces learnable potential prototypes to expand cluster prototypes.
Our method surpasses the nearest competitor by a significant margin of 9.7% on the Stanford Cars dataset.
arXiv Detail & Related papers (2024-04-13T12:41:40Z)
- Rethinking Few-shot 3D Point Cloud Semantic Segmentation [62.80639841429669]
This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS).
We focus on two significant issues in the state-of-the-art: foreground leakage and sparse point distribution.
To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built.
arXiv Detail & Related papers (2024-03-01T15:14:47Z)
- Unsupervised Prototype Adapter for Vision-Language Models [29.516767588241724]
We design an unsupervised fine-tuning approach for vision-language models called Unsupervised Prototype Adapter (UP-Adapter).
Specifically, for the unannotated target datasets, we leverage the text-image aligning capability of CLIP to automatically select the most confident samples for each class.
After fine-tuning, the prototype model prediction is combined with the original CLIP's prediction by a residual connection to perform downstream recognition tasks.
arXiv Detail & Related papers (2023-08-22T15:28:49Z)
- Harmonizing Base and Novel Classes: A Class-Contrastive Approach for Generalized Few-Shot Segmentation [78.74340676536441]
We propose a class contrastive loss and a class relationship loss to regulate prototype updates and encourage a large distance between prototypes.
Our proposed approach achieves new state-of-the-art performance for the generalized few-shot segmentation task on PASCAL VOC and MS COCO datasets.
arXiv Detail & Related papers (2023-03-24T00:30:25Z)
- Semi-supervised Semantic Segmentation with Prototype-based Consistency Regularization [20.4183741427867]
Semi-supervised semantic segmentation requires the model to propagate the label information from limited annotated images to unlabeled ones.
A challenge for such a per-pixel prediction task is the large intra-class variation.
We propose a novel approach to regularize the distribution of within-class features to ease label propagation difficulty.
arXiv Detail & Related papers (2022-10-10T01:38:01Z)
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
- Weakly Supervised Semantic Segmentation via Progressive Patch Learning [39.87150496277798]
A "Progressive Patch Learning" approach is proposed to improve the extraction of local details in classification.
"Patch Learning" destructs the feature maps into patches and independently processes each local patch in parallel before the final aggregation.
"Progressive Patch Learning" further extends the feature destruction and patch learning to multi-level granularities in a progressive manner.
arXiv Detail & Related papers (2022-09-16T09:54:17Z)
- Beyond the Prototype: Divide-and-conquer Proxies for Few-shot Segmentation [63.910211095033596]
Few-shot segmentation aims to segment unseen-class objects given only a handful of densely labeled samples.
We propose a simple yet versatile framework in the spirit of divide-and-conquer.
Our proposed approach, named divide-and-conquer proxies (DCP), allows for the development of appropriate and reliable information.
arXiv Detail & Related papers (2022-04-21T06:21:14Z)
- BMD: A General Class-balanced Multicentric Dynamic Prototype Strategy for Source-free Domain Adaptation [74.93176783541332]
Source-free Domain Adaptation (SFDA) aims to adapt a pre-trained source model to the unlabeled target domain without accessing the well-labeled source data.
To make up for the absence of source data, most existing methods introduced feature prototype based pseudo-labeling strategies.
We propose a general class-Balanced Multicentric Dynamic prototype strategy for the SFDA task.
arXiv Detail & Related papers (2022-04-06T13:23:02Z)
- Dual Prototypical Contrastive Learning for Few-shot Semantic Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task.
The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in the prototype feature space.
We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
arXiv Detail & Related papers (2021-11-09T08:14:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.