Active Label Correction for Semantic Segmentation with Foundation Models
- URL: http://arxiv.org/abs/2403.10820v2
- Date: Tue, 4 Jun 2024 13:15:16 GMT
- Title: Active Label Correction for Semantic Segmentation with Foundation Models
- Authors: Hoyoung Kim, Sehyun Hwang, Suha Kwak, Jungseul Ok
- Abstract summary: We propose an effective framework of active label correction (ALC) based on a design of correction query to rectify pseudo labels of pixels.
Our method comprises two key techniques: (i) an annotator-friendly design of correction query with the pseudo labels, and (ii) an acquisition function that looks ahead to label expansions based on the superpixels.
Experimental results on PASCAL, Cityscapes, and Kvasir-SEG datasets demonstrate the effectiveness of our ALC framework.
- Score: 34.0733215363568
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training and validating models for semantic segmentation require datasets with pixel-wise annotations, which are notoriously labor-intensive to produce. Although useful priors such as foundation models or crowdsourced datasets are available, they are error-prone. We hence propose an effective framework of active label correction (ALC) based on a design of correction query to rectify pseudo labels of pixels, which, according to our theoretical analysis and user study, is more annotator-friendly than the standard query that asks the annotator to classify a pixel directly. Specifically, leveraging foundation models that provide useful zero-shot predictions on pseudo labels and superpixels, our method comprises two key techniques: (i) an annotator-friendly design of correction query with the pseudo labels, and (ii) an acquisition function that looks ahead to label expansions based on the superpixels. Experimental results on the PASCAL, Cityscapes, and Kvasir-SEG datasets demonstrate the effectiveness of our ALC framework, which outperforms prior methods for active semantic segmentation and label correction. Notably, using our method, we obtained a revised version of the PASCAL dataset by rectifying errors in 2.6 million pixels.
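The abstract mentions "label expansions based on the superpixels" without giving details. The following is a minimal sketch of one plausible reading, not the authors' implementation: when an annotator corrects the pseudo label of one pixel, the correction is spread to every pixel in the same superpixel that carried the same (wrong) pseudo label. The function name `expand_correction`, the nested-list data layout, and the expansion rule itself are all assumptions made for illustration.

```python
def expand_correction(pseudo, superpixels, row, col, new_label):
    """Spread a single-pixel correction within its superpixel.

    Hypothetical helper: `pseudo` and `superpixels` are equally sized
    2D lists of integer class labels and superpixel ids. The expansion
    rule (same superpixel AND same old pseudo label) is an assumption.
    """
    old_label = pseudo[row][col]
    region = superpixels[row][col]
    corrected = [list(r) for r in pseudo]  # copy so the input stays untouched
    changed = 0
    for i, (label_row, region_row) in enumerate(zip(pseudo, superpixels)):
        for j, (label, reg) in enumerate(zip(label_row, region_row)):
            if reg == region and label == old_label:
                corrected[i][j] = new_label
                changed += 1
    return corrected, changed


# Toy 3x3 example: the annotator corrects pixel (0, 0) from class 0 to class 3.
pseudo = [[0, 0, 1],
          [0, 2, 1],
          [2, 2, 1]]
superpixels = [[0, 0, 1],
               [0, 0, 1],
               [2, 2, 1]]
corrected, n_changed = expand_correction(pseudo, superpixels, 0, 0, 3)
```

In this toy case a single query updates three pixels; pixel (1, 1) lies in the same superpixel but keeps its label because its pseudo label differed. A look-ahead acquisition function could, under the same assumption, prefer queries whose expansion would cover many pixels.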
Related papers
- HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation [47.271784693700845]
We propose a novel hybrid pseudo-labeling framework for unsupervised event-based semantic segmentation, HPL-ESS, to alleviate the influence of noisy pseudo labels.
Our proposed method outperforms existing state-of-the-art methods by a large margin on the DSEC-Semantic dataset.
arXiv Detail & Related papers (2024-03-25T14:02:33Z)
- Semantic Connectivity-Driven Pseudo-labeling for Cross-domain Segmentation [89.41179071022121]
Self-training is a prevailing approach in cross-domain semantic segmentation.
We propose a novel approach called Semantic Connectivity-driven pseudo-labeling.
This approach formulates pseudo-labels at the connectivity level and thus can facilitate learning structured and low-noise semantics.
arXiv Detail & Related papers (2023-12-11T12:29:51Z)
- Top-K Pooling with Patch Contrastive Learning for Weakly-Supervised Semantic Segmentation [25.628382644404066]
We introduce a novel ViT-based WSSS method named top-K pooling with patch contrastive learning (TKP-PCL).
A patch contrastive error (PCE) is also proposed to enhance the patch embeddings to further improve the final results.
Our approach is very efficient and outperforms other state-of-the-art WSSS methods on the PASCAL 2012 dataset.
arXiv Detail & Related papers (2023-10-15T13:19:59Z)
- Estimating label quality and errors in semantic segmentation data via any model [19.84626033109009]
We study methods to score label quality, such that the images with the lowest scores are least likely to be correctly labeled.
This helps prioritize what data to review in order to ensure a high-quality training/evaluation dataset.
arXiv Detail & Related papers (2023-07-11T07:29:09Z)
- Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation [58.03255076119459]
We address the task of weakly-supervised few-shot image classification and segmentation by leveraging a Vision Transformer (ViT).
Our proposed method takes token representations from the self-supervised ViT and leverages their correlations, via self-attention, to produce classification and segmentation predictions.
Experiments on Pascal-5i and COCO-20i demonstrate significant performance gains in a variety of supervision settings.
arXiv Detail & Related papers (2023-07-07T06:16:43Z)
- CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation [73.89509052503222]
This paper presents a simple but performant semi-supervised semantic segmentation approach, called CorrMatch.
We observe that the correlation maps not only enable clustering pixels of the same category easily but also contain good shape information.
We propose to conduct pixel propagation by modeling the pairwise similarities of pixels to spread the high-confidence pixels and dig out more.
Then, we perform region propagation to enhance the pseudo labels with accurate class-agnostic masks extracted from the correlation maps.
arXiv Detail & Related papers (2023-06-07T10:02:29Z)
- CAFS: Class Adaptive Framework for Semi-Supervised Semantic Segmentation [5.484296906525601]
Semi-supervised semantic segmentation learns a model for classifying pixels into specific classes using a few labeled samples and numerous unlabeled images.
We propose a class-adaptive semi-supervision framework for semi-supervised semantic segmentation (CAFS).
CAFS constructs a validation set on a labeled dataset, to leverage the calibration performance for each class.
arXiv Detail & Related papers (2023-03-21T05:56:53Z)
- Dense FixMatch: a simple semi-supervised learning method for pixel-wise prediction tasks [68.36996813591425]
We propose Dense FixMatch, a simple method for online semi-supervised learning of dense and structured prediction tasks.
We enable the application of FixMatch in semi-supervised learning problems beyond image classification by adding a matching operation on the pseudo-labels.
Dense FixMatch significantly improves results compared to supervised learning using only labeled data, approaching its performance with 1/4 of the labeled samples.
arXiv Detail & Related papers (2022-10-18T15:02:51Z)
- Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition [98.25592165484737]
We propose a more effective pseudo-labeling scheme, called Cross-Model Pseudo-Labeling (CMPL).
CMPL achieves 17.6% and 25.1% Top-1 accuracy on Kinetics-400 and UCF-101, respectively, using only the RGB modality and 1% labeled data.
arXiv Detail & Related papers (2021-12-17T18:59:41Z)
- LabOR: Labeling Only if Required for Domain Adaptive Semantic Segmentation [79.96052264984469]
We propose a human-in-the-loop approach to adaptively give scarce labels to points that a UDA model is uncertain about.
We show the advantages of this new framework for domain adaptive semantic segmentation while minimizing human labor costs.
arXiv Detail & Related papers (2021-08-12T07:35:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.