Related papers: SASFormer: Transformers for Sparsely Annotated Semantic Segmentation

URL: http://arxiv.org/abs/2212.02019v2
Date: Tue, 6 Dec 2022 16:31:53 GMT
Title: SASFormer: Transformers for Sparsely Annotated Semantic Segmentation
Authors: Hui Su, Yue Ye, Wei Hua, Lechao Cheng, Mingli Song
Abstract summary: We propose a simple yet effective sparsely annotated semantic segmentation framework based on SegFormer, dubbed SASFormer. Specifically, the framework first generates hierarchical patch attention maps, which are then multiplied by the network predictions to produce correlated regions separated by valid labels.
Score: 44.758672633271956
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Semantic segmentation based on sparse annotation has advanced in recent years. Sparse annotation labels only part of each object in the image, leaving the remainder unlabeled. Most existing approaches are time-consuming and often require a multi-stage training strategy. In this work, we propose a simple yet effective sparsely annotated semantic segmentation framework based on SegFormer, dubbed SASFormer, that achieves remarkable performance. Specifically, the framework first generates hierarchical patch attention maps, which are then multiplied by the network predictions to produce correlated regions separated by valid labels. We also introduce an affinity loss to ensure consistency between the features of the correlation results and the network predictions. Extensive experiments show that the proposed approach is superior to existing methods and achieves cutting-edge performance. The source code is available at https://github.com/su-hui-zz/SASFormer.
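
The mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it uses a single attention level rather than the hierarchical maps, and the function names, the scaled dot-product attention, the L1 form of the affinity term, and the ignore index 255 are all assumptions made for illustration. It shows patch attention being multiplied by per-patch class probabilities, a consistency term between the two prediction maps, and a cross-entropy computed only on the sparsely labeled patches.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def patch_attention(features):
    """Row-normalized patch-to-patch attention from (N, D) patch features.

    Assumption: plain scaled dot-product similarity, one level only.
    """
    d = features.shape[1]
    return softmax(features @ features.T / np.sqrt(d), axis=-1)  # (N, N)


def correlated_predictions(attn, probs):
    """Multiply attention (N, N) by class probabilities (N, C).

    Each output row is a convex combination of the input rows,
    so it remains a valid probability distribution.
    """
    return attn @ probs


def affinity_loss(probs, correlated):
    """Hypothetical consistency term: mean L1 gap between the two maps."""
    return float(np.abs(probs - correlated).mean())


def partial_cross_entropy(probs, labels, ignore_index=255, eps=1e-8):
    """Cross-entropy over labeled patches only; unlabeled ones are skipped."""
    valid = labels != ignore_index
    if not valid.any():
        return 0.0
    picked = probs[valid, labels[valid]]
    return float(-np.log(picked + eps).mean())


# Toy data: 16 patches, 8-dim features, 3 classes, 2 labeled patches.
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))
probs = softmax(rng.normal(size=(16, 3)), axis=-1)
labels = np.full(16, 255)
labels[0] = 0
labels[5] = 2

attn = patch_attention(feats)
corr = correlated_predictions(attn, probs)
total = partial_cross_entropy(probs, labels) + affinity_loss(probs, corr)
print(f"total loss: {total:.4f}")
```

In this sketch the sparse labels supervise only two of the sixteen patches, while the affinity term gives the remaining patches a training signal by pulling each prediction toward the attention-weighted average of the others.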

Related papers

Frozen CLIP: A Strong Backbone for Weakly Supervised Semantic Segmentation [90.35249276717038]
We propose WeCLIP, a CLIP-based single-stage pipeline, for weakly supervised semantic segmentation. Specifically, the frozen CLIP model is applied as the backbone for semantic feature extraction. A new decoder is designed to interpret extracted semantic features for final prediction.
arXiv Detail & Related papers (2024-06-17T03:49:47Z)
Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation with Its Class Label [16.745019028033518]
We propose a class-driven scribble promotion network, which utilizes both scribble annotations and pseudo-labels informed by image-level classes and global semantics for supervision. In experiments on the ScribbleSup dataset with scribble annotations of varying quality, our method outperforms all previous methods, demonstrating its superiority and robustness.
arXiv Detail & Related papers (2024-02-27T14:51:56Z)
CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation [73.89509052503222]
This paper presents a simple but performant semi-supervised semantic segmentation approach, called CorrMatch. We observe that the correlation maps not only enable clustering pixels of the same category easily but also contain good shape information. We propose to conduct pixel propagation by modeling the pairwise similarities of pixels to spread the high-confidence pixels and uncover more of them. Then, we perform region propagation to enhance the pseudo labels with accurate class-agnostic masks extracted from the correlation maps.
arXiv Detail & Related papers (2023-06-07T10:02:29Z)
Learning Context-aware Classifier for Semantic Segmentation [88.88198210948426]
In this paper, contextual hints are exploited via learning a context-aware classifier. Our method is model-agnostic and can be easily applied to generic segmentation models. With only negligible additional parameters and +2% inference time, decent performance gain has been achieved on both small and large models.
arXiv Detail & Related papers (2023-03-21T07:00:35Z)
SpanProto: A Two-stage Span-based Prototypical Network for Few-shot Named Entity Recognition [45.012327072558975]
Few-shot Named Entity Recognition (NER) aims to identify named entities with very little annotated data. We propose a seminal span-based prototypical network (SpanProto) that tackles few-shot NER via a two-stage approach. In the span extraction stage, we transform the sequential tags into a global boundary matrix, enabling the model to focus on the explicit boundary information. For mention classification, we leverage prototypical learning to capture the semantic representations for each labeled span and make the model better adapt to novel-class entities.
arXiv Detail & Related papers (2022-10-17T12:59:33Z)
One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation [78.36781565047656]
We propose "One Thing One Click," meaning that the annotator only needs to label one point per object. We iteratively conduct the training and label propagation, facilitated by a graph propagation module. Our results are also comparable to those of the fully supervised counterparts.
arXiv Detail & Related papers (2021-04-06T02:27:25Z)
Few-shot 3D Point Cloud Semantic Segmentation [138.80825169240302]
We propose a novel attention-aware multi-prototype transductive few-shot point cloud semantic segmentation method. Our proposed method shows significant and consistent improvements compared to baselines in different few-shot point cloud semantic segmentation settings.
arXiv Detail & Related papers (2020-06-22T08:05:25Z)
Towards Bounding-Box Free Panoptic Segmentation [16.4548904544277]
We introduce a new Bounding-Box Free Network (BBFNet) for panoptic segmentation. BBFNet predicts coarse watershed levels and uses them to detect large instance candidates where boundaries are well defined. For smaller instances, whose boundaries are less reliable, BBFNet also predicts instance centers by means of Hough voting followed by mean-shift to reliably detect small objects.
arXiv Detail & Related papers (2020-02-18T16:34:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.