Removing supervision in semantic segmentation with local-global matching
and area balancing
- URL: http://arxiv.org/abs/2303.17410v1
- Date: Thu, 30 Mar 2023 14:27:42 GMT
- Title: Removing supervision in semantic segmentation with local-global matching
and area balancing
- Authors: Simone Rossetti (1 and 2), Nico Samà (1), Fiora Pirri (1 and 2) ((1)
DeepPlants, (2) Diag Sapienza)
- Abstract summary: We design a novel end-to-end model leveraging local-global patch matching to predict categories, good localization, area and shape of objects for semantic segmentation.
Our model attains state-of-the-art results in Weakly Supervised Semantic Segmentation, using only image-level labels, with 75% mIoU on the PascalVOC2012 val set and 46% on the MS-COCO2014 val set.
We also attain state-of-the-art results in Unsupervised Semantic Segmentation, with 43.6% mIoU on the PascalVOC2012 val set and 19.4% on the MS-COCO2014 val set.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Removing supervision in semantic segmentation is still tricky. Current
approaches can deal with common categorical patterns yet resort to multi-stage
architectures. We design a novel end-to-end model leveraging local-global patch
matching to predict categories, good localization, area and shape of objects
for semantic segmentation. The local-global matching is, in turn, compelled by
optimal transport plans fulfilling area constraints nearing a solution for
exact shape prediction. Our model attains state-of-the-art in Weakly Supervised
Semantic Segmentation, only image-level labels, with 75% mIoU on PascalVOC2012
val set and 46% on MS-COCO2014 val set. Dropping the image-level labels and
clustering self-supervised learned features to yield pseudo-multi-level labels,
we obtain an unsupervised model for semantic segmentation. We also attain
state-of-the-art on Unsupervised Semantic Segmentation with 43.6% mIoU on
PascalVOC2012 val set and 19.4% on MS-COCO2014 val set. The code is available
at https://github.com/deepplants/PC2M.
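The abstract describes matching constrained by optimal transport plans that satisfy area constraints. As a rough illustration of that general idea, and not the authors' implementation, the sketch below uses plain Sinkhorn iterations to build an entropy-regularised transport plan from patches to classes, where the column marginals play the role of per-class area fractions. The function name `sinkhorn_plan`, the parameters `epsilon` and `n_iters`, the random similarity scores, and the area fractions are all assumptions chosen for the example.

```python
import numpy as np


def sinkhorn_plan(scores, row_marginals, col_marginals, epsilon=0.5, n_iters=200):
    """Entropy-regularised transport plan via Sinkhorn iterations (illustrative).

    scores:        (n_patches, n_classes) similarities, higher means better match.
    row_marginals: mass assigned to each patch, summing to 1.
    col_marginals: target area fraction of each class, summing to 1.
    """
    # Gibbs kernel; subtracting the maximum keeps the exponent non-positive.
    kernel = np.exp((scores - scores.max()) / epsilon)
    u = np.ones_like(row_marginals)
    v = np.ones_like(col_marginals)
    for _ in range(n_iters):
        # Alternately rescale rows and columns towards the two marginals.
        u = row_marginals / (kernel @ v)
        v = col_marginals / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]


# Hypothetical toy problem: 16 patches, 3 classes, random similarities.
rng = np.random.default_rng(0)
scores = rng.random((16, 3))
patch_mass = np.full(16, 1.0 / 16)       # every patch carries equal mass
class_area = np.array([0.5, 0.3, 0.2])   # assumed area fraction per class

plan = sinkhorn_plan(scores, patch_mass, class_area)
labels = plan.argmax(axis=1)             # hard patch-to-class assignment

print("mass per class:", plan.sum(axis=0))
print("patch labels:  ", labels)
```

The mass sent to each class matches the requested area fractions, which is the balancing effect an area constraint is meant to have. The hard labels from `argmax` only approximate those fractions, and a smaller `epsilon` gives a sharper plan at the cost of slower convergence.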
Related papers
- HierVL: Semi-Supervised Segmentation leveraging Hierarchical Vision-Language Synergy with Dynamic Text-Spatial Query Alignment [16.926158907882012]
Vision-only methods struggle to generalize, resulting in pixel misclassification between similar classes, poor generalization and boundary localization.
We introduce HierVL, a unified framework that bridges this gap by integrating abstract text embeddings into a mask-transformer architecture tailored for semi-supervised segmentation.
Our results show that language-guided segmentation closes the label efficiency gap and unlocks new levels of fine-grained, instance-aware generalization.
arXiv Detail & Related papers (2025-06-16T19:05:33Z)
- Unsupervised Universal Image Segmentation [59.0383635597103]
We propose an Unsupervised Universal model (U2Seg) adept at performing various image segmentation tasks.
U2Seg generates pseudo semantic labels for these segmentation tasks via leveraging self-supervised models.
We then self-train the model on these pseudo semantic labels, yielding substantial performance gains.
arXiv Detail & Related papers (2023-12-28T18:59:04Z)
- A Lightweight Clustering Framework for Unsupervised Semantic Segmentation [28.907274978550493]
Unsupervised semantic segmentation aims to categorize each pixel in an image into a corresponding class without the use of annotated data.
We propose a lightweight clustering framework for unsupervised semantic segmentation.
Our framework achieves state-of-the-art results on PASCAL VOC and MS COCO datasets.
arXiv Detail & Related papers (2023-11-30T15:33:42Z)
- SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance [97.00445262074595]
In SemiVL, we propose to integrate rich priors from vision-language models into semi-supervised semantic segmentation.
We design a language-guided decoder to jointly reason over vision and language.
We evaluate SemiVL on 4 semantic segmentation datasets, where it significantly outperforms previous semi-supervised methods.
arXiv Detail & Related papers (2023-11-27T19:00:06Z)
- Coupling Global Context and Local Contents for Weakly-Supervised Semantic Segmentation [54.419401869108846]
We propose a single-stage Weakly-Supervised Semantic Segmentation (WSSS) model trained with only image-level class label supervision.
A flexible context aggregation module is proposed to capture the global object context in different granular spaces.
A semantically consistent feature fusion module is proposed in a bottom-up parameter-learnable fashion to aggregate the fine-grained local contents.
arXiv Detail & Related papers (2023-04-18T15:29:23Z)
- CAFS: Class Adaptive Framework for Semi-Supervised Semantic Segmentation [5.484296906525601]
Semi-supervised semantic segmentation learns a model for classifying pixels into specific classes using a few labeled samples and numerous unlabeled images.
We propose a class-adaptive semi-supervision framework for semi-supervised semantic segmentation (CAFS).
CAFS constructs a validation set on a labeled dataset, to leverage the calibration performance for each class.
arXiv Detail & Related papers (2023-03-21T05:56:53Z)
- Fully Self-Supervised Learning for Semantic Segmentation [46.6602159197283]
We present a fully self-supervised framework for semantic segmentation (FS4).
We propose a bootstrapped training scheme for semantic segmentation that fully leverages global semantic knowledge for self-supervision.
We evaluate our method on the large-scale COCO-Stuff dataset and achieve a 7.19 mIoU improvement on both things and stuff objects.
arXiv Detail & Related papers (2022-02-24T09:38:22Z)
- Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings [81.09026586111811]
We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting.
This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class.
The resulting merged semantic segmentation dataset of over 2 million images enables training a model that matches the performance of state-of-the-art supervised methods on 7 benchmark datasets.
arXiv Detail & Related papers (2022-02-04T07:19:09Z)
- Global Aggregation then Local Distribution for Scene Parsing [99.1095068574454]
We show that our approach can be modularized as an end-to-end trainable block and easily plugged into existing semantic segmentation networks.
Our approach sets a new state of the art on major semantic segmentation benchmarks including Cityscapes, ADE20K, Pascal Context, CamVid and COCO-Stuff.
arXiv Detail & Related papers (2021-07-28T03:46:57Z)