Open-World Semantic Segmentation Including Class Similarity
- URL: http://arxiv.org/abs/2403.07532v1
- Date: Tue, 12 Mar 2024 11:11:19 GMT
- Title: Open-World Semantic Segmentation Including Class Similarity
- Authors: Matteo Sodano, Federico Magistri, Lucas Nunes, Jens Behley, Cyrill
Stachniss
- Abstract summary: This paper tackles open-world semantic segmentation, i.e., the variant of interpreting image data in which objects occur that have not been seen during training.
We propose a novel approach that performs accurate closed-world semantic segmentation and can identify new categories without requiring any additional training data.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interpreting camera data is key for autonomously acting systems, such as
autonomous vehicles. Vision systems that operate in real-world environments
must be able to understand their surroundings and need the ability to deal with
novel situations. This paper tackles open-world semantic segmentation, i.e.,
the variant of interpreting image data in which objects occur that have not
been seen during training. We propose a novel approach that performs accurate
closed-world semantic segmentation and, at the same time, can identify new
categories without requiring any additional training data. For every newly
discovered class in an image, our approach additionally provides a similarity
measure to a known category, which can be useful in downstream tasks such as
planning or mapping. Through extensive experiments, we show that
our model achieves state-of-the-art results on classes known from training data
as well as for anomaly segmentation and can distinguish between different
unknown classes.
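The abstract does not spell out how the per-class similarity measure is computed, but the general idea it describes, i.e. flagging pixels that do not fit any known class while still reporting which known category they most resemble, can be sketched with class prototypes and cosine similarity. This is a minimal illustrative assumption, not the authors' actual method; the function name, the prototype scheme, and the threshold are all hypothetical:

```python
import numpy as np

def class_similarity(embeddings, prototypes, threshold=0.8):
    """Compare feature embeddings against known-class prototypes.

    embeddings: (n_pixels, d) per-pixel feature vectors.
    prototypes: (n_known_classes, d) one mean feature per known class.
    Returns the most similar known class per pixel, its cosine
    similarity, and a flag marking low-similarity (unknown) pixels.
    """
    # Normalize rows so the dot product equals cosine similarity.
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = e @ p.T                 # (n_pixels, n_known_classes)
    closest = sims.argmax(axis=1)  # most similar known category
    best = sims.max(axis=1)        # similarity to that category
    is_unknown = best < threshold  # low similarity -> novel class
    return closest, best, is_unknown
```

Even a pixel flagged as unknown keeps its `closest` and `best` values, which is the kind of similarity information the paper argues is useful for downstream planning or mapping.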
Related papers
- Exploiting Contextual Uncertainty of Visual Data for Efficient Training of Deep Models (arXiv, 2024-11-04)
  We introduce the notion of contextual diversity for active learning (CDAL). We propose a data-repair algorithm to curate contextually fair data and reduce model bias. We are developing an image retrieval system for wildlife camera-trap images and a reliable warning system for poor-quality rural roads.
- SegPrompt: Boosting Open-world Segmentation via Category-level Prompt Learning (arXiv, 2023-08-12)
  Open-world instance segmentation (OWIS) models detect unknown objects in a class-agnostic manner. Previous OWIS approaches completely erase category information during training to preserve the model's ability to generalize to unknown objects. We propose a novel training mechanism, SegPrompt, that instead uses category information to improve the model's class-agnostic segmentation ability.
- Exploring Open-Vocabulary Semantic Segmentation without Human Labels (arXiv, 2023-06-01)
  We present ZeroSeg, a novel method that leverages a pretrained vision-language (VL) model to train semantic segmentation models without human labels. ZeroSeg distills the visual concepts learned by VL models into a set of segment tokens, each summarizing a localized region of the target image. Our approach achieves state-of-the-art performance compared to other zero-shot segmentation methods under the same training data.
- Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding (arXiv, 2022-07-18)
  We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any dense annotations. Our method can directly segment objects of arbitrary categories, outperforming zero-shot segmentation methods that require data labeling on three benchmark datasets.
- SATS: Self-Attention Transfer for Continual Semantic Segmentation (arXiv, 2022-03-15)
  Continual semantic segmentation suffers from the same catastrophic forgetting issue as continual classification learning. This study proposes transferring a new type of knowledge-relevant information, namely the relationships between elements within each image. This relationship information can be effectively obtained from the self-attention maps of a Transformer-style segmentation model.
- Mutual Information-based Disentangled Neural Networks for Classifying Unseen Categories in Different Domains: Application to Fetal Ultrasound Imaging (arXiv, 2020-10-30)
  Deep neural networks exhibit limited generalizability across images with different entangled domain and categorical features. We propose Mutual Information-based Disentangled Neural Networks (MIDNet), which extract generalizable categorical features to transfer knowledge to unseen categories in a target domain. We extensively evaluate the proposed method on fetal ultrasound datasets for two different image classification tasks.
- Region Comparison Network for Interpretable Few-shot Image Classification (arXiv, 2020-09-08)
  Few-shot image classification uses only a limited number of labeled examples to train models for new classes. We propose a metric-learning-based method, the Region Comparison Network (RCN), which is able to reveal how few-shot learning works. We also present a new way to generalize interpretability from the level of tasks to the level of categories.
- Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation (arXiv, 2020-07-03)
  Two neural co-attentions are incorporated into the classifier to capture cross-image semantic similarities and differences. Beyond boosting object-pattern learning, the co-attention can leverage context from other related images to improve localization-map inference. Our algorithm sets new state-of-the-art results on all these settings, demonstrating its efficacy and generalizability.
- Learning unbiased zero-shot semantic segmentation networks via transductive transfer (arXiv, 2020-07-01)
  We propose an easy-to-implement transductive approach to alleviate the prediction bias in zero-shot semantic segmentation. Our method assumes both source images with full pixel-level labels and unlabeled target images are available during training.
This list is automatically generated from the titles and abstracts of the papers on this site.