Visual Semantic Segmentation Based on Few/Zero-Shot Learning: An
Overview
- URL: http://arxiv.org/abs/2211.08352v1
- Date: Sun, 13 Nov 2022 13:39:33 GMT
- Authors: Wenqi Ren, Yang Tang, Qiyu Sun, Chaoqiang Zhao, Qing-Long Han
- Abstract summary: This paper focuses on recently published few/zero-shot visual semantic segmentation methods, spanning 2D to 3D space.
Three typical instantiations are examined to reveal the interactions of few/zero-shot learning with visual semantic segmentation.
- Score: 47.10687511208189
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual semantic segmentation aims at separating a visual sample into diverse
blocks with specific semantic attributes and identifying the category of each
block, and it plays a crucial role in environmental perception. Conventional
learning-based visual semantic segmentation approaches rely heavily on
large-scale training data with dense annotations and consistently fail to
estimate accurate semantic labels for unseen categories. This limitation has
spurred intense interest in studying visual semantic segmentation with the
assistance of few/zero-shot learning. The emergence and rapid progress of
few/zero-shot visual semantic segmentation make it possible to learn unseen
categories from a few labeled samples, or even from none at all, which broadens
the extension to practical applications. Therefore, this paper focuses on
recently published few/zero-shot visual semantic segmentation methods, spanning
2D to 3D space, and explores the commonalities and discrepancies of technical
solutions under different segmentation circumstances. Specifically, the
preliminaries of few/zero-shot visual semantic segmentation, including the
problem definitions, typical datasets, and technical remedies, are briefly
reviewed and discussed. Moreover, three typical instantiations are examined to
reveal the interactions of few/zero-shot learning with visual semantic
segmentation: image semantic segmentation, video object segmentation, and 3D
segmentation. Finally, the future challenges of few/zero-shot visual semantic
segmentation are discussed.
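Many of the few-shot segmentation methods covered by surveys like this one share a common episodic recipe: extract a class prototype from the labeled support sample, then match every query pixel against it. A minimal NumPy sketch of that prototype-matching step (masked average pooling plus cosine similarity) is given below; the toy two-channel features and the 0.5 threshold are illustrative assumptions, not details from the paper:

```python
import numpy as np

def masked_average_prototype(support_feats, support_mask):
    """Average the support features inside the labeled mask (masked average pooling)."""
    # support_feats: (H, W, C) feature map; support_mask: (H, W) binary mask.
    masked = support_feats * support_mask[..., None]
    return masked.sum(axis=(0, 1)) / (support_mask.sum() + 1e-8)

def cosine_score_map(query_feats, prototype):
    """Cosine similarity between every query-pixel feature and the class prototype."""
    q = query_feats / (np.linalg.norm(query_feats, axis=-1, keepdims=True) + 1e-8)
    p = prototype / (np.linalg.norm(prototype) + 1e-8)
    return q @ p  # (H, W) similarity map

# Toy 1-shot episode: the target class has feature [1, 0], background [0, 1].
support = np.zeros((4, 4, 2)); support[:2, :, 0] = 1.0; support[2:, :, 1] = 1.0
mask = np.zeros((4, 4)); mask[:2, :] = 1.0
proto = masked_average_prototype(support, mask)
query = support.copy()  # identical layout, purely for illustration
pred = (cosine_score_map(query, proto) > 0.5).astype(int)  # recovers the mask
```

In real systems the features come from a pretrained backbone and the threshold is replaced by a learned decoder, but the support-prototype/query-matching structure is the same.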
Related papers
- A Bottom-Up Approach to Class-Agnostic Image Segmentation [4.086366531569003]
We present a novel bottom-up formulation for addressing the class-agnostic segmentation problem.
We supervise our network directly on the projective sphere of its feature space.
Our bottom-up formulation exhibits exceptional generalization capability, even when trained on datasets designed for class-based segmentation.
arXiv Detail & Related papers (2024-09-20T17:56:02Z)
- Image Segmentation in Foundation Model Era: A Survey [99.19456390358211]
Current research in image segmentation lacks a detailed analysis of distinct characteristics, challenges, and solutions associated with these advancements.
This survey seeks to fill this gap by providing a thorough review of cutting-edge research centered around FM-driven image segmentation.
An exhaustive overview of over 300 segmentation approaches is provided to encapsulate the breadth of current research efforts.
arXiv Detail & Related papers (2024-08-23T10:07:59Z)
- Primitive Generation and Semantic-related Alignment for Universal Zero-Shot Segmentation [13.001629605405954]
We study universal zero-shot segmentation in this work to achieve panoptic, instance, and semantic segmentation for novel categories without any training samples.
We introduce a generative model to synthesize features for unseen categories, which links semantic and visual spaces.
The proposed approach achieves impressive state-of-the-art performance on zero-shot panoptic segmentation, instance segmentation, and semantic segmentation.
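The feature-generation idea above can be illustrated with a deliberately simplified sketch: fit a mapping from class embeddings to visual features on seen classes, synthesize a visual prototype for an unseen class from its embedding alone, and classify by nearest prototype. The paper uses a learned generative model; the linear least-squares generator and the toy embeddings below are stand-in assumptions:

```python
import numpy as np

# Toy semantic (word-embedding) and visual spaces for three seen classes.
sem_seen = np.array([[1., 0.], [0., 1.], [1., 1.]])              # (3, 2) class embeddings
vis_seen = np.array([[2., 0., 0.], [0., 2., 0.], [2., 2., 0.]])  # (3, 3) mean visual features

# Fit a linear "generator" W mapping semantic -> visual space (least squares).
W, *_ = np.linalg.lstsq(sem_seen, vis_seen, rcond=None)

# Synthesize a visual prototype for an unseen class from its embedding alone.
sem_unseen = np.array([2., 1.])
vis_unseen = sem_unseen @ W  # links the semantic and visual spaces

def nearest_class(feat, prototypes):
    """Classify a pixel feature by its nearest seen/synthesized prototype."""
    return int(np.argmin(np.linalg.norm(prototypes - feat, axis=1)))

prototypes = np.vstack([vis_seen, vis_unseen])
pred = nearest_class(np.array([3.9, 2.1, 0.]), prototypes)  # index 3 = the unseen class
```

The point of the sketch is the training-free classification of unseen categories: once features can be synthesized from semantics, unseen classes join the label space without any of their pixels being observed.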
arXiv Detail & Related papers (2023-06-19T17:59:16Z)
- Exploring Open-Vocabulary Semantic Segmentation without Human Labels [76.15862573035565]
We present ZeroSeg, a novel method that leverages existing pretrained vision-language (VL) models to train semantic segmentation models.
ZeroSeg avoids the need for dense human annotations by distilling the visual concepts learned by VL models into a set of segment tokens, each summarizing a localized region of the target image.
Our approach achieves state-of-the-art performance when compared to other zero-shot segmentation methods under the same training data.
arXiv Detail & Related papers (2023-06-01T08:47:06Z)
- Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any dense annotations.
Our method can directly segment objects of arbitrary categories, outperforming zero-shot segmentation methods that require data labeling on three benchmark datasets.
arXiv Detail & Related papers (2022-07-18T09:20:04Z)
- TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation [44.75300205362518]
Unsupervised semantic segmentation aims to obtain high-level semantic representation on low-level visual features without manual annotations.
We propose the first top-down unsupervised semantic segmentation framework for fine-grained segmentation in extremely complicated scenarios.
Our results show that our top-down unsupervised segmentation is robust to both object-centric and scene-centric datasets.
arXiv Detail & Related papers (2021-12-02T18:59:03Z)
- Robust 3D Scene Segmentation through Hierarchical and Learnable Part-Fusion [9.275156524109438]
3D semantic segmentation is a fundamental building block for several scene understanding applications such as autonomous driving, robotics and AR/VR.
Previous methods have utilized hierarchical, iterative methods to fuse semantic and instance information, but they lack learnability in context fusion.
This paper presents Segment-Fusion, a novel attention-based method for hierarchical fusion of semantic and instance information.
arXiv Detail & Related papers (2021-11-16T13:14:47Z)
- Visual Boundary Knowledge Translation for Foreground Segmentation [57.32522585756404]
We make an attempt towards building models that explicitly account for visual boundary knowledge, in the hope of reducing the training effort for segmenting unseen categories.
With only tens of labeled samples as guidance, Trans-Net achieves results on par with fully supervised methods.
arXiv Detail & Related papers (2021-08-01T07:10:25Z) - Part-aware Prototype Network for Few-shot Semantic Segmentation [50.581647306020095]
We propose a novel few-shot semantic segmentation framework based on the prototype representation.
Our key idea is to decompose the holistic class representation into a set of part-aware prototypes.
We develop a novel graph neural network model to generate and enhance the proposed part-aware prototypes.
arXiv Detail & Related papers (2020-07-13T11:03:09Z)
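The part-aware decomposition in the last entry can be sketched as clustering the masked support features into several part-level prototypes and scoring each query pixel by its best-matching part. The tiny k-means, its evenly spaced initialization, and the toy two-part object below are illustrative simplifications, not the paper's graph neural network:

```python
import numpy as np

def part_prototypes(feats, mask, k=2, iters=5):
    """Cluster masked support features into k part-level prototypes (tiny k-means)."""
    pts = feats[mask.astype(bool)]               # (N, C) features inside the object mask
    centers = pts[:: max(len(pts) // k, 1)][:k]  # evenly spaced init (sketch only)
    for _ in range(iters):
        assign = np.argmin(((pts[:, None] - centers) ** 2).sum(-1), axis=1)
        centers = np.stack([pts[assign == j].mean(0) if (assign == j).any()
                            else centers[j] for j in range(k)])
    return centers

def score_query(query_feats, prototypes):
    """Per-pixel score = best match over the part prototypes (max cosine similarity)."""
    q = query_feats / (np.linalg.norm(query_feats, axis=-1, keepdims=True) + 1e-8)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + 1e-8)
    return (q @ p.T).max(-1)

# Toy object with two visually distinct "parts" (features [1, 0] and [0, 1]).
feats = np.zeros((2, 4, 2)); feats[0, :, 0] = 1.0; feats[1, :, 1] = 1.0
mask = np.ones((2, 4))
protos = part_prototypes(feats, mask, k=2)
scores = score_query(feats, protos)  # every pixel matches one of the parts
```

A single holistic prototype would average the two parts away; keeping one prototype per part lets each pixel match whichever part it actually resembles, which is the intuition behind the decomposition.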
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.