Learning unbiased zero-shot semantic segmentation networks via
transductive transfer
- URL: http://arxiv.org/abs/2007.00515v1
- Date: Wed, 1 Jul 2020 14:25:13 GMT
- Title: Learning unbiased zero-shot semantic segmentation networks via
transductive transfer
- Authors: Haiyang Liu, Yichen Wang, Jiayi Zhao, Guowu Yang, Fengmao Lv
- Abstract summary: We propose an easy-to-implement transductive approach to alleviate the prediction bias in zero-shot semantic segmentation.
Our method assumes both the source images with full pixel-level labels and unlabeled target images are available during training.
- Score: 14.55508599873219
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic segmentation, which aims to acquire a detailed understanding of
images, is an essential issue in computer vision. However, in practical
scenarios, new categories that are different from the categories in training
usually appear. Since it is impractical to collect labeled data for all
categories, how to conduct zero-shot learning in semantic segmentation
poses an important problem. Although the attribute embeddings of
categories can promote effective knowledge transfer across different
categories, the predictions of segmentation networks reveal an obvious bias toward seen
categories. In this paper, we propose an easy-to-implement transductive
approach to alleviate the prediction bias in zero-shot semantic segmentation.
Our method assumes that both the source images with full pixel-level labels and
unlabeled target images are available during training. To be specific, the
source images are used to learn the relationship between visual images and
semantic embeddings, while the target images are used to alleviate the
prediction bias towards seen categories. We conduct comprehensive experiments
on diverse splits of the PASCAL dataset. The experimental results clearly
demonstrate the effectiveness of our method.
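To make the bias problem described in the abstract concrete, the sketch below scores pixel features against per-category semantic embeddings and then applies a simple calibrated-stacking-style penalty to seen-class scores. All shapes, names, and the penalty heuristic here are illustrative assumptions, not the paper's method: the paper instead reduces the bias with unlabeled target images during transductive training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: D-dim pixel features, C categories with D-dim
# semantic embeddings (e.g. word vectors); the first S categories are
# "seen" during training, the rest are unseen.
D, C, S = 8, 5, 3
H, W = 4, 4

embeddings = rng.normal(size=(C, D))   # one semantic vector per category
features = rng.normal(size=(H, W, D))  # projected pixel features

# Zero-shot scoring: similarity between pixel features and class embeddings.
scores = features @ embeddings.T       # shape (H, W, C)

# Bias reduction (a calibrated-stacking-style heuristic, not the paper's
# transductive objective): penalize seen-class scores by a margin gamma
# so unseen categories can compete at prediction time.
gamma = 0.5
calibrated = scores.copy()
calibrated[..., :S] -= gamma

pred_biased = scores.argmax(axis=-1)
pred_calibrated = calibrated.argmax(axis=-1)

# Lowering seen scores can only move predictions from seen to unseen,
# so the unseen fraction never decreases after calibration.
unseen_frac_before = (pred_biased >= S).mean()
unseen_frac_after = (pred_calibrated >= S).mean()
```

Transductive training, in contrast, adapts the network itself using the unlabeled target images, rather than post-hoc adjusting scores as this sketch does.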
Related papers
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any dense annotation effort.
Our method can directly segment objects of arbitrary categories and, on three benchmark datasets, outperforms zero-shot segmentation methods that require data labeling.
arXiv Detail & Related papers (2022-07-18T09:20:04Z)
- Hierarchical Semantic Segmentation using Psychometric Learning [17.417302703539367]
We develop a novel approach to collect segmentation annotations from experts based on psychometric testing.
Our method consists of the psychometric testing procedure, active query selection, query enhancement, and a deep metric learning model.
We show the merits of our method with evaluations on synthetically generated, aerial, and histology images.
arXiv Detail & Related papers (2021-07-07T13:38:33Z)
- Task-Independent Knowledge Makes for Transferable Representations for Generalized Zero-Shot Learning [77.0715029826957]
Generalized Zero-Shot Learning (GZSL) targets recognizing new categories by learning transferable image representations.
We propose a novel Dual-Contrastive Embedding Network (DCEN) that simultaneously learns task-specific and task-independent knowledge.
arXiv Detail & Related papers (2021-04-05T10:05:48Z)
- Grafit: Learning fine-grained image representations with coarse labels [114.17782143848315]
This paper tackles the problem of learning a finer representation than the one provided by training labels.
By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.
arXiv Detail & Related papers (2020-11-25T19:06:26Z)
- Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation [128.03739769844736]
Two neural co-attentions are incorporated into the classifier to capture cross-image semantic similarities and differences.
In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference.
Our algorithm sets new state-of-the-art results in all these settings, demonstrating its efficacy and generalizability.
arXiv Detail & Related papers (2020-07-03T21:53:46Z)
- Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that the availability of such external semantic information, in conjunction with the visual semantics from images, boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z)
- Realizing Pixel-Level Semantic Learning in Complex Driving Scenes based on Only One Annotated Pixel per Class [17.481116352112682]
We propose a new semantic segmentation task under complex driving scenes based on a weakly supervised condition.
A three-step process is built for pseudo-label generation, which progressively implements an optimal feature representation for each category.
Experiments on the Cityscapes dataset demonstrate that the proposed method provides a feasible way to solve the weakly supervised semantic segmentation task.
arXiv Detail & Related papers (2020-03-10T12:57:55Z)
- Relevance Prediction from Eye-movements Using Semi-interpretable Convolutional Neural Networks [9.007191808968242]
We propose an image-classification method to predict the perceived relevance of text documents from eye-movements.
An eye-tracking study was conducted where participants read short news articles, and rated them as relevant or irrelevant for answering a trigger question.
We encode participants' eye-movement scanpaths as images, and then train a convolutional neural network classifier using these scanpath images.
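The scanpath-to-image encoding described above can be sketched minimally as rasterizing fixations into a grayscale grid. The grid size, duration weighting, and variable names below are assumptions for illustration; the paper's actual encoding may differ.

```python
import numpy as np

# Hypothetical scanpath: a sequence of (x, y, fixation_duration) triples
# on a 32x32 grid.
scanpath = [(4, 5, 0.20), (10, 12, 0.35), (20, 8, 0.15)]

H = W = 32
image = np.zeros((H, W), dtype=np.float32)

# Rasterize: accumulate fixation duration at each fixated pixel, so the
# resulting grayscale image encodes where and how long the reader looked.
for x, y, dur in scanpath:
    image[y, x] += dur

# Normalize to [0, 1] so the array can be fed to a standard CNN classifier.
image /= image.max()
```

Encoding temporal gaze data as an image like this lets an off-the-shelf CNN handle variable-length scanpaths without sequence modeling.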
arXiv Detail & Related papers (2020-01-15T07:02:14Z)
- Discovering Latent Classes for Semi-Supervised Semantic Segmentation [18.5909667833129]
This paper studies the problem of semi-supervised semantic segmentation.
We learn latent classes consistent with semantic classes on labeled images.
We show that the proposed method achieves state-of-the-art results for semi-supervised semantic segmentation.
arXiv Detail & Related papers (2019-12-30T14:16:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.