Efficient Full Image Interactive Segmentation by Leveraging Within-image
Appearance Similarity
- URL: http://arxiv.org/abs/2007.08173v1
- Date: Thu, 16 Jul 2020 08:21:59 GMT
- Title: Efficient Full Image Interactive Segmentation by Leveraging Within-image
Appearance Similarity
- Authors: Mykhaylo Andriluka, Stefano Pellegrini, Stefan Popov, Vittorio Ferrari
- Abstract summary: We propose a new approach to interactive full-image semantic segmentation.
We leverage a key observation: propagation from labeled to unlabeled pixels does not necessarily require class-specific knowledge.
We build on this observation and propose an approach capable of jointly propagating pixel labels from multiple classes.
- Score: 39.17599924322882
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new approach to interactive full-image semantic segmentation
which enables quickly collecting training data for new datasets with previously
unseen semantic classes (A demo is available at https://youtu.be/yUk8D5gEX-o).
We leverage a key observation: propagation from labeled to unlabeled pixels
does not necessarily require class-specific knowledge, but can be done purely
based on appearance similarity within an image. We build on this observation
and propose an approach capable of jointly propagating pixel labels from
multiple classes without having explicit class-specific appearance models. To
enable long-range propagation, our approach first globally measures appearance
similarity between labeled and unlabeled pixels across the entire image. Then
it locally integrates per-pixel measurements which improves the accuracy at
boundaries and removes noisy label switches in homogeneous regions. We also
design an efficient manual annotation interface that extends the traditional
polygon drawing tools with a suite of additional convenient features (and add
automatic propagation to it). Experiments with human annotators on the COCO
Panoptic Challenge dataset show that the combination of our better manual
interface and our novel automatic propagation mechanism reduces
annotation time by more than a factor of two compared to polygon drawing. We also
test our method on the ADE-20k and Fashionista datasets without any
dataset-specific adaptation or retraining, demonstrating that it can
generalize to new datasets and visual classes.
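The two-stage propagation described in the abstract (a global appearance-similarity measurement between labeled and unlabeled pixels, followed by local integration to clean up boundaries and homogeneous regions) can be sketched in a few lines of numpy. This is not the authors' implementation: the function name, the use of raw per-pixel features as the similarity space, and the 3x3 majority vote standing in for local integration are all illustrative assumptions.

```python
import numpy as np

def propagate_labels(features, labels):
    """Class-agnostic label propagation sketch.

    features: (h, w, c) float array of per-pixel appearance features.
    labels:   (h, w) int array; labeled pixels hold a class id >= 0,
              unlabeled pixels hold -1.
    Global step: each pixel takes the label of the most similar labeled
    pixel anywhere in the image. Local step: a 3x3 majority vote removes
    isolated, noisy label switches.
    """
    h, w, c = features.shape
    ys, xs = np.nonzero(labels >= 0)          # positions of labeled pixels
    seed_feats = features[ys, xs]             # (n, c) labeled features
    seed_labels = labels[ys, xs]              # (n,) their class ids

    flat = features.reshape(-1, c)
    # Global step: squared distance from every pixel to every labeled pixel.
    d2 = ((flat[:, None, :] - seed_feats[None, :, :]) ** 2).sum(-1)
    out = seed_labels[d2.argmin(1)].reshape(h, w)

    # Local step: majority vote over each interior 3x3 neighborhood.
    smoothed = out.copy()
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            patch = out[y - 1:y + 2, x - 1:x + 2].ravel()
            vals, counts = np.unique(patch, return_counts=True)
            smoothed[y, x] = vals[counts.argmax()]
    return smoothed
```

Because labels from all classes are propagated jointly by appearance alone, nothing in this sketch is specific to any class vocabulary, which is what lets the approach transfer to unseen datasets.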
Related papers
- Improving Human-Object Interaction Detection via Virtual Image Learning [68.56682347374422]
Human-Object Interaction (HOI) detection aims to understand the interactions between humans and objects.
In this paper, we propose to alleviate the impact of such an unbalanced distribution via Virtual Image Learning (VIL).
A novel label-to-image approach, Multiple Steps Image Creation (MUSIC), is proposed to create a high-quality dataset that has a consistent distribution with real images.
arXiv Detail & Related papers (2023-08-04T10:28:48Z)
- CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation [73.89509052503222]
This paper presents a simple but performant semi-supervised semantic segmentation approach, called CorrMatch.
We observe that the correlation maps not only enable clustering pixels of the same category easily but also contain good shape information.
We propose to conduct pixel propagation by modeling pairwise pixel similarities, spreading labels from high-confidence pixels and mining additional ones.
Then, we perform region propagation to enhance the pseudo labels with accurate class-agnostic masks extracted from the correlation maps.
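A toy sketch of the correlation-based pixel propagation described above (this is not the CorrMatch code: the function names, the cosine-similarity correlation map, and the confidence threshold are illustrative assumptions):

```python
import numpy as np

def correlation_map(feats):
    """Pairwise cosine similarity between all pixel features.
    feats: (n, c) array of per-pixel features -> (n, n) correlation map."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return f @ f.T

def propagate_high_confidence(corr, conf_labels, threshold=0.9):
    """Spread the label of each high-confidence pixel (label >= 0) to all
    pixels strongly correlated with it; -1 marks low-confidence pixels."""
    out = conf_labels.copy()
    for i, lab in enumerate(conf_labels):
        if lab >= 0:  # confident seed pixel
            out[corr[i] > threshold] = lab
    return out
```

The same correlation map can then be thresholded into class-agnostic region masks for the region-propagation step.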
arXiv Detail & Related papers (2023-06-07T10:02:29Z)
- High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation [17.804090651425955]
Image-level weakly-supervised segmentation (WSSS) reduces the usually vast data annotation cost by using surrogate segmentation masks during training.
Our work is based on two techniques for improving CAMs; importance sampling, which is a substitute for GAP, and the feature similarity loss.
We reformulate both techniques based on binomial posteriors of multiple independent binary problems.
This has two benefits: performance improves and the techniques become more general, resulting in an add-on method that can boost virtually any WSSS method.
arXiv Detail & Related papers (2023-04-05T17:43:57Z)
- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed as Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
- A Closer Look at Self-training for Zero-Label Semantic Segmentation [53.4488444382874]
Being able to segment unseen classes not observed during training is an important technical challenge in deep learning.
Prior zero-label semantic segmentation works approach this task by learning visual-semantic embeddings or generative models.
We propose a consistency regularizer to filter out noisy pseudo-labels by taking the intersections of the pseudo-labels generated from different augmentations of the same image.
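The intersection-based consistency regularizer can be illustrated with a small numpy sketch (the function name is hypothetical, and the paper's actual filtering operates on model predictions over augmented views of the same image):

```python
import numpy as np

def intersect_pseudo_labels(preds, ignore=-1):
    """Keep a pseudo-label only where all augmented views agree.

    preds: list of (n,) int arrays, one prediction map per augmentation.
    Pixels where any view disagrees are marked with `ignore` so the
    self-training loss can skip them as noisy.
    """
    base = preds[0]
    agree = np.ones_like(base, dtype=bool)
    for p in preds[1:]:
        agree &= (p == base)
    return np.where(agree, base, ignore)
```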
arXiv Detail & Related papers (2021-04-21T14:34:33Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- Deep Active Learning for Joint Classification & Segmentation with Weak Annotator [22.271760669551817]
CNN visualization and interpretation methods, like class-activation maps (CAMs), are typically used to highlight the image regions linked to class predictions.
We propose an active learning framework, which progressively integrates pixel-level annotations during training.
Our results indicate that, by simply using random sample selection, the proposed approach can significantly outperform state-of-the-art CAMs and AL methods.
arXiv Detail & Related papers (2020-10-10T03:25:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.