Learning to Segment from Scribbles using Multi-scale Adversarial
Attention Gates
- URL: http://arxiv.org/abs/2007.01152v3
- Date: Thu, 25 Mar 2021 15:54:11 GMT
- Title: Learning to Segment from Scribbles using Multi-scale Adversarial
Attention Gates
- Authors: Gabriele Valvano, Andrea Leo, Sotirios A. Tsaftaris
- Abstract summary: Weakly-supervised learning can train models by relying on weaker forms of annotation, such as scribbles.
We train a multi-scale GAN to generate realistic segmentation masks at multiple resolutions, while we use scribbles to learn their correct position in the image.
Central to the model's success is a novel attention gating mechanism, which we condition with adversarial signals to act as a shape prior.
- Score: 16.28285034098361
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large, fine-grained image segmentation datasets, annotated at pixel-level,
are difficult to obtain, particularly in medical imaging, where annotations
also require expert knowledge. Weakly-supervised learning can train models by
relying on weaker forms of annotation, such as scribbles. Here, we learn to
segment using scribble annotations in an adversarial game. With unpaired
segmentation masks, we train a multi-scale GAN to generate realistic
segmentation masks at multiple resolutions, while we use scribbles to learn
their correct position in the image. Central to the model's success is a novel
attention gating mechanism, which we condition with adversarial signals to act
as a shape prior, resulting in better object localization at multiple scales.
Subject to adversarial conditioning, the segmentor learns attention maps that
are semantic, suppress the noisy activations outside the objects, and reduce
the vanishing gradient problem in the deeper layers of the segmentor. We
evaluated our model on several medical (ACDC, LVSC, CHAOS) and non-medical
(PPSS) datasets, and we report performance levels matching those achieved by
models trained with fully annotated segmentation masks. We also demonstrate
extensions in a variety of settings: semi-supervised learning; combining
multiple scribble sources (a crowdsourcing scenario) and multi-task learning
(combining scribble and mask supervision). We release expert-made scribble
annotations for the ACDC dataset, and the code used for the experiments, at
https://vios-s.github.io/multiscale-adversarial-attention-gates
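As a rough illustration of the training signals described in the abstract, the following PyTorch-style sketch combines a partial cross-entropy loss restricted to scribbled pixels with an adversarial term on the mask predicted at every decoder scale, the same prediction that gates the decoder features. All names (partial_cross_entropy, attention_gate, adv_weight), the loss weighting, and the choice of channel 0 as background are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def partial_cross_entropy(logits, scribbles, ignore_index=-1):
    # Scribble supervision: only annotated pixels carry a class label;
    # unannotated pixels hold ignore_index and contribute no gradient.
    return F.cross_entropy(logits, scribbles, ignore_index=ignore_index)

def attention_gate(features, mask_logits):
    # Gate decoder features with the soft foreground map predicted at this
    # scale (channel 0 assumed to be background). Because the same prediction
    # is scored by the discriminator, adversarial gradients push the attention
    # map towards plausible object shapes.
    foreground = 1.0 - torch.softmax(mask_logits, dim=1)[:, :1]
    return features * foreground

def segmentor_loss(multi_scale_logits, scribbles, discriminator, adv_weight=0.1):
    # Scribble loss on the full-resolution prediction (assumed to be the last
    # element) plus an adversarial loss at every scale of the multi-scale GAN.
    loss = partial_cross_entropy(multi_scale_logits[-1], scribbles)
    for logits in multi_scale_logits:
        score = discriminator(torch.softmax(logits, dim=1))
        loss = loss + adv_weight * F.binary_cross_entropy_with_logits(
            score, torch.ones_like(score))
    return loss
```

In the full method, the discriminator itself is trained in the usual adversarial fashion, contrasting the segmentor's multi-scale predictions with unpaired real masks.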
Related papers
- FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models [56.71672127740099]
We focus on the task of image segmentation, which is traditionally solved by training models on closed-vocabulary datasets.
We leverage several relatively small, open-source foundation models for zero-shot open-vocabulary segmentation.
Our approach (dubbed FreeSeg-Diff), which does not rely on any training, outperforms many training-based approaches on both Pascal VOC and COCO datasets.
arXiv Detail & Related papers (2024-03-29T10:38:25Z)
- SLiMe: Segment Like Me [24.254744102347413]
We propose SLiMe to segment images at any desired granularity using as few as one annotated sample.
We carried out an extensive set of experiments examining various design factors and showed that SLiMe outperforms existing one-shot and few-shot segmentation methods.
arXiv Detail & Related papers (2023-09-06T17:39:05Z)
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments [72.6405488990753]
Self-supervised learning can be used to mitigate the need of Vision Transformer networks for very large annotated datasets.
We propose a single-stage and standalone method, MOCA, which unifies both desired properties.
We achieve new state-of-the-art results on low-shot settings and strong experimental results in various evaluation protocols.
arXiv Detail & Related papers (2023-07-18T15:46:20Z)
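Reading the MOCA title literally, a minimal sketch of predicting masked online codebook assignments could look as follows; the teacher/student split, the tensor shapes, and every function name are assumptions made for illustration, not the paper's actual method or code.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def codebook_assignments(teacher, codebook, images):
    # Teacher sees the full image; each patch feature is assigned to its
    # nearest codebook entry, giving one discrete target per patch.
    feats = F.normalize(teacher(images), dim=-1)   # (B, N, D) patch features
    codes = F.normalize(codebook, dim=-1)          # (K, D) codebook entries
    return (feats @ codes.T).argmax(dim=-1)        # (B, N) code indices

def masked_assignment_loss(student, teacher, codebook, images, mask):
    # Student sees a masked view and predicts, at masked positions only,
    # which codebook entry the teacher assigned to the hidden patch.
    targets = codebook_assignments(teacher, codebook, images)
    logits = student(images, mask)                 # (B, N, K) scores over codes
    masked = mask.bool()                           # (B, N) True where hidden
    return F.cross_entropy(logits[masked], targets[masked])
```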
- A Self-Training Framework Based on Multi-Scale Attention Fusion for Weakly Supervised Semantic Segmentation [7.36778096476552]
We propose a self-training method that utilizes fused multi-scale class-aware attention maps.
We collect attention maps computed at different input scales and fuse them into multi-scale attention maps.
We then apply denoising and reactivation strategies to enhance the potential regions and reduce noisy areas.
arXiv Detail & Related papers (2023-05-10T02:16:12Z)
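To make the fusion step described in the entry above concrete, here is a small sketch of averaging class-aware attention maps computed at several input scales and thresholding them into pseudo-labels; the scales, the threshold, and the hypothetical model.class_attention call are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def fused_multi_scale_attention(model, image, scales=(0.5, 1.0, 1.5)):
    # Run the classifier at several input scales, resize each class-aware
    # attention map back to the original resolution, and average them.
    _, _, H, W = image.shape
    maps = []
    for s in scales:
        scaled = F.interpolate(image, scale_factor=s, mode="bilinear",
                               align_corners=False)
        cam = model.class_attention(scaled)        # (B, C, h, w), hypothetical API
        maps.append(F.interpolate(cam, size=(H, W), mode="bilinear",
                                  align_corners=False))
    fused = torch.stack(maps).mean(dim=0)
    return fused / fused.amax(dim=(2, 3), keepdim=True).clamp(min=1e-6)

def pseudo_labels(fused_cam, threshold=0.3, ignore_index=255):
    # Denoising step (illustrative): keep confident activations as labels and
    # mark uncertain pixels as "ignore" so they do not supervise training.
    confidence, labels = fused_cam.max(dim=1)
    labels[confidence < threshold] = ignore_index
    return labels
```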
- Self-attention on Multi-Shifted Windows for Scene Segmentation [14.47974086177051]
We explore the effective use of self-attention within multi-scale image windows to learn descriptive visual features.
We propose three different strategies to aggregate these feature maps to decode the feature representation for dense prediction.
Our models achieve very promising performance on four public scene segmentation datasets.
arXiv Detail & Related papers (2022-07-10T07:36:36Z)
- Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization [98.46318529630109]
We take inspiration from traditional spectral segmentation methods by reframing image decomposition as a graph partitioning problem.
We find that these eigenvectors already decompose an image into meaningful segments, and can be readily used to localize objects in a scene.
By clustering the features associated with these segments across a dataset, we can obtain well-delineated, nameable regions.
arXiv Detail & Related papers (2022-05-16T17:47:44Z)
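A bare-bones version of the graph-partitioning idea above, assuming per-patch features from any self-supervised backbone (all names and the clustering choice are illustrative): build a feature affinity matrix, take the Laplacian eigenvectors, and cluster the resulting spectral embedding into segments.

```python
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def spectral_segments(patch_feats, num_segments=4):
    # patch_feats: (N, D) features for the N patches of one image,
    # e.g. extracted from a self-supervised ViT backbone.
    f = F.normalize(patch_feats, dim=1)
    affinity = (f @ f.T).clamp(min=0)      # keep only non-negative affinities
    degree = torch.diag(affinity.sum(dim=1))
    laplacian = degree - affinity
    # Eigenvectors with the smallest eigenvalues encode the coarse partition
    # of the affinity graph; the first (constant) one is skipped.
    _, eigvecs = torch.linalg.eigh(laplacian)
    embedding = eigvecs[:, 1:num_segments].cpu().numpy()
    # Cluster patches in the spectral embedding to obtain segment ids.
    return KMeans(n_clusters=num_segments, n_init=10).fit_predict(embedding)
```

Grouping such segments across a dataset by their features, as the entry notes, is what yields well-delineated, nameable regions.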
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
arXiv Detail & Related papers (2020-12-09T12:40:13Z)
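One way to picture "input images are represented as graph nodes" is the toy message-passing step below; the similarity-based adjacency, the number of rounds, and the function name are assumptions for illustration only, not the paper's GNN.

```python
import torch
import torch.nn.functional as F

def group_wise_message_passing(image_feats, num_rounds=2):
    # image_feats: (N, D) pooled features, one node per image in a group of
    # images that share the same image-level class label.
    normed = F.normalize(image_feats, dim=1)
    adjacency = torch.softmax(normed @ normed.T, dim=1)  # soft edges by similarity
    h = image_feats
    for _ in range(num_rounds):
        # Each image aggregates evidence from related images in the group,
        # which can then support more reliable pseudo ground-truth estimation.
        h = F.relu(adjacency @ h)
    return h
```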
- Deep Active Learning for Joint Classification & Segmentation with Weak Annotator [22.271760669551817]
CNN visualization and interpretation methods, like class-activation maps (CAMs), are typically used to highlight the image regions linked to class predictions.
We propose an active learning framework, which progressively integrates pixel-level annotations during training.
Our results indicate that, even with simple random sample selection, the proposed approach can significantly outperform state-of-the-art CAM-based and active learning (AL) methods.
arXiv Detail & Related papers (2020-10-10T03:25:54Z)
- Semi-supervised few-shot learning for medical image segmentation [21.349705243254423]
Recent attempts to alleviate the need for large annotated datasets have developed training strategies under the few-shot learning paradigm.
We propose a novel few-shot learning framework for semantic segmentation, where unlabeled images are also made available at each episode.
We show that including unlabeled surrogate tasks in the episodic training leads to more powerful feature representations.
arXiv Detail & Related papers (2020-03-18T20:37:18Z)
- Improving Few-shot Learning by Spatially-aware Matching and CrossTransformer [116.46533207849619]
We study the impact of scale and location mismatch in the few-shot learning scenario.
We propose a novel Spatially-aware Matching scheme to effectively perform matching across multiple scales and locations.
arXiv Detail & Related papers (2020-01-06T14:10:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.