Beyond pixel-wise supervision for segmentation: A few global shape
descriptors might be surprisingly good!
- URL: http://arxiv.org/abs/2105.00859v1
- Date: Mon, 3 May 2021 13:44:36 GMT
- Title: Beyond pixel-wise supervision for segmentation: A few global shape
descriptors might be surprisingly good!
- Authors: Hoel Kervadec and Houda Bahig and Laurent Letourneau-Guillon and Jose
Dolz and Ismail Ben Ayed
- Abstract summary: Standard losses for training deep segmentation networks could be seen as individual classifications of pixels, instead of supervising the global shape of the predicted segmentations.
This study investigates how effective global geometric shape descriptors could be, when used on their own as segmentation losses for training deep networks.
- Score: 16.293620755563854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Standard losses for training deep segmentation networks could be seen as
individual classifications of pixels, instead of supervising the global shape
of the predicted segmentations. While effective, they require exact knowledge
of the label of each pixel in an image.
This study investigates how effective global geometric shape descriptors
could be, when used on their own as segmentation losses for training deep
networks. Beyond their theoretical interest, there are deeper motivations for
posing segmentation problems as the reconstruction of shape descriptors:
Annotations to obtain approximations of low-order shape moments could be much
less cumbersome than their full-mask counterparts, and anatomical priors could
be readily encoded into invariant shape descriptions, which might alleviate the
annotation burden. Also, and most importantly, we hypothesize that, given a
task, certain shape descriptions might be invariant across image acquisition
protocols/modalities and subject populations, which might open interesting
research avenues for generalization in medical image segmentation.
We introduce and formulate a few shape descriptors in the context of deep
segmentation, and evaluate their potential as standalone losses on two
different challenging tasks. Inspired by recent works in constrained
optimization for deep networks, we propose a way to use those descriptors to
supervise segmentation, without any pixel-level label. Very surprisingly, as
few as 4 descriptor values per class can approach the performance of a
segmentation mask with 65k individual discrete labels. We also found that shape
descriptors can be a valid way to encode anatomical priors about the task,
making it possible to leverage expert knowledge without additional annotations. Our
implementation is publicly available and can be easily extended to other tasks
and descriptors: https://github.com/hkervadec/shape_descriptors
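To make the core idea concrete, here is a minimal PyTorch sketch of the kind of loss involved: differentiable low-order moments (size, centroid, spread) are computed from soft predictions and matched against a handful of target values. The particular choice of four descriptors, the function names, and the plain squared penalty are illustrative assumptions; the paper's own descriptor set and its constrained-optimization formulation are in the repository linked above.

```python
import torch

def shape_descriptors(probs: torch.Tensor) -> torch.Tensor:
    """Differentiable low-order shape moments of a soft segmentation.

    probs: (B, H, W) foreground probabilities for one class.
    Returns (B, 4): size, centroid row/col, spread (an illustrative
    choice of 4 descriptors; the paper defines its own set).
    """
    b, h, w = probs.shape
    ys = torch.arange(h, dtype=probs.dtype, device=probs.device).view(1, h, 1)
    xs = torch.arange(w, dtype=probs.dtype, device=probs.device).view(1, 1, w)

    size = probs.sum(dim=(1, 2)).clamp_min(1e-6)   # zeroth moment
    cy = (probs * ys).sum(dim=(1, 2)) / size       # first moments (centroid)
    cx = (probs * xs).sum(dim=(1, 2)) / size
    # second central moment, a crude "spread" descriptor
    spread = (probs * ((ys - cy.view(-1, 1, 1)) ** 2
                       + (xs - cx.view(-1, 1, 1)) ** 2)).sum(dim=(1, 2)) / size
    return torch.stack([size, cy, cx, spread], dim=1)

def descriptor_loss(probs: torch.Tensor, target_desc: torch.Tensor) -> torch.Tensor:
    """Penalize deviation of predicted descriptors from (B, 4) targets."""
    return ((shape_descriptors(probs) - target_desc) ** 2).mean()
```

Note that a plain squared penalty is only a stand-in: the abstract points to recent constrained-optimization techniques for deep networks as the actual mechanism for imposing descriptor values during training.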
Related papers
- Towards Open-Vocabulary Semantic Segmentation Without Semantic Labels [53.8817160001038]
We propose a novel method, PixelCLIP, to adapt the CLIP image encoder for pixel-level understanding.
To address the challenges of leveraging masks without semantic labels, we devise an online clustering algorithm.
PixelCLIP shows significant performance improvements over CLIP and competitive results compared to caption-supervised methods.
arXiv Detail & Related papers (2024-09-30T01:13:03Z)
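For a sense of what online clustering over unlabeled masks can look like, here is a generic mini-batch k-means update over mask embeddings. This is a hedged sketch of the general technique, not PixelCLIP's actual algorithm; the function name and learning rate are illustrative.

```python
import torch

@torch.no_grad()
def online_kmeans_step(centroids: torch.Tensor,
                       embeddings: torch.Tensor,
                       lr: float = 0.05) -> torch.Tensor:
    """One online k-means update over a batch of mask embeddings.

    centroids:  (K, D) current cluster centers.
    embeddings: (N, D) image-encoder embeddings of class-agnostic masks.
    Each centroid drifts toward the mean of its assigned embeddings.
    """
    assign = torch.cdist(embeddings, centroids).argmin(dim=1)   # (N,)
    for k in range(centroids.shape[0]):
        members = embeddings[assign == k]
        if len(members) > 0:
            centroids[k] = (1 - lr) * centroids[k] + lr * members.mean(dim=0)
    return centroids
```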
- A Point-Neighborhood Learning Framework for Nasal Endoscope Image Segmentation [43.0260204534598]
We propose a weakly semi-supervised method called the Point-Neighborhood Learning (PNL) framework.
To mine the prior of the pixels surrounding the annotated point, we transform a single-point annotation into a circular area named a point-neighborhood.
Our method greatly improves performance without changing the structure of the segmentation network.
arXiv Detail & Related papers (2024-05-30T13:25:25Z)
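The point-to-neighborhood expansion described in that entry is simple to sketch: a single annotated point becomes a disk-shaped mask. The radius below is an arbitrary illustrative value, not the paper's setting.

```python
import numpy as np

def point_neighborhood(h: int, w: int, py: int, px: int,
                       radius: float = 10.0) -> np.ndarray:
    """Expand a single point annotation (py, px) into a circular
    point-neighborhood mask of the given radius."""
    ys, xs = np.ogrid[:h, :w]
    dist2 = (ys - py) ** 2 + (xs - px) ** 2
    return (dist2 <= radius ** 2).astype(np.float32)   # (H, W) binary disk
```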
- Unsupervised Universal Image Segmentation [59.0383635597103]
We propose an Unsupervised Universal model (U2Seg) adept at performing various image segmentation tasks.
U2Seg generates pseudo semantic labels for these segmentation tasks by leveraging self-supervised models.
We then self-train the model on these pseudo semantic labels, yielding substantial performance gains.
arXiv Detail & Related papers (2023-12-28T18:59:04Z)
- Learning Semantic Segmentation with Query Points Supervision on Aerial Images [57.09251327650334]
We present a weakly supervised learning algorithm to train semantic segmentation models.
Our proposed approach performs accurate semantic segmentation and improves efficiency by significantly reducing the cost and time required for manual annotation.
arXiv Detail & Related papers (2023-09-11T14:32:04Z)
- Learning to segment from object sizes [0.0]
We propose an algorithm for training a deep segmentation network from a dataset of a few pixel-wise annotated images and many images with known object sizes.
The algorithm minimizes a discrete (non-differentiable) loss function defined over the object sizes by sampling the gradient and then using the standard back-propagation algorithm.
arXiv Detail & Related papers (2022-07-01T09:34:44Z)
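A standard way to realize the "sampled gradient of a discrete loss" ingredient from that entry is a score-function (REINFORCE-style) estimator over Bernoulli masks sampled from the predicted probabilities. The sketch below shows that generic estimator; the paper's exact scheme may differ.

```python
import torch

def sampled_size_loss(probs: torch.Tensor, target_size: float,
                      n_samples: int = 8) -> torch.Tensor:
    """Surrogate whose gradient estimates d/dtheta E[|size(mask) - target|]
    for masks ~ Bernoulli(probs), via the score-function estimator.

    probs: (H, W) foreground probabilities produced by the network.
    """
    surrogate = probs.new_zeros(())
    for _ in range(n_samples):
        mask = torch.bernoulli(probs).detach()          # discrete sample
        loss = (mask.sum() - target_size).abs()         # non-differentiable loss
        log_p = (mask * (probs + 1e-8).log()
                 + (1 - mask) * (1 - probs + 1e-8).log()).sum()
        surrogate = surrogate + loss.detach() * log_p   # REINFORCE term
    return surrogate / n_samples
```

Backpropagating through this surrogate yields an unbiased (if noisy) gradient estimate of the expected discrete size loss.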
- Segmenter: Transformer for Semantic Segmentation [79.9887988699159]
We introduce Segmenter, a transformer model for semantic segmentation.
We build on the recent Vision Transformer (ViT) and extend it to semantic segmentation.
It outperforms the state of the art on the challenging ADE20K dataset and performs on-par on Pascal Context and Cityscapes.
arXiv Detail & Related papers (2021-05-12T13:01:44Z)
- Sparse Object-level Supervision for Instance Segmentation with Pixel Embeddings [4.038011160363972]
Most state-of-the-art instance segmentation methods have to be trained on densely annotated images.
We propose a proposal-free segmentation approach based on non-spatial embeddings.
We evaluate the proposed method on challenging 2D and 3D segmentation problems in different microscopy modalities.
arXiv Detail & Related papers (2021-03-26T16:36:56Z)
- Semantically Meaningful Class Prototype Learning for One-Shot Image Semantic Segmentation [58.96902899546075]
One-shot semantic image segmentation aims to segment the object regions for the novel class with only one annotated image.
Recent works adopt the episodic training strategy to mimic the expected situation at testing time.
We propose to leverage multi-class label information during episodic training, which encourages the network to generate more semantically meaningful features for each category.
arXiv Detail & Related papers (2021-02-22T12:07:35Z)
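Class prototypes in this literature are commonly computed by masked average pooling of support features; below is a generic sketch of that operation, not this paper's exact multi-class variant.

```python
import torch

def masked_average_prototype(features: torch.Tensor,
                             mask: torch.Tensor) -> torch.Tensor:
    """Masked average pooling: a class prototype from support features.

    features: (C, H, W) feature map of the support image.
    mask:     (H, W) binary mask of the class region.
    Returns a (C,) prototype vector.
    """
    denom = mask.sum().clamp_min(1e-6)
    return (features * mask.unsqueeze(0)).sum(dim=(1, 2)) / denom
```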
- Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning.
Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector.
We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z)
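Read literally, the squeeze step amounts to pooling the spatial map into a channel-wise vector, reasoning over it, and re-weighting the feature map. The simplified block below (essentially squeeze-and-excitation-style) is an assumption about the general shape of the idea, not the paper's actual module.

```python
import torch
import torch.nn as nn

class SqueezeBlock(nn.Module):
    """Sketch: squeeze the spatial map into a channel vector, reason over
    channels, and re-scale the feature map. (Hypothetical simplification;
    the paper's reasoning step is more elaborate than this MLP.)"""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.reason = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:     # x: (B, C, H, W)
        v = x.mean(dim=(2, 3))                    # squeeze: (B, C) global vector
        w = self.reason(v)                        # channel-wise reasoning
        return x * w.unsqueeze(-1).unsqueeze(-1)  # broadcast back to the map
```

Because it maps a (B, C, H, W) tensor to another of the same shape, such a block can be dropped into an existing network, matching the entry's plug-and-play claim.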
- Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images [24.216869988183092]
We propose a shape-aware semi-supervised segmentation strategy to leverage abundant unlabeled data and to enforce a geometric shape constraint on the segmentation output.
We develop a multi-task deep network that jointly predicts the semantic segmentation and the signed distance map (SDM) of object surfaces.
Experiments show that our method outperforms current state-of-the-art approaches with improved shape estimation.
arXiv Detail & Related papers (2020-07-21T11:44:52Z)
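The SDM regression target from that entry can be derived from a ground-truth binary mask with standard distance transforms; here is a SciPy sketch (sign and normalization conventions vary between papers).

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_map(mask: np.ndarray) -> np.ndarray:
    """Signed distance map of a binary mask: negative inside the object,
    positive outside (one common discrete convention)."""
    mask = mask.astype(bool)
    if not mask.any() or mask.all():
        return np.zeros(mask.shape, dtype=np.float32)
    outside = distance_transform_edt(~mask)   # distance to object, outside it
    inside = distance_transform_edt(mask)     # distance to background, inside it
    return (outside - inside).astype(np.float32)
```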
- Few-Shot Semantic Segmentation Augmented with Image-Level Weak Annotations [23.02986307143718]
Recent progress in few-shot semantic segmentation tackles the issue with only a few pixel-level annotated examples.
Our key idea is to learn a better prototype representation of the class by fusing the knowledge from the image-level labeled data.
We propose a new framework, called PAIA, to learn the class prototype representation in a metric space by integrating image-level annotations.
arXiv Detail & Related papers (2020-07-03T04:58:20Z)