Multi-task deep learning for image segmentation using recursive
approximation tasks
- URL: http://arxiv.org/abs/2005.13053v1
- Date: Tue, 26 May 2020 21:35:26 GMT
- Title: Multi-task deep learning for image segmentation using recursive
approximation tasks
- Authors: Rihuan Ke, Aurélie Bugeau, Nicolas Papadakis, Mark Kirkland, Peter
Schuetz, Carola-Bibiane Schönlieb
- Abstract summary: Deep neural networks for segmentation usually require a massive amount of pixel-level labels, which are expensive to create manually.
In this work, we develop a multi-task learning method to relax this constraint.
The network is trained on an extremely small amount of precisely segmented images and a large set of coarse labels.
- Score: 5.735162284272276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fully supervised deep neural networks for segmentation usually require a
massive amount of pixel-level labels, which are expensive to create manually. In
this work, we develop a multi-task learning method to relax this constraint. We
regard the segmentation problem as a sequence of approximation subproblems that
are recursively defined and in increasing levels of approximation accuracy. The
subproblems are handled by a framework that consists of 1) a segmentation task
that learns from pixel-level ground truth segmentation masks of a small
fraction of the images, 2) a recursive approximation task that conducts partial
object regions learning and data-driven mask evolution starting from partial
masks of each object instance, and 3) other problem-oriented auxiliary tasks
that are trained with sparse annotations and promote the learning of dedicated
features. Most training images are labeled only by (rough) partial masks, which
do not delineate exact object boundaries, rather than by their full segmentation
masks. During the training phase, the approximation task learns the statistics
of these partial masks, and the partial regions are recursively increased
towards object boundaries aided by the learned information from the
segmentation task in a fully data-driven fashion. The network is trained on an
extremely small amount of precisely segmented images and a large set of coarse
labels. Annotations can thus be obtained in a cheap way. We demonstrate the
efficiency of our approach in three applications with microscopy images and
ultrasound images.
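The recursive mask evolution described in the abstract can be sketched as follows. This is an illustrative toy, not the paper's implementation: the function name, the 4-neighbourhood growth rule, and the confidence threshold are assumptions. A coarse partial mask is grown towards object boundaries by repeatedly admitting frontier pixels that the segmentation task marks as confidently foreground.

```python
import numpy as np

def grow_partial_mask(prob, mask, threshold=0.9, steps=3):
    """One round of data-driven mask evolution (illustrative sketch).

    `prob` is a hypothetical per-pixel foreground probability from the
    segmentation task; `mask` is a coarse partial mask (1 inside the
    partial region, 0 elsewhere).
    """
    mask = mask.astype(bool)
    for _ in range(steps):
        # 4-neighbourhood dilation implemented with array shifts
        neighbours = np.zeros_like(mask)
        neighbours[1:, :] |= mask[:-1, :]
        neighbours[:-1, :] |= mask[1:, :]
        neighbours[:, 1:] |= mask[:, :-1]
        neighbours[:, :-1] |= mask[:, 1:]
        frontier = neighbours & ~mask
        # admit frontier pixels the segmentation task is confident about
        mask |= frontier & (prob >= threshold)
    return mask.astype(np.uint8)
```

Because growth only passes through pixels above the threshold, the evolved mask stays inside high-confidence regions rather than bleeding across object boundaries.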
Related papers
- Self-Supervised Correction Learning for Semi-Supervised Biomedical Image
Segmentation [84.58210297703714]
We propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation.
We design a dual-task network, including a shared encoder and two independent decoders for segmentation and lesion region inpainting.
Experiments on three medical image segmentation datasets for different tasks demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2023-01-12T08:19:46Z) - Learning to Annotate Part Segmentation with Gradient Matching [58.100715754135685]
This paper focuses on tackling semi-supervised part segmentation tasks by generating high-quality images with a pre-trained GAN.
In particular, we formulate the annotator learning as a learning-to-learn problem.
We show that our method can learn annotators from a broad range of labelled images including real images, generated images, and even analytically rendered images.
arXiv Detail & Related papers (2022-11-06T01:29:22Z) - Learning to segment from object sizes [0.0]
We propose an algorithm for training a deep segmentation network from a dataset of a few pixel-wise annotated images and many images with known object sizes.
The algorithm minimizes a discrete (non-differentiable) loss function defined over the object sizes by sampling the gradient and then using the standard back-propagation algorithm.
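One standard way to "sample the gradient" of such a discrete loss is a score-function (REINFORCE-style) estimator; the sketch below is an assumption about the general technique, not a reproduction of that paper's scheme. Pixel logits define Bernoulli probabilities, masks are sampled, and each sample's size loss weights the log-likelihood gradient.

```python
import numpy as np

def sampled_size_gradient(logits, target_size, n_samples=256, rng=None):
    """Score-function estimate of d|size - target|/d logits (sketch)."""
    rng = rng or np.random.default_rng(0)
    p = 1.0 / (1.0 + np.exp(-logits))                 # Bernoulli probabilities
    masks = (rng.random((n_samples,) + p.shape) < p).astype(float)
    losses = np.abs(masks.sum(axis=(1, 2)) - target_size)
    centred = losses - losses.mean()                  # baseline for variance reduction
    # d log P(m) / d logits = m - p for a sigmoid-parametrised Bernoulli
    grad = np.einsum('s,sij->ij', centred, masks - p) / n_samples
    return grad
```

The resulting estimate can then be fed into standard back-propagation as if it were the gradient of a differentiable loss.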
arXiv Detail & Related papers (2022-07-01T09:34:44Z) - Discovering Object Masks with Transformers for Unsupervised Semantic
Segmentation [75.00151934315967]
MaskDistill is a novel framework for unsupervised semantic segmentation.
Our framework does not latch onto low-level image cues and is not limited to object-centric datasets.
arXiv Detail & Related papers (2022-06-13T17:59:43Z) - What You See is What You Classify: Black Box Attributions [61.998683569022006]
We train a deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum.
Unlike most existing approaches, ours is capable of directly generating very distinct class-specific masks.
We show that our attributions are superior to established methods both visually and quantitatively.
arXiv Detail & Related papers (2022-05-23T12:30:04Z) - Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised
Semantic Segmentation and Localization [98.46318529630109]
We take inspiration from traditional spectral segmentation methods by reframing image decomposition as a graph partitioning problem.
We find that these eigenvectors already decompose an image into meaningful segments, and can be readily used to localize objects in a scene.
By clustering the features associated with these segments across a dataset, we can obtain well-delineated, nameable regions.
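The graph-partitioning view can be illustrated on raw intensities with a classical spectral bipartition; the sigma value, 4-neighbour affinities, and zero threshold below are toy assumptions (the paper operates on learned features, not pixel intensities).

```python
import numpy as np

def fiedler_segmentation(img):
    """Spectral bipartition of a small grayscale image (illustrative sketch).

    Builds a 4-neighbour affinity graph with Gaussian intensity weights,
    then thresholds the second-smallest eigenvector (Fiedler vector) of
    the normalised graph Laplacian at zero.
    """
    h, w = img.shape
    n = h * w
    W = np.zeros((n, n))
    idx = lambda r, c: r * w + c
    sigma = 0.1
    for r in range(h):
        for c in range(w):
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    wgt = np.exp(-((img[r, c] - img[rr, cc]) ** 2) / sigma)
                    W[idx(r, c), idx(rr, cc)] = wgt
                    W[idx(rr, cc), idx(r, c)] = wgt
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-12))
    L = np.eye(n) - D_inv_sqrt @ W @ D_inv_sqrt      # normalised Laplacian
    vals, vecs = np.linalg.eigh(L)                   # ascending eigenvalues
    fiedler = vecs[:, 1]                             # second-smallest eigenvector
    return (fiedler > 0).astype(int).reshape(h, w)
```

On an image with two homogeneous regions, the Fiedler vector takes opposite signs on the two regions, so the zero threshold recovers the partition.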
arXiv Detail & Related papers (2022-05-16T17:47:44Z) - Semantic Segmentation In-the-Wild Without Seeing Any Segmentation
Examples [34.97652735163338]
We propose a novel approach for creating semantic segmentation masks for every object.
Our method takes as input the image-level labels of the class categories present in the image.
The output of this stage provides pixel-level pseudo-labels, instead of the manual pixel-level labels required by supervised methods.
arXiv Detail & Related papers (2021-12-06T17:32:38Z) - Unsupervised Image Segmentation by Mutual Information Maximization and
Adversarial Regularization [7.165364364478119]
We propose a novel, fully unsupervised semantic segmentation method called Information Maximization and Adversarial Regularization (InMARS).
Inspired by human perception, which parses a scene into perceptual groups, our approach first partitions an input image into meaningful regions (also known as superpixels).
Next, it utilizes mutual-information maximization followed by an adversarial training strategy to cluster these regions into semantically meaningful classes.
Our experiments demonstrate that our method achieves state-of-the-art performance on two commonly used unsupervised semantic segmentation datasets.
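A common form of the mutual-information objective used for such clustering (e.g. in IIC-style methods) can be written down directly; this sketch is an assumption about the general technique and does not reproduce InMARS's exact formulation. Given paired soft cluster assignments for two views of the same regions, it estimates I(z1; z2) from their empirical joint distribution.

```python
import numpy as np

def mutual_information_objective(p1, p2):
    """Mutual information between paired soft cluster assignments (sketch).

    p1, p2: (n, k) row-stochastic assignment matrices for two views of
    the same n regions over k clusters.
    """
    joint = p1.T @ p2 / p1.shape[0]          # (k, k) empirical joint
    joint = (joint + joint.T) / 2.0          # symmetrise
    pi = joint.sum(axis=1, keepdims=True)    # marginal over first view
    pj = joint.sum(axis=0, keepdims=True)    # marginal over second view
    eps = 1e-12                              # numerical floor for log
    return float(np.sum(joint * (np.log(joint + eps)
                                 - np.log(pi + eps)
                                 - np.log(pj + eps))))
```

Maximizing this quantity pushes the two views towards consistent, balanced cluster assignments: it is maximal (log k) when assignments agree deterministically and uniform over clusters, and zero when they are independent.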
arXiv Detail & Related papers (2021-07-01T18:36:27Z) - Three Ways to Improve Semantic Segmentation with Self-Supervised Depth
Estimation [90.87105131054419]
We present a framework for semi-supervised semantic segmentation, which is enhanced by self-supervised monocular depth estimation from unlabeled image sequences.
We validate the proposed model on the Cityscapes dataset, where all three modules demonstrate significant performance gains.
arXiv Detail & Related papers (2020-12-19T21:18:03Z) - AinnoSeg: Panoramic Segmentation with High Performance [4.867465475957119]
Current panoramic segmentation algorithms focus on context semantics, but image details are not processed sufficiently.
Aiming to address these issues, this paper presents several useful tricks.
These operations, collectively named AinnoSeg, achieve state-of-the-art performance on the well-known ADE20K dataset.
arXiv Detail & Related papers (2020-07-21T04:16:46Z) - Revisiting Sequence-to-Sequence Video Object Segmentation with
Multi-Task Loss and Skip-Memory [4.343892430915579]
Video Object Segmentation (VOS) is an active research area in computer vision.
Current approaches lose objects in longer sequences, especially when the object is small or briefly occluded.
We build upon a sequence-to-sequence approach that employs an encoder-decoder architecture together with a memory module for exploiting the sequential data.
arXiv Detail & Related papers (2020-04-25T15:38:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.