One model to use them all: Training a segmentation model with complementary datasets
- URL: http://arxiv.org/abs/2402.19340v2
- Date: Fri, 5 Apr 2024 12:49:38 GMT
- Title: One model to use them all: Training a segmentation model with complementary datasets
- Authors: Alexander C. Jenke, Sebastian Bodenstedt, Fiona R. Kolbinger, Marius Distler, Jürgen Weitz, Stefanie Speidel
- Abstract summary: We propose a method to combine partially annotated datasets, which provide complementary annotations, into one model.
Our approach successfully combines 6 classes into one model, increasing the overall Dice Score by 4.4%.
By including information on multiple classes, we were able to reduce confusion between stomach and colon by 24%.
- Score: 38.73145509617609
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding a surgical scene is crucial for computer-assisted surgery systems to provide any intelligent assistance functionality. One way of achieving this scene understanding is via scene segmentation, where every pixel of a frame is classified and therefore identifies the visible structures and tissues. Progress on fully segmenting surgical scenes has been made using machine learning. However, such models require large amounts of annotated training data, containing examples of all relevant object classes. Such fully annotated datasets are hard to create, as every pixel in a frame needs to be annotated by medical experts, and they are therefore rarely available. In this work, we propose a method to combine multiple partially annotated datasets, which provide complementary annotations, into one model, enabling better scene segmentation and the use of multiple readily available datasets. Our method aims to combine available data with complementary labels by leveraging mutually exclusive properties to maximize information. Specifically, we propose to use positive annotations of other classes as negative samples and to exclude background pixels of binary annotations, as we cannot tell whether they contain a class that is not annotated but would be predicted by the model. We evaluate our method by training a DeepLabV3 model on the publicly available Dresden Surgical Anatomy Dataset, which provides multiple subsets of binary segmented anatomical structures. Our approach successfully combines 6 classes into one model, increasing the overall Dice Score by 4.4% compared to an ensemble of models trained on the classes individually. By including information on multiple classes, we were able to reduce confusion between stomach and colon by 24%. Our results demonstrate the feasibility of training a model on multiple datasets. This paves the way for future work further alleviating the need for one large, fully segmented dataset.
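The masking rules described in the abstract (positives of other classes reused as negatives, unannotated background excluded from the loss) amount to a per-pixel loss mask over complementary binary annotations. The following is a minimal PyTorch sketch of that idea, not the authors' released code: the function name, tensor layout, and the choice of a binary cross-entropy term are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def partial_annotation_loss(logits: torch.Tensor,
                            ann_masks: torch.Tensor,
                            ann_present: torch.Tensor) -> torch.Tensor:
    """Masked BCE over complementary, partially annotated segmentation data.

    logits:      (B, C, H, W) per-class logits from the segmentation model
    ann_masks:   (B, C, H, W) binary masks; only meaningful where the class
                 was actually annotated for that image
    ann_present: (B, C) bool, True if class c is annotated in image b
    """
    present = ann_present[:, :, None, None].float()            # (B, C, 1, 1)
    positives = ann_masks.float() * present                    # valid positive pixels
    # A pixel that is positive for any annotated class counts as a negative
    # for every other class (mutual exclusivity of anatomical structures).
    pos_any = positives.sum(dim=1, keepdim=True).clamp(max=1)  # (B, 1, H, W)
    negatives = (pos_any - positives).clamp(min=0)
    # Supervise only positives and derived negatives; remaining background
    # pixels of a binary annotation are excluded, since they may contain
    # classes that were simply not annotated.
    valid = (positives + negatives).clamp(max=1)
    bce = F.binary_cross_entropy_with_logits(logits, positives, reduction="none")
    return (bce * valid).sum() / valid.sum().clamp(min=1.0)
```

In a DeepLabV3-style setup, `logits` would be the model's per-class output maps; the same validity mask could equally weight a soft Dice term instead of the cross-entropy used here.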
Related papers
- Segment Together: A Versatile Paradigm for Semi-Supervised Medical Image Segmentation [17.69933345468061]
Annotation scarcity has become a major obstacle for training powerful deep-learning models for medical image segmentation.
We introduce a Versatile Semi-supervised framework to exploit more unlabeled data for semi-supervised medical image segmentation.
arXiv Detail & Related papers (2023-11-20T11:35:52Z) - DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models [61.906934570771256]
We present a generic dataset generation model that can produce diverse synthetic images and perception annotations.
Our method builds upon the pre-trained diffusion model and extends text-guided image synthesis to perception data generation.
We show that the rich latent code of the diffusion model can be effectively decoded as accurate perception annotations using a decoder module.
arXiv Detail & Related papers (2023-08-11T14:38:11Z) - Semantic-SAM: Segment and Recognize Anything at Any Granularity [83.64686655044765]
We introduce Semantic-SAM, a universal image segmentation model that can segment and recognize anything at any desired granularity.
We consolidate multiple datasets across three granularities and introduce decoupled classification for objects and parts.
For the multi-granularity capability, we propose a multi-choice learning scheme during training, enabling each click to generate masks at multiple levels.
arXiv Detail & Related papers (2023-07-10T17:59:40Z) - Diffusion Models for Open-Vocabulary Segmentation [79.02153797465324]
OVDiff is a novel method that leverages generative text-to-image diffusion models for unsupervised open-vocabulary segmentation.
It relies solely on pre-trained components and outputs the synthesised segmenter directly, without training.
arXiv Detail & Related papers (2023-06-15T17:51:28Z) - SegViz: A Federated Learning Framework for Medical Image Segmentation from Distributed Datasets with Different and Incomplete Annotations [3.6704226968275258]
We developed SegViz, a learning framework for aggregating knowledge from distributed medical image segmentation datasets.
SegViz was trained to build a model capable of segmenting both the liver and the spleen by aggregating knowledge from both nodes.
Our results demonstrate SegViz as an essential first step towards training clinically translatable multi-task segmentation models.
arXiv Detail & Related papers (2023-01-17T18:36:57Z) - Universal Segmentation of 33 Anatomies [19.194539991903593]
We present an approach for learning a single model that universally segments 33 anatomical structures.
We learn such a model from a union of multiple datasets, with each dataset containing partially labeled images.
We evaluate our model on multiple open-source datasets, demonstrating that it generalizes well.
arXiv Detail & Related papers (2022-03-04T02:29:54Z) - MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains.
We reconcile the taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images.
A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z) - Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z) - Training CNN Classifiers for Semantic Segmentation using Partially Annotated Images: with Application on Human Thigh and Calf MRI [0.0]
We propose a set of strategies to train a single classifier to segment all label classes that are heterogeneously annotated across multiple datasets.
We show that presence masking is capable of significantly improving both training and inference efficiency across imaging modalities and anatomical regions.
arXiv Detail & Related papers (2020-08-16T23:38:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.