Label fusion and training methods for reliable representation of
inter-rater uncertainty
- URL: http://arxiv.org/abs/2202.07550v1
- Date: Tue, 15 Feb 2022 16:35:47 GMT
- Title: Label fusion and training methods for reliable representation of
inter-rater uncertainty
- Authors: Andreanne Lemay, Charley Gros, Julien Cohen-Adad
- Abstract summary: Training deep learning networks with annotations from multiple raters mitigates the model's bias towards a single expert.
Various methods exist to take into account different expert labels.
We compare three label fusion methods: STAPLE, averaging the raters' segmentations, and randomly sampling one rater's segmentation at each training iteration.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Medical tasks are prone to inter-rater variability due to multiple factors
such as image quality, professional experience and training, or guideline
clarity. Training deep learning networks with annotations from multiple raters
is a common practice that mitigates the model's bias towards a single expert.
Reliable models generating calibrated outputs and reflecting the inter-rater
disagreement are key to the integration of artificial intelligence in clinical
practice. Various methods exist to take into account different expert labels.
We focus on comparing three label fusion methods: STAPLE, averaging the
raters' segmentations, and randomly sampling one rater's segmentation at each
training iteration. Each label fusion method is studied with either the
conventional training framework or the recently published SoftSeg framework,
which limits information loss by treating segmentation as a regression task.
Our results, across 10 data splits on two public datasets, indicate that
SoftSeg models, regardless of the ground-truth fusion method, had better
calibration and better preservation of the inter-rater variability than their
conventional counterparts, without compromising segmentation performance.
Conventional models, i.e., those trained with a Dice loss, binary inputs, and
a sigmoid/softmax final activation, were overconfident and underestimated the
uncertainty associated with inter-rater variability. Conversely, fusing labels
by averaging with the SoftSeg framework led to underconfident outputs and
overestimation of the rater disagreement. In terms of segmentation performance,
the best label fusion method differed between the two datasets studied,
indicating that this choice may be task-dependent. However, SoftSeg yielded
segmentation performance systematically superior or equal to that of the
conventionally trained models, together with the best calibration and
preservation of the inter-rater variability.
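For concreteness, here is a minimal sketch of the three fusion strategies, under assumptions the paper does not spell out: averaging and per-iteration random sampling are written in NumPy, and STAPLE is delegated to SimpleITK's STAPLEImageFilter (one common implementation; function names here are illustrative, not the authors' code).

```python
# Hypothetical sketch of the three label-fusion strategies compared above.
# `masks` holds one binary segmentation (np.ndarray) per rater for one image.
import random

import numpy as np
import SimpleITK as sitk


def average_fusion(masks):
    """Soft consensus: per-voxel mean of the raters' binary masks."""
    return np.mean(np.stack(masks, axis=0), axis=0)


def random_rater_fusion(masks):
    """Pick one rater's mask at random; re-drawn at every training iteration."""
    return random.choice(masks)


def staple_fusion(masks):
    """STAPLE consensus probability map (Warfield et al., 2004), via SimpleITK;
    one common implementation, not necessarily the one used in the paper."""
    images = [sitk.GetImageFromArray(m.astype(np.uint8)) for m in masks]
    stapler = sitk.STAPLEImageFilter()
    stapler.SetForegroundValue(1.0)
    return sitk.GetArrayFromImage(stapler.Execute(images))
```

Note that averaging and STAPLE return soft probability maps: the conventional pipeline would binarize them (e.g., threshold at 0.5), whereas SoftSeg keeps them soft as regression targets.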
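Likewise, the calibration findings hinge on the contrast between the two training framings. The sketch below is an assumption-laden illustration, not the authors' implementation: the conventional setup pairs binarized targets and a sigmoid with a Dice loss, while the SoftSeg-style setup keeps soft targets and regresses with a normalized-ReLU output (plain MSE stands in for SoftSeg's Adaptive Wing loss). A simple binned expected-calibration-error helper shows how over/underconfidence can be quantified.

```python
# Hedged sketch: conventional vs. SoftSeg-style training objectives, plus ECE.
import torch
import torch.nn.functional as F


def dice_loss(logits, target_bin, eps=1e-6):
    """Conventional framing: sigmoid activation + soft Dice on binary targets."""
    probs = torch.sigmoid(logits)
    inter = (probs * target_bin).sum()
    return 1.0 - (2.0 * inter + eps) / (probs.sum() + target_bin.sum() + eps)


def norm_relu(logits, eps=1e-6):
    """Normalized ReLU (SoftSeg's final activation): ReLU, scaled by the
    per-sample spatial maximum (2-D case shown)."""
    x = F.relu(logits)
    return x / (x.amax(dim=(-2, -1), keepdim=True) + eps)


def softseg_loss(logits, target_soft):
    """Regression framing on soft (fused) targets. MSE is a stand-in here;
    the SoftSeg paper itself uses an Adaptive Wing loss."""
    return F.mse_loss(norm_relu(logits), target_soft)


def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE over foreground probabilities: mean |accuracy - confidence|
    weighted by bin occupancy. Overconfidence shows as confidence > accuracy."""
    probs, labels = probs.flatten(), labels.flatten().float()
    conf = torch.where(probs >= 0.5, probs, 1.0 - probs)
    acc = ((probs >= 0.5).float() == labels).float()
    ece = torch.zeros(())
    edges = torch.linspace(0.5, 1.0, n_bins + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            ece += in_bin.float().mean() * (acc[in_bin].mean() - conf[in_bin].mean()).abs()
    return ece
```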
Related papers
- Dynamic Correlation Learning and Regularization for Multi-Label Confidence Calibration [60.95748658638956]
This paper introduces the Multi-Label Confidence task, aiming to provide well-calibrated confidence scores in multi-label scenarios.
Existing single-label calibration methods fail to account for category correlations, which are crucial for addressing semantic confusion.
We propose the Dynamic Correlation Learning and Regularization algorithm, which leverages multi-grained semantic correlations to better model semantic confusion.
arXiv Detail & Related papers (2024-07-09T13:26:21Z)
- CrossMatch: Enhance Semi-Supervised Medical Image Segmentation with Perturbation Strategies and Knowledge Distillation [7.6057981800052845]
CrossMatch is a novel framework that integrates knowledge distillation with dual strategies (image-level and feature-level) to improve the model's learning from both labeled and unlabeled data.
Our method significantly surpasses other state-of-the-art techniques in standard benchmarks by effectively minimizing the gap between training on labeled and unlabeled data.
arXiv Detail & Related papers (2024-05-01T07:16:03Z)
- Binary Classification with Confidence Difference [100.08818204756093]
This paper delves into a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification.
We propose a risk-consistent approach to tackle this problem and show that the estimation error bound achieves the optimal convergence rate.
We also introduce a risk correction approach to mitigate overfitting problems, whose consistency and convergence rate are also proven.
arXiv Detail & Related papers (2023-10-09T11:44:50Z)
- Cross-head mutual Mean-Teaching for semi-supervised medical image segmentation [6.738522094694818]
Semi-supervised medical image segmentation (SSMIS) has witnessed substantial advancements by leveraging limited labeled data and abundant unlabeled data.
Existing state-of-the-art (SOTA) methods encounter challenges in accurately predicting labels for the unlabeled data.
We propose a novel Cross-head mutual mean-teaching Network (CMMT-Net) incorporating strong-weak data augmentation.
arXiv Detail & Related papers (2023-10-08T09:13:04Z)
- Towards Better Certified Segmentation via Diffusion Models [62.21617614504225]
Segmentation models can be vulnerable to adversarial perturbations, which hinders their use in critical-decision systems like healthcare or autonomous driving.
Recently, randomized smoothing has been proposed to certify segmentation predictions by adding Gaussian noise to the input to obtain theoretical guarantees.
In this paper, we address the problem of certifying segmentation prediction using a combination of randomized smoothing and diffusion models.
arXiv Detail & Related papers (2023-06-16T16:30:39Z)
- Self-training with dual uncertainty for semi-supervised medical image segmentation [9.538419502275975]
Traditional self-training methods can partially solve the problem of insufficient labeled data by generating pseudo labels for iterative training.
Building on the self-training framework, we add sample-level and pixel-level uncertainty to stabilize the training process.
Our proposed method achieves better segmentation performance on both datasets under the same settings.
arXiv Detail & Related papers (2023-04-10T07:57:24Z)
- SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
arXiv Detail & Related papers (2023-01-26T03:53:25Z)
- Adversarial Dual-Student with Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation [70.2166826794421]
We propose a differentiable geometric warping to conduct unsupervised data augmentation.
We also propose a novel adversarial dual-student framework to improve the Mean-Teacher.
Our solution significantly improves performance, achieving state-of-the-art results on both datasets.
arXiv Detail & Related papers (2022-03-05T17:36:17Z)
- Uncertainty-Guided Mutual Consistency Learning for Semi-Supervised Medical Image Segmentation [9.745971699005857]
We propose a novel uncertainty-guided mutual consistency learning framework for medical image segmentation.
It integrates intra-task consistency learning from up-to-date predictions for self-ensembling and cross-task consistency learning from task-level regularization to exploit geometric shape information.
Our method achieves performance gains by leveraging unlabeled data and outperforms existing semi-supervised segmentation methods.
arXiv Detail & Related papers (2021-12-05T08:19:41Z)
- DMT: Dynamic Mutual Training for Semi-Supervised Learning [69.17919491907296]
Self-training methods usually rely on single model prediction confidence to filter low-confidence pseudo labels.
We propose mutual training between two different models by a dynamically re-weighted loss function, called Dynamic Mutual Training.
Our experiments show that DMT achieves state-of-the-art performance in both image classification and semantic segmentation.
arXiv Detail & Related papers (2020-04-18T03:12:55Z)