Automated Detection of Label Errors in Semantic Segmentation Datasets via Deep Learning and Uncertainty Quantification
- URL: http://arxiv.org/abs/2207.06104v2
- Date: Fri, 23 Aug 2024 19:47:25 GMT
- Title: Automated Detection of Label Errors in Semantic Segmentation Datasets via Deep Learning and Uncertainty Quantification
- Authors: Matthias Rottmann, Marco Reese
- Abstract summary: We present the first method for detecting label errors in semantic segmentation datasets with pixel-wise labels.
Our approach is able to detect the vast majority of label errors while controlling the number of false label error detections.
- Score: 5.279257531335345
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we present, for the first time, a method for detecting label errors in image datasets with semantic segmentation, i.e., pixel-wise class labels. Annotation acquisition for semantic segmentation datasets is time-consuming and requires plenty of human labor. In particular, review processes are time-consuming and label errors can easily be overlooked by humans. The consequences are biased benchmarks and in extreme cases also performance degradation of deep neural networks (DNNs) trained on such datasets. DNNs for semantic segmentation yield pixel-wise predictions, which makes detection of label errors via uncertainty quantification a complex task. Uncertainty is particularly pronounced at the transitions between connected components of the prediction. By lifting the consideration of uncertainty to the level of predicted components, we enable the usage of DNNs together with component-level uncertainty quantification for the detection of label errors. We present a principled approach to benchmarking the task of label error detection by dropping labels from the Cityscapes dataset as well as from a dataset extracted from the CARLA driving simulator, where in the latter case we have the labels under control. Our experiments show that our approach is able to detect the vast majority of label errors while controlling the number of false label error detections. Furthermore, we apply our method to semantic segmentation datasets frequently used by the computer vision community and present a collection of label errors along with sample statistics.
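The component-level idea from the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: pixel-wise uncertainty (here, one minus the maximum softmax probability, an assumed stand-in for the paper's uncertainty metrics) is averaged over each connected component of the prediction, and confidently predicted components that contradict the given annotation are flagged as label-error candidates. The function name and threshold are hypothetical.

```python
import numpy as np
from scipy import ndimage

def flag_suspicious_components(probs, annotation, max_uncertainty=0.5):
    """Flag predicted connected components that are confident (low mean
    uncertainty) yet disagree with the dataset annotation.

    probs:       (H, W, C) softmax output of a segmentation DNN
    annotation:  (H, W) integer class labels from the dataset
    Returns a list of (class, component_id, size_in_pixels) candidates.
    """
    pred = probs.argmax(axis=-1)               # pixel-wise predicted class
    uncertainty = 1.0 - probs.max(axis=-1)     # pixel-wise uncertainty proxy
    candidates = []
    for cls in np.unique(pred):
        # connected components of the prediction for this class
        components, n = ndimage.label(pred == cls)
        for cid in range(1, n + 1):
            mask = components == cid
            mean_unc = uncertainty[mask].mean()   # component-level uncertainty
            disagreement = (annotation[mask] != cls).mean()
            # a confident component contradicting the label suggests an error
            if mean_unc < max_uncertainty and disagreement > 0.5:
                candidates.append((int(cls), cid, int(mask.sum())))
    return candidates
```

For example, on a toy 4x4 image where the network confidently predicts a 2x2 region of class 1 that the annotation marks as class 0, that region is returned as a candidate.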
Related papers
- Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition [49.42732949233184]
When labeled data is insufficient, semi-supervised learning with the pseudo-labeling technique can significantly improve the performance of automatic speech recognition.
Taking noisy labels as ground-truth in the loss function results in suboptimal performance.
We propose a novel framework named alternative pseudo-labeling to tackle the issue of noisy pseudo-labels.
arXiv Detail & Related papers (2023-08-12T12:13:52Z) - Estimating label quality and errors in semantic segmentation data via any model [19.84626033109009]
We study methods to score label quality, such that the images with the lowest scores are least likely to be correctly labeled.
This helps prioritize what data to review in order to ensure a high-quality training/evaluation dataset.
arXiv Detail & Related papers (2023-07-11T07:29:09Z) - All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation [67.30502812804271]
Pseudo-labels are widely employed in weakly supervised 3D segmentation tasks where only sparse ground-truth labels are available for learning.
We propose a novel learning strategy to regularize the generated pseudo-labels and effectively narrow the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2023-05-25T08:19:31Z) - Identifying Label Errors in Object Detection Datasets by Loss Inspection [4.442111891959355]
We introduce a benchmark for label error detection methods on object detection datasets.
We simulate four different types of randomly introduced label errors on train and test sets of well-labeled object detection datasets.
arXiv Detail & Related papers (2023-03-13T10:54:52Z) - Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper.
Motivated by this perspective, we pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z) - CTRL: Clustering Training Losses for Label Error Detection [4.49681473359251]
In supervised machine learning, use of correct labels is extremely important to ensure high accuracy.
We propose a novel framework, called Clustering TRaining Losses (CTRL), for label error detection.
It detects label errors in two steps based on the observation that models learn clean and noisy labels in different ways.
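The two-step idea above can be sketched as follows, assuming per-sample loss curves have been recorded during training. Scikit-learn's k-means is used here as an illustrative clustering choice, and the function name is hypothetical; this is not taken from the paper's code.

```python
import numpy as np
from sklearn.cluster import KMeans

def suspect_label_errors(loss_curves):
    """Step 1: cluster per-sample training-loss trajectories into two groups.
    Step 2: flag the higher-loss cluster, since noisily labeled samples tend
    to be learned later (or not at all) and so keep larger losses.

    loss_curves: (n_samples, n_epochs) losses recorded during training
    Returns indices of samples suspected to carry label errors.
    """
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(loss_curves)
    # the cluster with the higher mean loss is treated as the noisy one
    cluster_mean_loss = [loss_curves[km.labels_ == c].mean() for c in (0, 1)]
    noisy = int(np.argmax(cluster_mean_loss))
    return np.flatnonzero(km.labels_ == noisy)
```

On synthetic data where most loss curves decay to zero while a few stay high, the high-loss samples are the ones returned.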
arXiv Detail & Related papers (2022-08-17T18:09:19Z) - Semi-supervised Semantic Segmentation with Error Localization Network [16.42221567235617]
This paper studies semi-supervised learning of semantic segmentation.
It assumes that only a small portion of training images are labeled and the others remain unlabeled.
The unlabeled images are usually assigned pseudo labels to be used in training.
We present a novel method that resolves this chronic issue of pseudo labeling.
arXiv Detail & Related papers (2022-04-05T09:42:21Z) - GuidedMix-Net: Semi-supervised Semantic Segmentation by Using Labeled Images as Reference [90.5402652758316]
We propose a novel method for semi-supervised semantic segmentation named GuidedMix-Net.
It uses labeled information to guide the learning of unlabeled instances.
It achieves competitive segmentation accuracy and significantly improves the mIoU by +7% compared to previous approaches.
arXiv Detail & Related papers (2021-12-28T06:48:03Z) - GuidedMix-Net: Learning to Improve Pseudo Masks Using Labeled Images as Reference [153.354332374204]
We propose a novel method for semi-supervised semantic segmentation named GuidedMix-Net.
We first introduce a feature alignment objective between labeled and unlabeled data to capture potentially similar image pairs.
MITrans is shown to be a powerful knowledge module for further progressive refining features of unlabeled data.
Along with supervised learning for labeled data, the prediction of unlabeled data is jointly learned with the generated pseudo masks.
arXiv Detail & Related papers (2021-06-29T02:48:45Z) - Boosting Semi-Supervised Face Recognition with Noise Robustness [54.342992887966616]
This paper presents an effective solution to semi-supervised face recognition that is robust to the label noise introduced by auto-labelling.
We develop a semi-supervised face recognition solution, named Noise Robust Learning-Labelling (NRoLL), which is based on the robust training ability empowered by GN.
arXiv Detail & Related papers (2021-05-10T14:43:11Z) - Uncertainty-based method for improving poorly labeled segmentation datasets [0.0]
It is known that deep convolutional neural networks (DCNNs) can memorize even completely random labels.
We propose a framework to train binary segmentation DCNNs using sets of unreliable pixel-level annotations.
arXiv Detail & Related papers (2021-02-16T08:37:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.