Identifying Label Errors in Object Detection Datasets by Loss Inspection
- URL: http://arxiv.org/abs/2303.06999v3
- Date: Tue, 19 Dec 2023 07:30:25 GMT
- Title: Identifying Label Errors in Object Detection Datasets by Loss Inspection
- Authors: Marius Schubert, Tobias Riedlinger, Karsten Kahl, Daniel Kröll, Sebastian Schoenen, Siniša Šegvić, Matthias Rottmann
- Abstract summary: We introduce a benchmark for label error detection methods on object detection datasets.
We simulate four different types of randomly introduced label errors on train and test sets of well-labeled object detection datasets.
- Score: 4.442111891959355
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Labeling datasets for supervised object detection is a dull and
time-consuming task. Errors can be easily introduced during annotation and
overlooked during review, yielding inaccurate benchmarks and performance
degradation of deep neural networks trained on noisy labels. In this work, we
for the first time introduce a benchmark for label error detection methods on
object detection datasets as well as a label error detection method and a
number of baselines. We simulate four different types of randomly introduced
label errors on train and test sets of well-labeled object detection datasets.
For our label error detection method we assume a two-stage object detector to
be given and consider the sum of both stages' classification and regression
losses. The losses are computed with respect to the predictions and the noisy
labels including simulated label errors, aiming at detecting the latter. We
compare our method to three baselines: a naive one without deep learning, the
object detector's score and the entropy of the classification softmax
distribution. We outperform all baselines and demonstrate that among the
considered methods, ours is the only one that detects label errors of all four
types efficiently. Furthermore, we detect real label errors a) on commonly used
test datasets in object detection and b) on a proprietary dataset. In both
cases we achieve low false positive rates, i.e., we detect label errors with a
precision of up to 71.5% for a) and of 97% for b).
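The loss-based scoring rule described in the abstract can be made concrete with a short sketch. The snippet below is only an illustration under stated assumptions, not the authors' implementation: predictions are assumed to be already matched to the (noisy) annotations, a single set of PyTorch-style classification and regression losses stands in for the sum over both detector stages, and the function names `label_error_scores` and `entropy_baseline` are hypothetical.

```python
import torch
import torch.nn.functional as F

def label_error_scores(cls_logits, box_deltas, noisy_classes, noisy_box_targets):
    """Per-annotation suspicion score: higher means more likely a label error.

    cls_logits:        (N, C) classification logits for predictions matched to annotations
    box_deltas:        (N, 4) predicted box regression outputs
    noisy_classes:     (N,)   class indices taken from the possibly erroneous labels
    noisy_box_targets: (N, 4) regression targets derived from the noisy boxes
    """
    cls_loss = F.cross_entropy(cls_logits, noisy_classes, reduction="none")   # (N,)
    reg_loss = F.smooth_l1_loss(box_deltas, noisy_box_targets,
                                reduction="none").sum(dim=1)                  # (N,)
    # The paper sums the losses of both detector stages; a single stage
    # stands in for that sum in this sketch.
    return cls_loss + reg_loss

def entropy_baseline(cls_logits):
    # One of the baselines compared against in the paper: entropy of the
    # classification softmax distribution (higher entropy = more suspicious).
    probs = F.softmax(cls_logits, dim=1)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)

if __name__ == "__main__":
    # Toy usage with random tensors: rank annotations for manual review.
    torch.manual_seed(0)
    N, C = 8, 5
    scores = label_error_scores(torch.randn(N, C), torch.randn(N, 4),
                                torch.randint(0, C, (N,)), torch.randn(N, 4))
    print("review order (most suspicious first):",
          scores.argsort(descending=True).tolist())
```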
Related papers
- Improving Label Error Detection and Elimination with Uncertainty Quantification [5.184615738004059]
We develop novel, model-agnostic algorithms for Uncertainty Quantification-Based Label Error Detection (UQ-LED)
Our UQ-LED algorithms outperform state-of-the-art confident learning in identifying label errors.
We propose a novel approach to generate realistic, class-dependent label errors synthetically.
arXiv Detail & Related papers (2024-05-15T15:17:52Z) - Estimating label quality and errors in semantic segmentation data via any model [19.84626033109009]
We study methods to score label quality, such that the images with the lowest scores are least likely to be correctly labeled.
This helps prioritize what data to review in order to ensure a high-quality training/evaluation dataset.
arXiv Detail & Related papers (2023-07-11T07:29:09Z) - Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection [98.66771688028426]
We propose an Ambiguity-Resistant Semi-supervised Learning (ARSL) method for one-stage detectors.
Joint-Confidence Estimation (JCE) is proposed to quantify the classification and localization quality of pseudo labels.
ARSL effectively mitigates the ambiguities and achieves state-of-the-art SSOD performance on MS COCO and PASCAL VOC.
arXiv Detail & Related papers (2023-03-27T07:46:58Z) - Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper.
Motivated by this, we propose to pursue the label distribution consistency between predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z) - Detecting Label Errors in Token Classification Data [22.539748563923123]
We consider the task of finding sentences that contain label errors in token classification datasets.
We study 11 different straightforward methods that score tokens/sentences based on the predicted class probabilities.
We identify a simple and effective method that consistently detects those sentences containing label errors when applied with different token classification models.
arXiv Detail & Related papers (2022-10-08T05:14:22Z) - CTRL: Clustering Training Losses for Label Error Detection [4.49681473359251]
In supervised machine learning, use of correct labels is extremely important to ensure high accuracy.
We propose a novel framework, called Clustering TRaining Losses (CTRL), for label error detection.
It detects label errors in two steps based on the observation that models learn clean and noisy labels in different ways.
arXiv Detail & Related papers (2022-08-17T18:09:19Z) - Automated Detection of Label Errors in Semantic Segmentation Datasets via Deep Learning and Uncertainty Quantification [5.279257531335345]
We for the first time present a method for detecting label errors in semantic segmentation datasets with pixel-wise labels.
Our approach is able to detect the vast majority of label errors while controlling the number of false label error detections.
arXiv Detail & Related papers (2022-07-13T10:25:23Z) - Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z) - Rethinking Pseudo Labels for Semi-Supervised Object Detection [84.697097472401]
We introduce certainty-aware pseudo labels tailored for object detection.
We dynamically adjust the thresholds used to generate pseudo labels and reweight loss functions for each category to alleviate the class imbalance problem.
Our approach improves supervised baselines by up to 10% AP using only 1-10% labeled data from COCO.
arXiv Detail & Related papers (2021-06-01T01:32:03Z) - Few-shot Learning for Multi-label Intent Detection [59.66787898744991]
State-of-the-art work estimates label-instance relevance scores and uses a threshold to select multiple associated intent labels.
Experiments on two datasets show that the proposed model significantly outperforms strong baselines in both one-shot and five-shot settings.
arXiv Detail & Related papers (2020-10-11T14:42:18Z) - Object Detection as a Positive-Unlabeled Problem [78.2955013126312]
We propose treating object detection as a positive-unlabeled (PU) problem, which removes the assumption that unlabeled regions must be negative.
We demonstrate that our proposed PU classification loss outperforms the standard PN loss on PASCAL VOC and MS COCO across a range of label missingness, as well as on Visual Genome and DeepLesion with full labels.
arXiv Detail & Related papers (2020-02-11T20:49:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.