Benchmarking Object Detectors with COCO: A New Path Forward
- URL: http://arxiv.org/abs/2403.18819v1
- Date: Wed, 27 Mar 2024 17:59:53 GMT
- Title: Benchmarking Object Detectors with COCO: A New Path Forward
- Authors: Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai,
- Abstract summary: The Common Objects in Context (COCO) dataset has been instrumental in benchmarking object detectors over the past decade.
With the advent of high-performing models, we ask whether these errors in COCO are hindering its utility in reliably benchmarking further progress.
We uncover different types of errors such as imprecise mask boundaries, non-exhaustively annotated instances, and mislabeled masks.
- Score: 26.754253266204184
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Common Objects in Context (COCO) dataset has been instrumental in benchmarking object detectors over the past decade. Like every dataset, COCO contains subtle errors and imperfections stemming from its annotation procedure. With the advent of high-performing models, we ask whether these errors of COCO are hindering its utility in reliably benchmarking further progress. In search for an answer, we inspect thousands of masks from COCO (2017 version) and uncover different types of errors such as imprecise mask boundaries, non-exhaustively annotated instances, and mislabeled masks. Due to the prevalence of COCO, we choose to correct these errors to maintain continuity with prior research. We develop COCO-ReM (Refined Masks), a cleaner set of annotations with visibly better mask quality than COCO-2017. We evaluate fifty object detectors and find that models that predict visually sharper masks score higher on COCO-ReM, affirming that they were being incorrectly penalized due to errors in COCO-2017. Moreover, our models trained using COCO-ReM converge faster and score higher than their larger variants trained using COCO-2017, highlighting the importance of data quality in improving object detectors. With these findings, we advocate using COCO-ReM for future object detection research. Our dataset is available at https://cocorem.xyz
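A minimal sketch of how a mask-AP comparison of this kind could be run with the standard pycocotools evaluator; the annotation and result file names below are placeholders (COCO-ReM's actual files are distributed at https://cocorem.xyz), and this is not the authors' exact pipeline:

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

def mask_ap(ann_file: str, result_file: str) -> float:
    """COCO-style mask AP for detector predictions stored in `result_file`."""
    gt = COCO(ann_file)                    # ground-truth instance masks
    dt = gt.loadRes(result_file)           # detector predictions (COCO result JSON)
    ev = COCOeval(gt, dt, iouType="segm")  # evaluate segmentation masks
    ev.evaluate()
    ev.accumulate()
    ev.summarize()
    return ev.stats[0]                     # AP averaged over IoU 0.50:0.95

# Placeholder file names: the same predictions scored against both annotation sets.
ap_2017 = mask_ap("instances_val2017.json", "detector_results.json")
ap_rem  = mask_ap("cocorem_val.json", "detector_results.json")
print(f"mask AP on COCO-2017: {ap_2017:.3f} | on COCO-ReM: {ap_rem:.3f}")
```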
Related papers
- From COCO to COCO-FP: A Deep Dive into Background False Positives for COCO Detectors [8.3561487803637]
False alarms in fire and smoke detection are critical in real-world applications.
COCO-FP is a new evaluation dataset derived from the ImageNet-1K dataset.
Our evaluation of both standard and advanced object detectors shows a significant number of false positives in both closed-set and open-set scenarios.
arXiv Detail & Related papers (2024-09-12T10:22:12Z) - Catastrophic Overfitting: A Potential Blessing in Disguise [51.996943482875366]
Fast Adversarial Training (FAT) has gained increasing attention within the research community owing to its efficacy in improving adversarial robustness.
Although existing FAT approaches have made strides in mitigating CO, their gains in adversarial robustness come with a non-negligible decline in classification accuracy on clean samples.
We employ the feature activation differences between clean and adversarial examples to analyze the underlying causes of CO.
We harness CO to achieve 'attack obfuscation', aiming to bolster model performance.
arXiv Detail & Related papers (2024-02-28T10:01:44Z) - Mixed Pseudo Labels for Semi-Supervised Object Detection [27.735659283870646]
This paper proposes Mixed Pseudo Labels (MixPL), consisting of Mixup and Mosaic for pseudo-labeled data, to mitigate the negative impact of missed detections.
MixPL consistently improves the performance of various detectors and obtains new state-of-the-art results with Faster R-CNN, FCOS, and DINO on COCO-Standard and COCO-Full benchmarks.
arXiv Detail & Related papers (2023-12-12T06:35:27Z) - COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts [27.406639379618003]
COCO-O is a test dataset based on COCO with 6 types of natural distribution shifts.
COCO-O has a large distribution gap with training data and results in a significant 55.7% relative performance drop on a Faster R-CNN detector.
We study the robustness effect of recent breakthroughs in detector architecture design, augmentation, and pre-training techniques.
arXiv Detail & Related papers (2023-07-24T12:22:19Z) - Cut and Learn for Unsupervised Object Detection and Instance Segmentation [65.43627672225624]
Cut-and-LEaRn (CutLER) is a simple approach for training unsupervised object detection and segmentation models.
CutLER is a zero-shot unsupervised detector that improves AP50 detection performance by over 2.7 times on 11 benchmarks.
arXiv Detail & Related papers (2023-01-26T18:57:13Z) - ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO [47.61229316655264]
We construct the Extended COCO Validation (ECCV) Caption dataset by supplying the missing associations with machine and human annotators.
Our dataset provides 3.6x more positive image-to-caption associations and 8.5x more caption-to-image associations than the original MS-COCO.
We find that existing benchmarks such as COCO 1K R@K, COCO 5K R@K, and CxC R@1 are highly correlated with each other, while model rankings change when we shift to the ECCV mAP@R.
arXiv Detail & Related papers (2022-04-07T10:57:12Z) - Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We study prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift (a rough sketch of the idea appears after this list).
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
arXiv Detail & Related papers (2020-06-19T05:08:43Z) - A Simple Semi-Supervised Learning Framework for Object Detection [55.95789931533665]
Semi-supervised learning (SSL) has the potential to improve the predictive performance of machine learning models using unlabeled data.
We propose STAC, a simple yet effective SSL framework for visual object detection along with a data augmentation strategy.
arXiv Detail & Related papers (2020-05-10T19:15:51Z) - Detection in Crowded Scenes: One Proposal, Multiple Predictions [79.28850977968833]
We propose a proposal-based object detector, aiming at detecting highly-overlapped instances in crowded scenes.
The key to our approach is to let each proposal predict a set of correlated instances rather than a single one, as in previous proposal-based frameworks.
Our detector obtains 4.9% AP gains on the challenging CrowdHuman dataset and 1.0% $\text{MR}^{-2}$ improvement on the CityPersons dataset.
arXiv Detail & Related papers (2020-03-20T09:48:53Z)
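Returning to the prediction-time batch normalization entry above, here is a rough PyTorch sketch of the underlying idea, assuming a standard model with BatchNorm layers; the function name and details are illustrative, not the paper's code:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def predict_with_batch_stats(model: nn.Module, batch: torch.Tensor) -> torch.Tensor:
    """Normalize with statistics of the current test batch rather than the
    running averages collected during training."""
    model.eval()
    # Put only the BatchNorm layers in train mode so they use the batch's own
    # mean/variance; momentum=0.0 leaves the stored running buffers untouched
    # (note: this sketch leaves momentum at 0 afterwards).
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            m.train()
            m.momentum = 0.0
    outputs = model(batch)
    model.eval()  # restore standard inference behavior for later calls
    return outputs
```

This only changes behavior at evaluation time, so the trained weights and the saved running statistics are left intact.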
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.