Empirical Upper Bound, Error Diagnosis and Invariance Analysis of Modern
Object Detectors
- URL: http://arxiv.org/abs/2004.02877v1
- Date: Sun, 5 Apr 2020 06:19:43 GMT
- Title: Empirical Upper Bound, Error Diagnosis and Invariance Analysis of Modern
Object Detectors
- Authors: Ali Borji
- Abstract summary: We employ two state-of-the-art object detection benchmarks and analyze more than 15 models over four large-scale datasets.
We find that models generate a lot of boxes on empty regions and that context is more important for detecting small objects than larger ones.
- Score: 47.64219291655723
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object detection remains one of the most notorious open problems in
computer vision. Despite large strides in accuracy in recent years, modern
object detectors have started to saturate on popular benchmarks, raising the
question of how far we can reach with deep learning tools and tricks. Here, by
employing two state-of-the-art object detection benchmarks and analyzing more
than 15 models over four large-scale datasets, we I) carefully determine the
upper bound in AP, which is 91.6% on VOC (test2007), 78.2% on COCO (val2017),
and 58.9% on OpenImages V4 (validation), regardless of the IOU threshold; these
numbers are much better than the mAP of the best model (47.9% on VOC and 46.9%
on COCO; IOUs=.5:.05:.95), II) characterize the sources of errors in object
detectors in a novel and intuitive way, finding that classification error
(confusion with other classes and misses) explains the largest fraction of
errors and weighs more than localization and duplicate errors, and III) analyze
the invariance properties of models when the surrounding context of an object is
removed, when an object is placed in an incongruent background, and when images
are blurred or flipped vertically. We find that models generate a lot of boxes
on empty regions and that context is more important for detecting small objects
than larger ones. Our work taps into the tight relationship between object
detection and object recognition and offers insights for building better
models. Our code is publicly available at
https://github.com/aliborji/Deetctionupperbound.git.
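The mAP figures above follow the standard COCO-style protocol, which averages AP over IoU thresholds from 0.5 to 0.95 in steps of 0.05. The sketch below illustrates that protocol with pycocotools; it is not the paper's own tooling, and the annotation and detection file names are placeholders.

```python
# Minimal sketch (not the paper's own scripts) of the COCO-style evaluation
# behind the IOUs=.5:.05:.95 numbers quoted above. File names are placeholders.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval


def box_iou(a, b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def coco_map(ann_file="instances_val2017.json", det_file="detections.json"):
    """AP averaged over IoU thresholds 0.50:0.05:0.95, i.e. the headline COCO metric."""
    coco_gt = COCO(ann_file)                 # ground-truth annotations
    coco_dt = coco_gt.loadRes(det_file)      # detections in COCO results format
    ev = COCOeval(coco_gt, coco_dt, iouType="bbox")
    ev.evaluate()
    ev.accumulate()
    ev.summarize()                           # prints AP@[.5:.95], AP@.5, AP@.75, ...
    return ev.stats[0]                       # AP@[.5:.95]


if __name__ == "__main__":
    # Identical boxes give IoU 1.0; a detection counts as correct at threshold t if IoU >= t.
    print(box_iou([0, 0, 10, 10], [0, 0, 10, 10]))
```

Running any detector's outputs (or a hypothetical "ideal" set of boxes) through the same routine is what makes the reported upper-bound and best-model numbers directly comparable.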
Related papers
- FADE: Few-shot/zero-shot Anomaly Detection Engine using Large Vision-Language Model [0.9226774742769024]
Few-shot/zero-shot anomaly detection is important for quality inspection in the manufacturing industry.
We propose the Few-shot/zero-shot Anomaly Detection Engine (FADE), which leverages the vision-language CLIP model and adapts it for the purpose of anomaly detection.
FADE outperforms other state-of-the-art methods in anomaly segmentation with pixel-AUROC of 89.6% (91.5%) in zero-shot and 95.4% (97.5%) in 1-normal-shot.
arXiv Detail & Related papers (2024-08-31T23:05:56Z) - Bayesian Detector Combination for Object Detection with Crowdsourced Annotations [49.43709660948812]
Acquiring fine-grained object detection annotations in unconstrained images is time-consuming, expensive, and prone to noise.
We propose a novel Bayesian Detector Combination (BDC) framework to more effectively train object detectors with noisy crowdsourced annotations.
BDC is model-agnostic, requires no prior knowledge of the annotators' skill level, and seamlessly integrates with existing object detection models.
arXiv Detail & Related papers (2024-07-10T18:00:54Z) - YOLC: You Only Look Clusters for Tiny Object Detection in Aerial Images [33.80392696735718]
YOLC (You Only Look Clusters) is an efficient and effective framework that builds on an anchor-free object detector, CenterNet.
To overcome the challenges posed by large-scale images and non-uniform object distribution, we introduce a Local Scale Module (LSM) that adaptively searches for cluster regions and zooms in on them for accurate detection.
We perform extensive experiments on two aerial image datasets, including Visdrone 2019 and UAVDT, to demonstrate the effectiveness and superiority of our proposed approach.
arXiv Detail & Related papers (2024-04-09T10:03:44Z) - Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for
Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z) - Natural Adversarial Objects [10.940015831720144]
We introduce a new dataset, Natural Adversarial Objects (NAO), to evaluate the robustness of object detection models.
NAO contains 7,934 images and 9,943 objects that are unmodified and representative of real-world scenarios.
arXiv Detail & Related papers (2021-11-07T23:42:55Z) - Contemplating real-world object classification [53.10151901863263]
We reanalyze the ObjectNet dataset recently proposed by Barbu et al. containing objects in daily life situations.
We find that applying deep models to the isolated objects, rather than the entire scene as is done in the original paper, results in around 20-30% performance improvement.
arXiv Detail & Related papers (2021-03-08T23:29:59Z) - A Systematic Evaluation of Object Detection Networks for Scientific
Plots [17.882932963813985]
We train and compare the accuracy of various SOTA object detection networks on the PlotQA dataset.
At the standard IOU setting of 0.5, most networks perform well with mAP scores greater than 80% in detecting the relatively simple objects in plots.
However, performance drops drastically when evaluated at a stricter IOU of 0.9, with the best model giving an mAP of 35.70%.
arXiv Detail & Related papers (2020-07-05T05:30:53Z) - Large-Scale Object Detection in the Wild from Imbalanced Multi-Labels [128.77822070156057]
In this work, we quantitatively analyze the label problem in which objects may explicitly or implicitly have multiple labels.
We propose a soft-sampling method with a hybrid training scheduler to deal with the label imbalance.
Our method yields a dramatic improvement of 3.34 points, leading to the best single model with 60.90 mAP on the public object detection test set of Open Images.
arXiv Detail & Related papers (2020-05-18T04:36:36Z) - TACRED Revisited: A Thorough Evaluation of the TACRED Relation
Extraction Task [80.38130122127882]
TACRED is one of the largest, most widely used crowdsourced datasets in Relation Extraction (RE).
In this paper, we investigate the questions: Have we reached a performance ceiling or is there still room for improvement?
We find that label errors account for 8% absolute F1 test error, and that more than 50% of the examples need to be relabeled.
arXiv Detail & Related papers (2020-04-30T15:07:37Z) - Learning Gaussian Maps for Dense Object Detection [1.8275108630751844]
We review common and highly accurate object detection methods on scenes where numerous similar-looking objects are placed in close proximity to each other.
We show that multi-task learning of Gaussian maps along with classification and bounding box regression gives a significant boost in accuracy over the baseline.
Our method also achieves state-of-the-art accuracy on the SKU110K dataset.
arXiv Detail & Related papers (2020-04-24T17:01:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.