Automated Model Evaluation for Object Detection via Prediction Consistency and Reliability
- URL: http://arxiv.org/abs/2508.12082v1
- Date: Sat, 16 Aug 2025 15:39:56 GMT
- Title: Automated Model Evaluation for Object Detection via Prediction Consistency and Reliability
- Authors: Seungju Yoo, Hyuk Kwon, Joong-Won Hwang, Kibok Lee
- Abstract summary: Prediction Consistency and Reliability (PCR) estimates detection performance without ground-truth labels. We construct a meta-dataset by applying image corruptions of varying severity. Results demonstrate that PCR yields more accurate performance estimates than existing AutoEval methods.
- Score: 5.008445480549045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in computer vision have made training object detectors more efficient and effective; however, assessing their performance in real-world applications still relies on costly manual annotation. To address this limitation, we develop an automated model evaluation (AutoEval) framework for object detection. We propose Prediction Consistency and Reliability (PCR), which leverages the multiple candidate bounding boxes that conventional detectors generate before non-maximum suppression (NMS). PCR estimates detection performance without ground-truth labels by jointly measuring 1) the spatial consistency between boxes before and after NMS, and 2) the reliability of the retained boxes via the confidence scores of overlapping boxes. For a more realistic and scalable evaluation, we construct a meta-dataset by applying image corruptions of varying severity. Experimental results demonstrate that PCR yields more accurate performance estimates than existing AutoEval methods, and the proposed meta-dataset covers a wider range of detection performance. The code is available at https://github.com/YonseiML/autoeval-det.
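The core idea of PCR, as described in the abstract, can be illustrated with a minimal sketch: compare each box retained by NMS against the pre-NMS candidates that overlap it, measuring spatial consistency via IoU and reliability via the candidates' confidence scores. This is an illustrative reconstruction from the abstract only, not the authors' implementation; function names, the IoU threshold, and the way the two terms are combined are assumptions.

```python
import numpy as np

def iou(box, boxes):
    # Intersection-over-union between one box and an array of boxes,
    # all in [x1, y1, x2, y2] format.
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def pcr_score(kept_boxes, candidate_boxes, candidate_scores, iou_thr=0.5):
    # For each box retained by NMS, measure (1) spatial consistency with the
    # overlapping pre-NMS candidates and (2) reliability via their confidence
    # scores; average the combined per-box scores over the image.
    per_box = []
    for box in kept_boxes:
        overlaps = iou(box, candidate_boxes)
        mask = overlaps >= iou_thr
        if not mask.any():
            continue
        consistency = overlaps[mask].mean()          # spatial agreement
        reliability = candidate_scores[mask].mean()  # confidence of overlapping boxes
        per_box.append(consistency * reliability)
    return float(np.mean(per_box)) if per_box else 0.0
```

In an AutoEval setting, such a score would be computed over an unlabeled test set and correlated with true detection performance (e.g. mAP) on a labeled meta-dataset; refer to the linked repository for the actual method.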
Related papers
- Autonomous Concept Drift Threshold Determination [29.617054108315546]
Existing drift detection methods focus on designing sensitive test statistics. We observe that model performance is highly sensitive to this threshold. In this paper, we prove that a threshold that adapts over time can outperform any single fixed threshold.
arXiv Detail & Related papers (2025-11-13T04:31:39Z) - Cumulative Consensus Score: Label-Free and Model-Agnostic Evaluation of Object Detectors in Deployment [3.6178660238507843]
Evaluating object detection models in deployment is challenging because ground-truth annotations are rarely available. We introduce the Cumulative Consensus Score (CCS), a label-free metric that enables continuous monitoring and comparison of detectors in real-world settings.
arXiv Detail & Related papers (2025-09-16T09:24:37Z) - An Uncertainty-aware DETR Enhancement Framework for Object Detection [10.102900613370817]
We propose an uncertainty-aware enhancement framework for DETR-based object detectors. We derive a Bayes Risk formulation to filter high-risk information and improve detection reliability. Experiments on the COCO benchmark show that our method can be effectively integrated into existing DETR variants.
arXiv Detail & Related papers (2025-07-20T07:53:04Z) - RoHOI: Robustness Benchmark for Human-Object Interaction Detection [78.18946529195254]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support. We introduce the first robustness benchmark for HOI detection, evaluating model resilience under diverse challenges. Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z) - Open-set object detection: towards unified problem formulation and benchmarking [2.4374097382908477]
We introduce two benchmarks: a unified VOC-COCO evaluation, and the new OpenImagesRoad benchmark which provides clear hierarchical object definition besides new evaluation metrics.
State-of-the-art methods are extensively evaluated on the proposed benchmarks.
This study provides a clear problem definition, ensures consistent evaluations, and draws new conclusions about the effectiveness of OSOD strategies.
arXiv Detail & Related papers (2024-11-08T13:40:01Z) - Uncertainty Estimation for 3D Object Detection via Evidential Learning [63.61283174146648]
We introduce a framework for quantifying uncertainty in 3D object detection by leveraging an evidential learning loss on Bird's Eye View representations in the 3D detector.
We demonstrate both the efficacy and importance of these uncertainty estimates on identifying out-of-distribution scenes, poorly localized objects, and missing (false negative) detections.
arXiv Detail & Related papers (2024-10-31T13:13:32Z) - Rank-DETR for High Quality Object Detection [52.82810762221516]
A highly performant object detector requires accurate ranking for the bounding box predictions.
In this work, we introduce a simple and highly performant DETR-based object detector by proposing a series of rank-oriented designs.
arXiv Detail & Related papers (2023-10-13T04:48:32Z) - Conservative Prediction via Data-Driven Confidence Minimization [70.93946578046003]
In safety-critical applications of machine learning, it is often desirable for a model to be conservative.
We propose the Data-Driven Confidence Minimization framework, which minimizes confidence on an uncertainty dataset.
arXiv Detail & Related papers (2023-06-08T07:05:36Z) - Towards Computational Performance Engineering for Unsupervised Concept Drift Detection -- Complexities, Benchmarking, Performance Analysis [4.720921955899519]
Concept drift detection is crucial for many AI systems to ensure the system's reliability.
These systems often have to deal with large amounts of data or react in real-time.
Drift detectors must therefore satisfy computational requirements and constraints, which calls for a comprehensive performance evaluation.
arXiv Detail & Related papers (2023-04-17T14:39:56Z) - Identifying Out-of-Distribution Samples in Real-Time for Safety-Critical 2D Object Detection with Margin Entropy Loss [0.0]
We present an approach to enable OOD detection for 2D object detection by employing the margin entropy (ME) loss.
A CNN trained with the ME loss significantly outperforms OOD detection using standard confidence scores.
arXiv Detail & Related papers (2022-09-01T11:14:57Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting target accuracy as the fraction of unlabeled examples whose confidence exceeds the threshold.
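The ATC idea summarized above can be sketched in a few lines: pick the threshold on labeled source data so that the fraction of examples above it matches the source accuracy, then apply it to unlabeled target confidences. This is a simplified sketch using max-softmax-style confidence scores; function names are hypothetical, and the original paper also considers other confidence measures.

```python
import numpy as np

def learn_threshold(source_confidences, source_correct):
    # Choose t on labeled source data so that the fraction of examples with
    # confidence above t equals the source accuracy. Since that fraction is
    # monotone decreasing in t, t is the corresponding quantile.
    source_accuracy = source_correct.mean()
    return np.quantile(source_confidences, 1.0 - source_accuracy)

def estimate_accuracy(target_confidences, t):
    # Predicted target accuracy = fraction of unlabeled target examples
    # whose confidence exceeds the learned threshold.
    return float((target_confidences > t).mean())
```

By construction, the estimate on the source distribution matches the source accuracy; the method's value lies in how well this transfers to shifted target distributions.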
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Out-of-Distribution Detection for Automotive Perception [58.34808836642603]
Neural networks (NNs) are widely used for object classification in autonomous driving.
NNs can fail on input data not well represented by the training dataset, known as out-of-distribution (OOD) data.
This paper presents a method for determining whether inputs are OOD, which does not require OOD data during training and does not increase the computational cost of inference.
arXiv Detail & Related papers (2020-11-03T01:46:35Z) - Seeing without Looking: Contextual Rescoring of Object Detections for AP Maximization [4.346179456029563]
We propose to incorporate context in object detection by post-processing the output of an arbitrary detector.
Rescoring is done by conditioning on contextual information from the entire set of detections.
We show that AP can be improved by simply reassigning the detection confidence values.
arXiv Detail & Related papers (2019-12-27T18:56:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.