Related papers: Cumulative Consensus Score: Label-Free and Model-Agnostic Evaluation of Object Detectors in Deployment

Cumulative Consensus Score: Label-Free and Model-Agnostic Evaluation of Object Detectors in Deployment

URL: http://arxiv.org/abs/2509.12871v1
Date: Tue, 16 Sep 2025 09:24:37 GMT
Title: Cumulative Consensus Score: Label-Free and Model-Agnostic Evaluation of Object Detectors in Deployment
Authors: Avinaash Manoharan, Xiangyu Yin, Domenik Helm, Chih-Hong Cheng,
Abstract summary: evaluating object detection models in deployment is challenging because ground-truth annotations are rarely available.<n>We introduce the Cumulative Consensus Score (CCS), a label-free metric that enables continuous monitoring and comparison of detectors in real-world settings.
Score: 3.6178660238507843
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Evaluating object detection models in deployment is challenging because ground-truth annotations are rarely available. We introduce the Cumulative Consensus Score (CCS), a label-free metric that enables continuous monitoring and comparison of detectors in real-world settings. CCS applies test-time data augmentation to each image, collects predicted bounding boxes across augmented views, and computes overlaps using Intersection over Union. Maximum overlaps are normalized and averaged across augmentation pairs, yielding a measure of spatial consistency that serves as a proxy for reliability without annotations. In controlled experiments on Open Images and KITTI, CCS achieved over 90% congruence with F1-score, Probabilistic Detection Quality, and Optimal Correction Cost. The method is model-agnostic, working across single-stage and two-stage detectors, and operates at the case level to highlight under-performing scenarios. Altogether, CCS provides a robust foundation for DevOps-style monitoring of object detectors.

Related papers

From Global to Granular: Revealing IQA Model Performance via Correlation Surface [83.65597122328133]
We present textbfGranularity-Modulated Correlation (GMC), which provides a structured, fine-grained analysis of IQA performance.<n>GMC includes a textbfDistribution Regulator that regularizes correlations to mitigate biases from non-uniform quality distributions.<n>Experiments on standard benchmarks show that GMC reveals performance characteristics invisible to scalar metrics, offering a more informative and reliable paradigm for analyzing, comparing, and deploying IQA models.
arXiv Detail & Related papers (2026-01-29T13:55:26Z)
SMKC: Sketch Based Kernel Correlation Images for Variable Cardinality Time Series Anomaly Detection [0.0]
In operational environments, monitoring systems frequently experience sensor churn.<n>We propose SMKC, a framework that decouples the dynamic input structure from the anomaly detector.<n>We find that a detector using random projections and nearest neighbors on the SMKC representation performs competitively with fully trained baselines.
arXiv Detail & Related papers (2026-01-28T21:15:11Z)
CLIP-Joint-Detect: End-to-End Joint Training of Object Detectors with Contrastive Vision-Language Supervision [0.08699280339422537]
We propose CLIP-Joint-Detect, a framework that integrates CLIP-style contrastive vision-language supervision through end-to-end joint training.<n>A lightweight parallel head projects region or grid features into the CLIP embedding space and aligns them with learnable class-specific text embeddings via InfoNCE contrastive loss and an auxiliary cross-entropy term.<n>We validate it on Pascal VOC 2007+2012 using Faster R-CNN and on the large-scale MS 2017 benchmark using modern YOLO detectors (YOLOv11)
arXiv Detail & Related papers (2025-12-28T15:21:20Z)
Automated Model Evaluation for Object Detection via Prediction Consistency and Reliablity [5.008445480549045]
Prediction Consistency and Reliability (PCR) estimates detection performance without ground-truth labels.<n>We construct a meta-dataset by applying image corruptions of varying severity.<n>Results demonstrate that PCR yields more accurate performance estimates than existing AutoEval methods.
arXiv Detail & Related papers (2025-08-16T15:39:56Z)
Prototype-Guided Pseudo-Labeling with Neighborhood-Aware Consistency for Unsupervised Adaptation [12.829638461740759]
In unsupervised adaptation for vision-language models such as CLIP, pseudo-labels from zero-shot predictions often exhibit significant noise.<n>We propose a novel adaptive pseudo-labeling framework that enhances CLIP's adaptation performance by integrating prototype consistency and neighborhood-based consistency.<n>Our method achieves state-of-the-art performance in unsupervised adaptation scenarios, delivering more accurate pseudo-labels while maintaining computational efficiency.
arXiv Detail & Related papers (2025-07-22T19:08:24Z)
ConformalSAM: Unlocking the Potential of Foundational Segmentation Models in Semi-Supervised Semantic Segmentation with Conformal Prediction [57.930531826380836]
This work explores whether a foundational segmentation model can address label scarcity in the pixel-level vision task as an annotator for unlabeled images.<n>We propose ConformalSAM, a novel SSSS framework which first calibrates the foundation model using the target domain's labeled data and then filters out unreliable pixel labels of unlabeled data.
arXiv Detail & Related papers (2025-07-21T17:02:57Z)
Multi-clue Consistency Learning to Bridge Gaps Between General and Oriented Object in Semi-supervised Detection [26.486535389258965]
We experimentally find three gaps between general and oriented object detection in semi-supervised learning. We propose a Multi-clue Consistency Learning (MCL) framework to bridge these gaps. Our proposed MCL can achieve state-of-the-art performance in the semi-supervised oriented object detection task.
arXiv Detail & Related papers (2024-07-08T13:14:25Z)
Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification [2.1223532600703385]
This paper presents an innovative disjoint sampling approach for training SOTA models on Hyperspectral image classification (HSIC) tasks. By separating training, validation, and test data without overlap, the proposed method facilitates a fairer evaluation of how well a model can classify pixels it was not exposed to during training or validation. This rigorous methodology is critical for advancing SOTA models and their real-world application to large-scale land mapping with Hyperspectral sensors.
arXiv Detail & Related papers (2024-04-23T11:40:52Z)
Semi-DETR: Semi-Supervised Object Detection with Detection Transformers [105.45018934087076]
We analyze the DETR-based framework on semi-supervised object detection (SSOD) We present Semi-DETR, the first transformer-based end-to-end semi-supervised object detector. Our method outperforms all state-of-the-art methods by clear margins.
arXiv Detail & Related papers (2023-07-16T16:32:14Z)
Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models. In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
Generalised Co-Salient Object Detection [50.876864826216924]
We propose a new setting that relaxes an assumption in the conventional Co-Salient Object Detection (CoSOD) setting. We call this new setting Generalised Co-Salient Object Detection (GCoSOD) We propose a novel random sampling based Generalised CoSOD Training (GCT) strategy to distill the awareness of inter-image absence of co-salient objects into CoSOD models.
arXiv Detail & Related papers (2022-08-20T12:23:32Z)
WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD) An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images. The proposed framework demonstrates remarkable performance on PASCAL-VOC and MSCOCO benchmark, achieving a high performance comparable to those obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z)
Probabilistic Ranking-Aware Ensembles for Enhanced Object Detections [50.096540945099704]
We propose a novel ensemble called the Probabilistic Ranking Aware Ensemble (PRAE) that refines the confidence of bounding boxes from detectors. We also introduce a bandit approach to address the confidence imbalance problem caused by the need to deal with different numbers of boxes.
arXiv Detail & Related papers (2021-05-07T09:37:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.