Cal-DETR: Calibrated Detection Transformer
- URL: http://arxiv.org/abs/2311.03570v1
- Date: Mon, 6 Nov 2023 22:13:10 GMT
- Title: Cal-DETR: Calibrated Detection Transformer
- Authors: Muhammad Akhtar Munir, Salman Khan, Muhammad Haris Khan, Mohsen Ali,
Fahad Shahbaz Khan
- Abstract summary: We propose a mechanism for calibrated detection transformers (Cal-DETR), particularly for Deformable-DETR, UP-DETR and DINO.
We develop an uncertainty-guided logit modulation mechanism that leverages the uncertainty to modulate the class logits.
Results corroborate the effectiveness of Cal-DETR against the competing train-time methods in calibrating both in-domain and out-domain detections.
- Score: 67.75361289429013
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Albeit revealing impressive predictive performance for several computer
vision tasks, deep neural networks (DNNs) are prone to making overconfident
predictions. This limits the adoption and wider utilization of DNNs in many
safety-critical applications. There have been recent efforts toward calibrating
DNNs; however, almost all of them focus on the classification task.
Surprisingly, very little attention has been devoted to calibrating modern
DNN-based object detectors, especially detection transformers, which have
recently demonstrated promising detection performance and are influential in
many decision-making systems. In this work, we address the problem by proposing
a mechanism for calibrated detection transformers (Cal-DETR), particularly for
Deformable-DETR, UP-DETR and DINO. We pursue the train-time calibration route
and make the following contributions. First, we propose a simple yet effective
approach for quantifying uncertainty in transformer-based object detectors.
Second, we develop an uncertainty-guided logit modulation mechanism that
leverages the uncertainty to modulate the class logits. Third, we develop a
logit mixing approach that acts as a regularizer with detection-specific losses
and is also complementary to the uncertainty-guided logit modulation technique
to further improve the calibration performance. Lastly, we conduct extensive
experiments across three in-domain and four out-domain scenarios. Results
corroborate the effectiveness of Cal-DETR against the competing train-time
methods in calibrating both in-domain and out-domain detections while
maintaining or even improving the detection performance. Our codebase and
pre-trained models can be accessed at
https://github.com/akhtarvision/cal-detr.
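The abstract names an uncertainty-guided logit modulation but does not give its formula, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that uncertainty is the variance of each class logit across decoder layers, that this variance is normalized to [0, 1], and that the final-layer logits are scaled by tanh(1 - u). The function name `modulate_logits` and the nested-list layout are hypothetical; the linked codebase is the reference for the actual method.

```python
import math
from statistics import pvariance


def modulate_logits(layer_logits):
    """Damp final-layer class logits where decoder layers disagree.

    layer_logits[l][q][c] is the class logit from decoder layer l,
    object query q, class c (hypothetical layout for illustration).
    Returns modulated final-layer logits indexed as [q][c].
    """
    final = layer_logits[-1]
    num_queries = len(final)
    num_classes = len(final[0])

    # Assumption: uncertainty is the variance of a logit across decoder layers.
    variance = [
        [pvariance([layer[q][c] for layer in layer_logits])
         for c in range(num_classes)]
        for q in range(num_queries)
    ]

    # Assumption: normalize by the largest variance so uncertainty lies in [0, 1].
    max_var = max(max(row) for row in variance)
    scale = max_var if max_var > 0 else 1.0

    # Assumption: scale each logit by tanh(1 - u); higher uncertainty
    # shrinks the logit toward zero, lowering the resulting confidence.
    return [
        [final[q][c] * math.tanh(1.0 - variance[q][c] / scale)
         for c in range(num_classes)]
        for q in range(num_queries)
    ]


# Three decoder layers, one query, two classes.
# Class 0 is stable across layers; class 1 changes a lot.
example = [
    [[2.0, 0.0]],
    [[2.0, 2.0]],
    [[2.0, 4.0]],
]
modulated = modulate_logits(example)
```

In this toy input the stable class keeps most of its logit, while the class whose logit varies most across layers is suppressed entirely.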
Related papers
- On Calibration of Object Detectors: Pitfalls, Evaluation and Baselines [15.306933156466522]
Reliable usage of object detectors requires them to be calibrated.
Recent approaches involve designing new loss functions to obtain calibrated detectors by training them from scratch.
We propose a principled evaluation framework to jointly measure calibration and accuracy of object detectors.
arXiv Detail & Related papers (2024-05-30T20:12:14Z)
- Beyond Classification: Definition and Density-based Estimation of Calibration in Object Detection [15.71719154574049]
We tackle the challenge of defining and estimating calibration error for deep neural networks (DNNs).
In particular, we adapt the definition of classification calibration error to handle the nuances associated with object detection.
We propose a consistent and differentiable estimator of the detection calibration error, utilizing kernel density estimation.
arXiv Detail & Related papers (2023-12-11T18:57:05Z)
- Rank-DETR for High Quality Object Detection [52.82810762221516]
A highly performant object detector requires accurate ranking for the bounding box predictions.
In this work, we introduce a simple and highly performant DETR-based object detector by proposing a series of rank-oriented designs.
arXiv Detail & Related papers (2023-10-13T04:48:32Z)
- Multiclass Confidence and Localization Calibration for Object Detection [4.119048608751183]
Deep neural networks (DNNs) tend to make overconfident predictions, rendering them poorly calibrated.
We propose a new train-time technique for calibrating modern object detection methods.
arXiv Detail & Related papers (2023-06-14T06:14:16Z)
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accurateness of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in and out-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
- Towards Improving Calibration in Object Detection Under Domain Shift [9.828212203380133]
We study the calibration of current object detection models, particularly under domain shift.
We introduce a plug-and-play train-time calibration loss for object detection.
We also devise a new uncertainty mechanism for object detection which can implicitly calibrate the commonly used self-training based domain adaptive detectors.
arXiv Detail & Related papers (2022-09-15T20:32:28Z)
- On the Dark Side of Calibration for Modern Neural Networks [65.83956184145477]
We show the breakdown of expected calibration error (ECE) into predicted confidence and refinement.
We highlight that regularisation based calibration only focuses on naively reducing a model's confidence.
We find that many calibration approaches, such as label smoothing and mixup, lower the utility of a DNN by degrading its refinement.
arXiv Detail & Related papers (2021-06-17T11:04:14Z)
- Rethinking Transformer-based Set Prediction for Object Detection [57.7208561353529]
Experimental results show that the proposed methods not only converge much faster than the original DETR, but also significantly outperform DETR and other baselines in terms of detection accuracy.
arXiv Detail & Related papers (2020-11-21T21:59:42Z)
- SADet: Learning An Efficient and Accurate Pedestrian Detector [68.66857832440897]
This paper proposes a series of systematic optimization strategies for the detection pipeline of a one-stage detector.
It forms a single shot anchor-based detector (SADet) for efficient and accurate pedestrian detection.
Though structurally simple, it presents state-of-the-art results and a real-time speed of 20 FPS for VGA-resolution images.
arXiv Detail & Related papers (2020-07-26T12:32:38Z)
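
Several entries above refer to calibration error, and one names the expected calibration error (ECE) explicitly. As a rough illustration of what such a metric measures, here is a minimal sketch of the standard binned ECE: predictions are grouped by confidence, and the gap between mean confidence and accuracy in each bin is averaged with weights proportional to bin size. The function name and the equal-width binning with 10 bins are illustrative choices, not taken from any of the listed papers. For detection, `correct` would typically mean that a predicted box matches a ground-truth box of the same class above some IoU threshold, which is an assumption here; detection-specific variants differ in their details.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Binned ECE: weighted mean of |accuracy - mean confidence| per bin.

    confidences: predicted confidence scores in [0, 1].
    correct: booleans, True where the prediction counts as correct.
    """
    n = len(confidences)
    if n == 0:
        return 0.0

    # Equal-width bins over [0, 1]; a confidence of exactly 1.0 goes in the last bin.
    bins = [[] for _ in range(num_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * num_bins), num_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, hit in members if hit) / len(members)
        # Each bin contributes in proportion to the share of predictions it holds.
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Overconfident example: 90% confidence but only half the predictions are correct.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, True, False, False]
)
```

A well-calibrated model drives this value toward zero; an overconfident one, as in the example, shows a large gap between stated confidence and observed accuracy.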
This list is automatically generated from the titles and abstracts of the papers in this site.