Related papers: BEA: Revisiting anchor-based object detection DNN using Budding Ensemble Architecture

URL: http://arxiv.org/abs/2309.08036v4
Date: Fri, 10 Nov 2023 12:01:22 GMT
Title: BEA: Revisiting anchor-based object detection DNN using Budding Ensemble Architecture
Authors: Syed Sha Qutub and Neslihan Kose and Rafael Rosales and Michael Paulitsch and Korbinian Hagn and Florian Geissler and Yang Peng and Gereon Hinz and Alois Knoll
Abstract summary: Budding Ensemble Architecture (BEA) is a novel reduced ensemble architecture for anchor-based object detection models. The proposed loss functions in BEA improve the confidence score calibration and lower the uncertainty error.
Score: 8.736601342033431
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: This paper introduces the Budding Ensemble Architecture (BEA), a novel reduced ensemble architecture for anchor-based object detection models. Object detection models are crucial in vision-based tasks, particularly in autonomous systems. They should provide precise bounding box detections while also calibrating their predicted confidence scores, leading to higher-quality uncertainty estimates. However, current models may make erroneous decisions due to false positives receiving high scores or true positives being discarded due to low scores. BEA aims to address these issues. The proposed loss functions in BEA improve the confidence score calibration and lower the uncertainty error, which results in a better distinction of true and false positives and, eventually, higher accuracy of the object detection models. Both Base-YOLOv3 and SSD models were enhanced using the BEA method and its proposed loss functions. The BEA on Base-YOLOv3 trained on the KITTI dataset results in a 6% and 3.7% increase in mAP and AP50, respectively. Utilizing a well-balanced uncertainty estimation threshold to discard samples in real-time even leads to a 9.6% higher AP50 than its base model. This is attributed to a 40% increase in the area under the AP50-based retention curve used to measure the quality of calibration of confidence scores. Furthermore, BEA-YOLOV3 trained on KITTI provides superior out-of-distribution detection on Citypersons, BDD100K, and COCO datasets compared to the ensembles and vanilla models of YOLOv3 and Gaussian-YOLOv3.
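
The abstract's notion of calibrated confidence scores can be illustrated with a standard metric. The sketch below computes the Expected Calibration Error (ECE) over equal-width confidence bins. It is not the AP50-based retention curve that the BEA paper uses; it is a different, widely used calibration measure shown only for illustration. The function name, the `n_bins` parameter, and the convention that each detection carries a boolean `correct` flag (true positive or not) are assumptions, not taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard ECE: bin-size-weighted mean of |accuracy - mean confidence|.

    confidences: floats in [0, 1], one per detection (assumed input format).
    correct:     booleans, True if the detection counts as a true positive.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Assign each detection to an equal-width confidence bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, hit in members if hit) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece


# Detections scored 0.8 that are right 4 times out of 5 are well calibrated.
well_calibrated = expected_calibration_error([0.8] * 5, [True, True, True, True, False])
# Detections scored 0.9 that are right only half the time are overconfident.
overconfident = expected_calibration_error([0.9] * 4, [True, True, False, False])
```

A lower value means confidence tracks accuracy more closely. In the overconfident case, the gap between the 0.9 mean confidence and the 0.5 accuracy makes up the whole score.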

Related papers

Bayesian Self-Distillation for Image Classification [6.446179861303341]
Supervised training of deep neural networks for classification typically relies on hard targets, which promote overconfidence and can limit calibration and robustness. Self-distillation methods aim to mitigate this by leveraging inter-class and sample-specific information present in the model's own predictions, but often remain dependent on hard targets, reducing their effectiveness. We propose Bayesian Self-Distillation (BSD), a principled method for constructing sample-specific target distributions via Bayesian inference using the model's own predictions. BSD consistently yields lower Expected Calibration Error (ECE) (-40%) than existing architecture-preserving self-
arXiv Detail & Related papers (2025-12-30T11:48:06Z)
Decomposing LLM Self-Correction: The Accuracy-Correction Paradox and Error Depth Hypothesis [6.901585308625979]
We decompose self-correction into three sub-capabilities: error detection, error localization, and error correction. Our findings challenge linear assumptions about model capability and self-improvement.
arXiv Detail & Related papers (2025-12-24T21:51:24Z)
Do Large Language Models Know What They Don't Know? Kalshibench: A New Benchmark for Evaluating Epistemic Calibration via Prediction Markets [0.0]
A well-calibrated model should express confidence that matches its actual accuracy -- when it claims 80% confidence, it should be correct 80% of the time. We introduce KalshiBench, a benchmark of 300 prediction market questions from Kalshi, a CFTC-regulated exchange. We evaluate five frontier models -- Claude Opus 4.5, GPT-5.2, DeepSeek-V3.2, Qwen3-235B, and Kimi-K2 -- and find systematic overconfidence across all models.
arXiv Detail & Related papers (2025-12-17T23:23:06Z)
LLMEval-3: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models [51.55869466207234]
Existing evaluation of Large Language Models (LLMs) on static benchmarks is vulnerable to data contamination and leaderboard overfitting. We introduce LLMEval-3, a framework for dynamic evaluation of LLMs. LLMEval-3 is built on a proprietary bank of 220k graduate-level questions, from which it dynamically samples unseen test sets for each evaluation run.
arXiv Detail & Related papers (2025-08-07T14:46:30Z)
RoHOI: Robustness Benchmark for Human-Object Interaction Detection [84.78366452133514]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support. We introduce the first benchmark for HOI detection, evaluating model resilience under diverse challenges. Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z)
VADER: A Human-Evaluated Benchmark for Vulnerability Assessment, Detection, Explanation, and Remediation [0.8087612190556891]
VADER comprises 174 real-world software vulnerabilities, each carefully curated from GitHub and annotated by security experts. For each vulnerability case, models are tasked with identifying the flaw, classifying it using Common Weakness Enumeration (CWE), explaining its underlying cause, proposing a patch, and formulating a test plan. Using a one-shot prompting strategy, we benchmark six state-of-the-art LLMs (Claude 3.7 Sonnet, Gemini 2.5 Pro, GPT-4.1, GPT-4.5, Grok 3 Beta, and o3) on VADER. Our results show that current state-of-the-
arXiv Detail & Related papers (2025-05-26T01:20:44Z)
Robust Fine-tuning of Zero-shot Models via Variance Reduction [56.360865951192324]
When fine-tuning zero-shot models, our desideratum is for the fine-tuned model to excel in both in-distribution (ID) and out-of-distribution (OOD). We propose a sample-wise ensembling technique that can simultaneously attain the best ID and OOD accuracy without the trade-offs.
arXiv Detail & Related papers (2024-11-11T13:13:39Z)
Uncertainty Estimation for 3D Object Detection via Evidential Learning [63.61283174146648]
We introduce a framework for quantifying uncertainty in 3D object detection by leveraging an evidential learning loss on Bird's Eye View representations in the 3D detector. We demonstrate both the efficacy and importance of these uncertainty estimates on identifying out-of-distribution scenes, poorly localized objects, and missing (false negative) detections.
arXiv Detail & Related papers (2024-10-31T13:13:32Z)
Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous Driving [55.93813178692077]
We present RoboBEV, an extensive benchmark suite designed to evaluate the resilience of BEV algorithms. We assess 33 state-of-the-art BEV-based perception models spanning tasks like detection, map segmentation, depth estimation, and occupancy prediction. Our experimental results also underline the efficacy of strategies like pre-training and depth-free BEV transformations in enhancing robustness against out-of-distribution data.
arXiv Detail & Related papers (2024-05-27T17:59:39Z)
Producing Plankton Classifiers that are Robust to Dataset Shift [1.716364772047407]
We integrate the ZooLake dataset with manually annotated images from 10 independent days of deployment to benchmark Out-Of-Dataset (OOD) performance. We propose a preemptive assessment method to identify potential pitfalls when classifying new data, and pinpoint features in OOD images that adversely impact classification. We find that ensembles of BEiT vision transformers, with targeted augmentations addressing OOD robustness, geometric ensembling, and rotation-based test-time augmentation, constitute the most robust model, which we call BEsT model.
arXiv Detail & Related papers (2024-01-25T15:47:18Z)
Towards Calibrated Robust Fine-Tuning of Vision-Language Models [97.19901765814431]
This work proposes a robust fine-tuning method that improves both OOD accuracy and confidence calibration simultaneously in vision language models. We show that both OOD classification and OOD calibration errors have a shared upper bound consisting of two terms of ID data. Based on this insight, we design a novel framework that conducts fine-tuning with a constrained multimodal contrastive loss enforcing a larger smallest singular value.
arXiv Detail & Related papers (2023-11-03T05:41:25Z)
A Computer Vision Enabled damage detection model with improved YOLOv5 based on Transformer Prediction Head [0.0]
Current state-of-the-art deep learning (DL)-based damage detection models often lack superior feature extraction capability in complex and noisy environments. DenseSPH-YOLOv5 is a real-time DL-based high-performance damage detection model where DenseNet blocks have been integrated with the backbone. DenseSPH-YOLOv5 obtains a mean average precision (mAP) value of 85.25 %, F1-score of 81.18 %, and precision (P) value of 89.51 % outperforming current state-of-the-art models.
arXiv Detail & Related papers (2023-03-07T22:53:36Z)
What Can We Learn From The Selective Prediction And Uncertainty Estimation Performance Of 523 Imagenet Classifiers [15.929238800072195]
We present a novel study of selective prediction and the uncertainty estimation performance of 523 existing pretrained deep ImageNet classifiers. We find that distillation-based training regimes consistently yield better uncertainty estimations than other training schemes. For example, we discovered an unprecedented 99% top-1 selective accuracy on ImageNet at 47% coverage.
arXiv Detail & Related papers (2023-02-23T09:25:28Z)
Improving Visual Grounding by Encouraging Consistent Gradient-based Explanations [58.442103936918805]
We show that Attention Mask Consistency (AMC) produces better visual grounding results than previous methods. AMC is effective, easy to implement, and general, as it can be adopted by any vision-language model.
arXiv Detail & Related papers (2022-06-30T17:55:12Z)
Localization Uncertainty-Based Attention for Object Detection [8.154943252001848]
We propose a more efficient uncertainty-aware dense detector (UADET) that predicts four-directional localization uncertainties via Gaussian modeling. Experiments using the MS COCO benchmark show that our UADET consistently surpasses baseline FCOS, and that our best model, ResNext-64x4d-101-DCN, obtains a single model, single-scale AP of 48.3% on COCO test-dev.
arXiv Detail & Related papers (2021-08-25T04:32:39Z)
Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation. We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.