Is BatchEnsemble a Single Model? On Calibration and Diversity of Efficient Ensembles
- URL: http://arxiv.org/abs/2601.16936v1
- Date: Fri, 23 Jan 2026 17:50:50 GMT
- Title: Is BatchEnsemble a Single Model? On Calibration and Diversity of Efficient Ensembles
- Authors: Anton Zamyatin, Patrick Indri, Sagar Malhotra, Thomas Gärtner
- Abstract summary: BatchEnsemble aims to deliver ensemble-like epistemic uncertainty (EU) at far lower parameter and memory cost. We show that BatchEnsemble not only underperforms Deep Ensembles but closely tracks a single-model baseline.
- Score: 2.957223821964636
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In resource-constrained and low-latency settings, uncertainty estimates must be efficiently obtained. Deep Ensembles provide robust epistemic uncertainty (EU) but require training multiple full-size models. BatchEnsemble aims to deliver ensemble-like EU at far lower parameter and memory cost by applying learned rank-1 perturbations to a shared base network. We show that BatchEnsemble not only underperforms Deep Ensembles but closely tracks a single model baseline in terms of accuracy, calibration and out-of-distribution (OOD) detection on CIFAR10/10C/SVHN. A controlled study on MNIST finds members are near-identical in function and parameter space, indicating limited capacity to realize distinct predictive modes. Thus, BatchEnsemble behaves more like a single model than a true ensemble.
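The rank-1 mechanism mentioned in the abstract can be stated compactly: member i replaces the shared weight matrix W with the elementwise product of W and the rank-1 matrix r_i s_i^T, so each member adds only two extra vectors per layer. Below is a minimal NumPy sketch of such a layer, assuming a plain dense layer with illustrative names and toy shapes; it is a sketch of the general mechanism, not the paper's code.

```python
import numpy as np

def batch_ensemble_dense(x, W, b, r, s, member):
    """One BatchEnsemble-style dense layer (illustrative, not the paper's code).

    x      : (batch, d_in)  input
    W      : (d_in, d_out)  shared weight matrix
    b      : (d_out,)       shared bias
    r      : (M, d_in)      per-member input scaling vectors
    s      : (M, d_out)     per-member output scaling vectors
    member : index of the ensemble member to evaluate

    Member i effectively uses W_i = W * outer(r[i], s[i]), computed
    cheaply as ((x * r[i]) @ W) * s[i] without materialising W_i.
    """
    return ((x * r[member]) @ W) * s[member] + b

rng = np.random.default_rng(0)
d_in, d_out, M = 8, 4, 3                     # assumed toy sizes, 3 members
W = rng.normal(size=(d_in, d_out))
b = np.zeros(d_out)
r = 1.0 + 0.1 * rng.normal(size=(M, d_in))   # rank-1 factors initialised near 1
s = 1.0 + 0.1 * rng.normal(size=(M, d_out))

x = rng.normal(size=(2, d_in))
outputs = np.stack([batch_ensemble_dense(x, W, b, r, s, i) for i in range(M)])
ensemble_mean = outputs.mean(axis=0)         # average members for the prediction
```

Averaging the member outputs gives the ensemble prediction; the paper's finding is that, in practice, the members produced this way end up nearly identical in function, which is why the ensemble behaves like a single model.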
Related papers
- DD-CAM: Minimal Sufficient Explanations for Vision Models Using Delta Debugging [1.7195886774107125]
We introduce a gradient-free framework for identifying minimal, sufficient, and decision-preserving explanations in vision models. Our approach, DD-CAM, identifies a 1-minimal subset whose joint activation suffices to preserve a prediction. We generate minimal, prediction-preserving saliency maps that highlight only the most essential features.
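For intuition on what a 1-minimal subset means here, the sketch below shows a generic greedy delta-debugging-style reduction: it shrinks a set of elements until removing any single remaining element breaks a user-supplied predicate. This is a simplified illustration of the concept, not the authors' DD-CAM procedure, and `prediction_preserved` is a hypothetical callback standing in for "does the model keep its prediction when only these features are active".

```python
def one_minimal_subset(elements, prediction_preserved):
    """Greedily shrink `elements` to a 1-minimal subset: removing any single
    remaining element makes `prediction_preserved` fail.

    Simplified delta-debugging-style loop for illustration only.
    """
    current = list(elements)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if prediction_preserved(candidate):   # element i was not needed
                current = candidate
                changed = True
                break                             # restart over the smaller set
    return current

# Hypothetical toy usage: keep the fewest "features" whose sum still reaches 10.
features = [1, 2, 3, 4, 5, 6]
minimal = one_minimal_subset(features, lambda subset: sum(subset) >= 10)
print(minimal)   # [5, 6]
```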
arXiv Detail & Related papers (2026-02-22T17:12:31Z) - Making Foundation Models Probabilistic via Singular Value Ensembles [56.4174499669573]
Foundation models have become a dominant paradigm in machine learning, achieving remarkable performance across diverse tasks through large-scale pretraining. The standard approach to quantifying uncertainty, training an ensemble of independent models, incurs prohibitive computational costs that scale linearly with ensemble size. We propose Singular Value Ensemble (SVE), a parameter-efficient implicit ensemble method that builds on a simple but powerful core assumption. We show that SVE achieves uncertainty quantification comparable to that of explicit deep ensembles while increasing the parameter count of the base model by less than 1%.
arXiv Detail & Related papers (2026-01-29T18:07:18Z) - LoRA-Ensemble: Efficient Uncertainty Modelling for Self-Attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient ensembling method for self-attention networks. The method not only outperforms state-of-the-art implicit techniques like BatchEnsemble, but even matches or exceeds the accuracy of an Explicit Ensemble.
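As background, LoRA-style ensembling generally attaches small per-member low-rank adapters to a shared, frozen weight matrix, so member i uses W + A_i B_i. The NumPy sketch below illustrates that general mechanism under assumed toy shapes and member counts; it is not the LoRA-Ensemble authors' implementation, and the final averaging step is just one common way to combine members.

```python
import numpy as np

def lora_member_forward(x, W, A, B, member, alpha=1.0):
    """Forward pass of one ensemble member through a LoRA-style adapted layer.

    x : (batch, d_in)    input
    W : (d_in, d_out)    shared, frozen weight matrix
    A : (M, d_in, rank)  per-member down-projections
    B : (M, rank, d_out) per-member up-projections

    Only A and B are member-specific; member i effectively uses
    W + alpha * A[member] @ B[member].
    """
    return x @ W + alpha * (x @ A[member]) @ B[member]

rng = np.random.default_rng(0)
d_in, d_out, rank, M = 16, 8, 2, 4            # assumed toy sizes, 4 members
W = rng.normal(size=(d_in, d_out))
A = 0.01 * rng.normal(size=(M, d_in, rank))
B = np.zeros((M, rank, d_out))                # adapters start at zero, LoRA-style

x = rng.normal(size=(3, d_in))
outputs = np.stack([lora_member_forward(x, W, A, B, i) for i in range(M)])
ensemble_mean = outputs.mean(axis=0)          # one common way to combine members
```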
arXiv Detail & Related papers (2024-05-23T11:10:32Z) - Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy while requiring no exemplar buffer and only 1.02x the parameters of the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z) - Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting [42.59091710435927]
Uncertainty estimation is crucial for machine learning models to detect out-of-distribution (OOD) inputs.
In this work, we improve on uncertainty estimation without extra OOD data or additional inference costs using an alternative Split-Ensemble method.
arXiv Detail & Related papers (2023-12-14T17:18:44Z) - Automatic Mixed-Precision Quantization Search of BERT [62.65905462141319]
Pre-trained language models such as BERT have shown remarkable effectiveness in various natural language processing tasks.
These models usually contain millions of parameters, which prevents them from practical deployment on resource-constrained devices.
We propose an automatic mixed-precision quantization framework designed for BERT that can simultaneously conduct quantization and pruning in a subgroup-wise level.
arXiv Detail & Related papers (2021-12-30T06:32:47Z) - Sparse MoEs meet Efficient Ensembles [49.313497379189315]
We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixtures of experts (sparse MoEs).
We present Efficient Ensemble of Experts (E$^3$), a scalable and simple ensemble of sparse MoEs that takes the best of both classes of models, while using up to 45% fewer FLOPs than a deep ensemble.
arXiv Detail & Related papers (2021-10-07T11:58:35Z) - Search What You Want: Barrier Penalty NAS for Mixed Precision Quantization [51.26579110596767]
We propose a novel Barrier Penalty based NAS (BP-NAS) for mixed precision quantization.
BP-NAS sets a new state of the art on both classification (CIFAR-10, ImageNet) and detection (COCO).
arXiv Detail & Related papers (2020-07-20T12:00:48Z)