Extracting Uncertainty Estimates from Mixtures of Experts for Semantic Segmentation
- URL: http://arxiv.org/abs/2509.04816v1
- Date: Fri, 05 Sep 2025 05:30:53 GMT
- Title: Extracting Uncertainty Estimates from Mixtures of Experts for Semantic Segmentation
- Authors: Svetlana Pavlitska, Beyza Keskin, Alwin Faßbender, Christian Hubschneider, J. Marius Zöllner
- Abstract summary: We show that well-calibrated predictive uncertainty estimates can be extracted from a mixture of experts (MoE) without architectural modifications. Our results show that MoEs yield more reliable uncertainty estimates than ensembles in terms of conditional correctness metrics. Our experiments on the Cityscapes dataset suggest that increasing the number of experts can further enhance uncertainty calibration.
- Score: 9.817102014355617
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating accurate and well-calibrated predictive uncertainty is important for enhancing the reliability of computer vision models, especially in safety-critical applications like traffic scene perception. While ensemble methods are commonly used to quantify uncertainty by combining multiple models, a mixture of experts (MoE) offers an efficient alternative by leveraging a gating network to dynamically weight expert predictions based on the input. Building on the promising use of MoEs for semantic segmentation in our previous works, we show that well-calibrated predictive uncertainty estimates can be extracted from MoEs without architectural modifications. We investigate three methods to extract predictive uncertainty estimates: predictive entropy, mutual information, and expert variance. We evaluate these methods for an MoE with two experts trained on a semantic split of the A2D2 dataset. Our results show that MoEs yield more reliable uncertainty estimates than ensembles in terms of conditional correctness metrics under out-of-distribution (OOD) data. Additionally, we evaluate routing uncertainty computed via gate entropy and find that simple gating mechanisms lead to better calibration of routing uncertainty estimates than more complex classwise gates. Finally, our experiments on the Cityscapes dataset suggest that increasing the number of experts can further enhance uncertainty calibration. Our code is available at https://github.com/KASTEL-MobilityLab/mixtures-of-experts/.
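The three prediction-based measures named in the abstract, plus the gate-entropy routing uncertainty, can be sketched for a single pixel as follows. This is a minimal NumPy illustration of the standard decomposition (predictive entropy = expected expert entropy + mutual information); the function and variable names are ours, not taken from the paper's code.

```python
import numpy as np

def moe_uncertainties(expert_probs, gate_weights, eps=1e-12):
    """Per-pixel uncertainty measures for a mixture of experts.

    expert_probs: (E, C) array of per-expert softmax class probabilities
    gate_weights: (E,) gating weights, assumed to sum to 1
    """
    # Mixture prediction: gate-weighted average of the expert outputs
    mixture = gate_weights @ expert_probs                         # (C,)

    # Predictive entropy: total uncertainty of the mixture prediction
    pred_entropy = -np.sum(mixture * np.log(mixture + eps))

    # Expected entropy of the individual experts
    expert_entropy = -np.sum(expert_probs * np.log(expert_probs + eps), axis=1)
    expected_entropy = gate_weights @ expert_entropy

    # Mutual information: disagreement between experts
    mutual_info = pred_entropy - expected_entropy

    # Expert variance: gate-weighted variance of expert probabilities,
    # summed over classes
    expert_variance = (gate_weights @ (expert_probs - mixture) ** 2).sum()

    # Routing uncertainty: entropy of the gate distribution itself
    gate_entropy = -np.sum(gate_weights * np.log(gate_weights + eps))

    return pred_entropy, mutual_info, expert_variance, gate_entropy
```

When both experts agree, mutual information and expert variance vanish; when they disagree, both grow while the expected per-expert entropy stays low, which is what makes these quantities useful for separating expert disagreement from inherent class ambiguity.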
Related papers
- Bayesian Mixture of Experts For Large Language Models [2.889541910837398]
We present a post-hoc uncertainty estimation framework for large language models (LLMs) based on Mixture-of-Experts architectures. Bayesian-MoE applies a structured Laplace approximation to the second linear layer of each expert, enabling calibrated uncertainty estimation. Experiments on common-sense reasoning benchmarks with Qwen1.5-MoE and DeepSeek-MoE demonstrate that Bayesian-MoE improves both expected calibration error (ECE) and negative log-likelihood (NLL) over baselines.
arXiv Detail & Related papers (2025-11-12T04:24:20Z)
- FIVA: Federated Inverse Variance Averaging for Universal CT Segmentation with Uncertainty Estimation [4.544160712377809]
This work presents a novel federated learning approach to achieve universal segmentation across diverse abdominal CT datasets. The proposed method quantifies prediction uncertainty by propagating the uncertainty from the model weights. Experimental evaluations demonstrate the effectiveness of this approach in improving the quality of federated aggregation and uncertainty-weighted inference.
arXiv Detail & Related papers (2025-08-08T11:34:01Z)
- SConU: Selective Conformal Uncertainty in Large Language Models [59.25881667640868]
We propose a novel approach termed Selective Conformal Uncertainty (SConU). We develop two conformal p-values that are instrumental in determining whether a given sample deviates from the uncertainty distribution of the calibration set at a specific manageable risk level. Our approach not only facilitates rigorous management of miscoverage rates across both single-domain and interdisciplinary contexts, but also enhances the efficiency of predictions.
arXiv Detail & Related papers (2025-04-19T03:01:45Z)
- Training of Neural Networks with Uncertain Data: A Mixture of Experts Approach [0.0]
"Uncertainty-aware Mixture of Experts" (uMoE) is a novel solution aimed at addressing aleatoric uncertainty within Neural Network (NN) based predictive models.
Our findings demonstrate the superior performance of uMoE over baseline methods in effectively managing data uncertainty.
This innovative approach boasts broad applicability across diverse data-driven domains, including but not limited to biomedical signal processing, autonomous driving, and production quality control.
arXiv Detail & Related papers (2023-12-13T11:57:15Z)
- Quantification of Predictive Uncertainty via Inference-Time Sampling [57.749601811982096]
We propose a post-hoc sampling strategy for estimating predictive uncertainty accounting for data ambiguity.
The method can generate different plausible outputs for a given input and does not assume parametric forms of predictive distributions.
arXiv Detail & Related papers (2023-08-03T12:43:21Z)
- Uncertainty Quantification for Traffic Forecasting: A Unified Approach [21.556559649467328]
Uncertainty is an essential consideration for time series forecasting tasks.
In this work, we focus on quantifying the uncertainty of traffic forecasting.
We develop Deep Spatio-Temporal Uncertainty Quantification (DeepSTUQ), which can estimate both aleatoric and relational uncertainty.
arXiv Detail & Related papers (2022-08-11T15:21:53Z)
- Collaborative Uncertainty Benefits Multi-Agent Multi-Modal Trajectory Forecasting [61.02295959343446]
This work first proposes a novel concept, collaborative uncertainty (CU), which models the uncertainty resulting from interaction modules. We build a general CU-aware regression framework with an original permutation-equivariant uncertainty estimator that performs both regression and uncertainty estimation. We apply the proposed framework to current SOTA multi-agent trajectory forecasting systems as a plugin module.
arXiv Detail & Related papers (2022-07-11T21:17:41Z)
- Uncertainty Estimation for Heatmap-based Landmark Localization [4.673063715963989]
We propose Quantile Binning, a data-driven method to categorise predictions by uncertainty with estimated error bounds.
We demonstrate this framework by comparing and contrasting three uncertainty measures.
We conclude by illustrating how filtering out gross mispredictions caught in our Quantile Bins significantly improves the proportion of predictions under an acceptable error threshold.
arXiv Detail & Related papers (2022-03-04T14:40:44Z)
- Gradient-Based Quantification of Epistemic Uncertainty for Deep Object Detectors [8.029049649310213]
We introduce novel gradient-based uncertainty metrics and investigate them for different object detection architectures.
Experiments show significant improvements in true positive / false positive discrimination and prediction of intersection over union.
We also find improvement over Monte-Carlo dropout uncertainty metrics and further significant boosts by aggregating different sources of uncertainty metrics.
arXiv Detail & Related papers (2021-07-09T16:04:11Z)
- Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions [121.10450359856242]
We develop a frequentist procedure that utilizes influence functions of a model's loss functional to construct a jackknife (or leave-one-out) estimator of predictive confidence intervals.
The resulting Discriminative Jackknife (DJ) is applicable to a wide range of deep learning models, is easy to implement, and can be applied in a post-hoc fashion without interfering with model training or compromising its accuracy.
arXiv Detail & Related papers (2020-06-29T13:36:52Z)
- Efficient Ensemble Model Generation for Uncertainty Estimation with Bayesian Approximation in Segmentation [74.06904875527556]
We propose a generic and efficient segmentation framework to construct ensemble segmentation models.
In the proposed method, ensemble models can be efficiently generated by using the layer selection method.
We also devise a new pixel-wise uncertainty loss, which improves the predictive performance.
arXiv Detail & Related papers (2020-05-21T16:08:38Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass. We scale training with a novel loss function and centroid-updating scheme, matching the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.