Benchmarking Bayesian Deep Learning on Diabetic Retinopathy Detection Tasks
- URL: http://arxiv.org/abs/2211.12717v1
- Date: Wed, 23 Nov 2022 05:44:42 GMT
- Title: Benchmarking Bayesian Deep Learning on Diabetic Retinopathy Detection Tasks
- Authors: Neil Band, Tim G. J. Rudner, Qixuan Feng, Angelos Filos, Zachary Nado,
Michael W. Dusenberry, Ghassen Jerfel, Dustin Tran, Yarin Gal
- Abstract summary: We propose a set of real-world tasks that accurately reflect the complexities of downstream real-world applications and are designed to assess the reliability of predictive models in safety-critical scenarios.
Specifically, we curate two publicly available datasets of high-resolution human retina images exhibiting varying degrees of diabetic retinopathy, a medical condition that can lead to blindness.
We use these tasks to benchmark well-established and state-of-the-art Bayesian deep learning methods on task-specific evaluation metrics.
- Score: 39.82245729848194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian deep learning seeks to equip deep neural networks with the ability
to precisely quantify their predictive uncertainty, and has promised to make
deep learning more reliable for safety-critical real-world applications. Yet,
existing Bayesian deep learning methods fall short of this promise; new methods
continue to be evaluated on unrealistic test beds that do not reflect the
complexities of downstream real-world tasks that would benefit most from
reliable uncertainty quantification. We propose the RETINA Benchmark, a set of
real-world tasks that accurately reflect such complexities and are designed to
assess the reliability of predictive models in safety-critical scenarios.
Specifically, we curate two publicly available datasets of high-resolution
human retina images exhibiting varying degrees of diabetic retinopathy, a
medical condition that can lead to blindness, and use them to design a suite of
automated diagnosis tasks that require reliable predictive uncertainty
quantification. We use these tasks to benchmark well-established and
state-of-the-art Bayesian deep learning methods on task-specific evaluation
metrics. We provide an easy-to-use codebase for fast and easy benchmarking
following reproducibility and software design principles. We provide
implementations of all methods included in the benchmark as well as results
computed over 100 TPU days, 20 GPU days, 400 hyperparameter configurations, and
evaluation on at least 6 random seeds each.
Related papers
- SURE: SUrvey REcipes for building reliable and robust deep networks [12.268921703825258]
In this paper, we revisit techniques for uncertainty estimation within deep neural networks and consolidate a suite of techniques to enhance their reliability.
We rigorously evaluate SURE against the benchmark of failure prediction, a critical testbed for uncertainty estimation efficacy.
When applied to real-world challenges, such as data corruption, label noise, and long-tailed class distribution, SURE exhibits remarkable robustness, delivering results that are superior or on par with current state-of-the-art specialized methods.
arXiv Detail & Related papers (2024-03-01T13:58:19Z)
- The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning [71.14237199051276]
We consider a classical distribution-agnostic framework and algorithms that minimise empirical risks.
We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks is extremely challenging.
arXiv Detail & Related papers (2023-09-13T16:33:27Z)
- U-PASS: an Uncertainty-guided deep learning Pipeline for Automated Sleep Staging [61.6346401960268]
We propose a machine learning pipeline called U-PASS tailored for clinical applications that incorporates uncertainty estimation at every stage of the process.
We apply our uncertainty-guided deep learning pipeline to the challenging problem of sleep staging and demonstrate that it systematically improves performance at every stage.
arXiv Detail & Related papers (2023-06-07T08:27:36Z)
- DUDES: Deep Uncertainty Distillation using Ensembles for Semantic Segmentation [11.099838952805325]
Quantifying the predictive uncertainty is a promising endeavour to open up the use of deep neural networks for such applications.
We present a novel approach for efficient and reliable uncertainty estimation, which we call Deep Uncertainty Distillation using Ensembles (DUDES).
DUDES applies student-teacher distillation with a Deep Ensemble to accurately approximate predictive uncertainties with a single forward pass.
arXiv Detail & Related papers (2023-03-17T08:56:27Z)
- BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks [50.15201777970128]
We propose BayesCap that learns a Bayesian identity mapping for the frozen model, allowing uncertainty estimation.
BayesCap is a memory-efficient method that can be trained on a small fraction of the original dataset.
We show the efficacy of our method on a wide variety of tasks with a diverse set of architectures.
arXiv Detail & Related papers (2022-07-14T12:50:09Z)
- A high performance fingerprint liveness detection method based on quality related features [66.41574316136379]
The system is tested on a highly challenging database comprising over 10,500 real and fake images.
The proposed solution proves to be robust to the multi-scenario dataset, and presents an overall rate of 90% correctly classified samples.
arXiv Detail & Related papers (2021-11-02T21:09:39Z)
- On the Practicality of Deterministic Epistemic Uncertainty [106.06571981780591]
Deterministic uncertainty methods (DUMs) achieve strong performance on detecting out-of-distribution data.
It remains unclear whether DUMs are well calibrated and can seamlessly scale to real-world applications.
arXiv Detail & Related papers (2021-07-01T17:59:07Z)
- Multi-Loss Sub-Ensembles for Accurate Classification with Uncertainty Estimation [1.2891210250935146]
We propose an efficient method for uncertainty estimation in deep neural networks (DNNs) achieving high accuracy.
We keep our inference time relatively low by leveraging the advantage proposed by the Deep-Sub-Ensembles method.
Our results show improved accuracy on the classification task and competitive results on several uncertainty measures.
arXiv Detail & Related papers (2020-10-05T10:59:11Z)
- Data-Driven Assessment of Deep Neural Networks with Random Input Uncertainty [14.191310794366075]
We develop a data-driven optimization-based method capable of simultaneously certifying the safety of network outputs and localizing them.
We experimentally demonstrate the efficacy and tractability of the method on a deep ReLU network.
arXiv Detail & Related papers (2020-10-02T19:13:35Z)