IKD+: Reliable Low Complexity Deep Models For Retinopathy Classification
- URL: http://arxiv.org/abs/2303.02310v1
- Date: Sat, 4 Mar 2023 03:59:06 GMT
- Title: IKD+: Reliable Low Complexity Deep Models For Retinopathy Classification
- Authors: Shreyas Bhat Brahmavar, Rohit Rajesh, Tirtharaj Dash, Lovekesh Vig,
Tanmay Tulsidas Verlekar, Md Mahmudul Hasan, Tariq Khan, Erik Meijering,
Ashwin Srinivasan
- Abstract summary: Deep neural network (DNN) models for retinopathy have estimated predictive accuracies in the mid-to-high 90% range.
State-of-the-art models are complex and require substantial computational infrastructure to train and deploy.
We propose a form of iterative knowledge distillation (IKD), called IKD+, that incorporates a tradeoff between size, accuracy, and reliability.
- Score: 15.543363807730096
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural network (DNN) models for retinopathy have estimated predictive
accuracies in the mid-to-high 90% range. However, the following aspects remain
unaddressed: State-of-the-art models are complex and require substantial
computational infrastructure to train and deploy; the reliability of
predictions can vary widely. In this paper, we focus on these aspects and
propose a form of iterative knowledge distillation (IKD), called IKD+, that
incorporates a tradeoff between size, accuracy, and reliability. We investigate
the functioning of IKD+ using two widely used techniques for estimating model
calibration (Platt scaling and temperature scaling), starting from the
best-performing model available, an ensemble of EfficientNets with
approximately 100M parameters. We demonstrate that IKD+ equipped with
temperature scaling yields models with up to an approximately 500-fold
reduction in the number of parameters relative to the original ensemble,
without a significant loss in accuracy.
In addition, calibration scores (reliability) for the IKD+ models are as good
as or better than those of the base model.
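The paper's own code is not reproduced here; as a rough illustration of the temperature-scaling step that IKD+ evaluates calibration with, the following is a minimal post-hoc calibration sketch (the LBFGS fitting choice and all names are assumptions, not the authors' implementation):

```python
import torch
import torch.nn as nn
import torch.optim as optim

def fit_temperature(logits, labels, max_iter=50):
    """Fit a single temperature T on held-out validation logits by
    minimizing NLL (standard temperature scaling). `logits` should be
    collected under torch.no_grad() so that only T receives gradients."""
    log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
    nll = nn.CrossEntropyLoss()
    opt = optim.LBFGS([log_t], lr=0.1, max_iter=max_iter)

    def closure():
        opt.zero_grad()
        loss = nll(logits / log_t.exp(), labels)
        loss.backward()
        return loss

    opt.step(closure)
    return log_t.exp().item()

# Usage on a validation split:
# T = fit_temperature(val_logits, val_labels)
# calibrated_probs = torch.softmax(test_logits / T, dim=1)
```

Because a single scalar T rescales all logits, accuracy is unchanged; only the confidence estimates move, which is why calibration can be tuned somewhat independently of the size/accuracy tradeoff.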
Related papers
- Robust Fine-tuning of Zero-shot Models via Variance Reduction [56.360865951192324]
When fine-tuning zero-shot models, our desideratum is for the fine-tuned model to excel in both in-distribution (ID) and out-of-distribution (OOD) settings.
We propose a sample-wise ensembling technique that can simultaneously attain the best ID and OOD accuracy without the trade-offs.
arXiv Detail & Related papers (2024-11-11T13:13:39Z)
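How the per-sample coefficients are derived (the variance-reduction step) is that paper's contribution and is not reproduced here; the generic sample-wise ensembling pattern it builds on looks roughly like this (all names are placeholders):

```python
import torch
import torch.nn.functional as F

def samplewise_ensemble(zeroshot_logits, finetuned_logits, beta):
    """Per-sample ensemble of a zero-shot and a fine-tuned model: `beta` is a
    (batch, 1) tensor giving each example its own mixing coefficient."""
    p_zs = F.softmax(zeroshot_logits, dim=1)
    p_ft = F.softmax(finetuned_logits, dim=1)
    return beta * p_ft + (1.0 - beta) * p_zs
```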
- ContactNet: Geometric-Based Deep Learning Model for Predicting Protein-Protein Interactions [2.874893537471256]
We develop a novel attention-based Graph Neural Network (GNN), ContactNet, for classifying PPI models into accurate and incorrect ones.
When trained on docked antigen and modeled antibody structures, ContactNet doubles the accuracy of current state-of-the-art scoring functions.
arXiv Detail & Related papers (2024-06-26T12:54:41Z)
- Fast Cell Library Characterization for Design Technology Co-Optimization Based on Graph Neural Networks [0.1752969190744922]
Design technology co-optimization (DTCO) plays a critical role in achieving optimal power, performance, and area.
We propose a graph neural network (GNN)-based machine learning model for rapid and accurate cell library characterization.
arXiv Detail & Related papers (2023-12-20T06:10:27Z)
- Comparative Analysis of Epileptic Seizure Prediction: Exploring Diverse Pre-Processing Techniques and Machine Learning Models [0.0]
We present a comparative analysis of five machine learning models for the prediction of epileptic seizures using EEG data.
Our analysis compares the models in terms of prediction accuracy.
The ET model exhibited the best performance with an accuracy of 99.29%.
arXiv Detail & Related papers (2023-08-06T08:50:08Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregularly timed observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
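CTRNN formulations vary; one common way to accommodate irregular observation times is to decay the hidden state over the gap between observations. A minimal sketch of that idea (not the paper's model):

```python
import torch
import torch.nn as nn

class DecayRNNCell(nn.Module):
    """GRU cell whose hidden state decays exponentially with the time
    elapsed since the last observation, a simple continuous-time recurrence."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        self.log_rate = nn.Parameter(torch.zeros(hidden_size))  # learned decay rates

    def forward(self, x, h, dt):
        # x: (batch, input_size); h: (batch, hidden_size); dt: (batch, 1) gap lengths
        h = h * torch.exp(-torch.exp(self.log_rate) * dt)  # decay over the gap
        return self.cell(x, h)
```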
- Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time [69.7693300927423]
We show that averaging the weights of multiple models fine-tuned with different hyperparameter configurations improves accuracy and robustness.
We show that the model soup approach extends to multiple image classification and natural language processing tasks.
arXiv Detail & Related papers (2022-03-10T17:03:49Z)
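A uniform "soup" is simple to sketch; the following assumes checkpoints that share an architecture and parameter names (a minimal illustration, not the paper's greedy-soup recipe):

```python
import torch

def uniform_soup(state_dicts):
    """Average parameters across checkpoints; all models must share an
    identical architecture (and typically a common pre-trained init)."""
    avg = {k: v.clone().float() for k, v in state_dicts[0].items()}
    for sd in state_dicts[1:]:
        for k in avg:
            avg[k] += sd[k].float()
    return {k: v / len(state_dicts) for k, v in avg.items()}

# model.load_state_dict(uniform_soup([torch.load(p) for p in checkpoint_paths]))
```

Because the result is a single set of weights, inference cost is that of one model, which is the point of the approach.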
- Enhanced physics-constrained deep neural networks for modeling vanadium redox flow battery [62.997667081978825]
We propose an enhanced version of the physics-constrained deep neural network (PCDNN) approach to provide high-accuracy voltage predictions.
The ePCDNN can accurately capture the voltage response throughout the charge-discharge cycle, including the tail region of the voltage discharge curve.
arXiv Detail & Related papers (2022-03-03T19:56:24Z)
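The battery physics and the ePCDNN architecture are beyond this summary; schematically, a physics-constrained objective adds a physics-residual penalty to the data loss (`physics_residual` is a placeholder, not the paper's governing equations):

```python
import torch

def physics_constrained_loss(model, x, v_measured, physics_residual, weight=1.0):
    """Generic PCDNN-style objective: fit the measured voltage while
    penalizing violations of a known physical relation."""
    v_pred = model(x)
    data_loss = torch.mean((v_pred - v_measured) ** 2)
    physics_loss = torch.mean(physics_residual(x, v_pred) ** 2)
    return data_loss + weight * physics_loss
```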
- Semi-supervised teacher-student deep neural network for materials discovery [6.333015476935593]
We propose a semi-supervised teacher-student deep neural network (TSDNN) model for high-performance formation energy and synthesizability prediction.
For formation energy based stability screening, our model achieves an absolute 10.3% accuracy improvement compared to the baseline CGCNN regression model.
For synthesizability prediction, our model significantly increases the baseline PU learning's true positive rate from 87.9% to 97.9% using 1/49 of the model parameters.
arXiv Detail & Related papers (2021-12-12T04:00:21Z)
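The TSDNN architecture itself is not reproduced here; a generic teacher-student pseudo-labeling step, which this semi-supervised setup resembles, might look like the following (all names are placeholders):

```python
import torch
import torch.nn.functional as F

def pseudo_label_step(teacher, student, unlabeled_x, optimizer, threshold=0.9):
    """One semi-supervised step: the teacher labels unlabeled inputs and the
    student trains only on predictions above a confidence threshold."""
    with torch.no_grad():
        probs = F.softmax(teacher(unlabeled_x), dim=1)
        confidence, pseudo_y = probs.max(dim=1)
        mask = confidence >= threshold  # keep only confident pseudo-labels
    if mask.any():
        optimizer.zero_grad()
        loss = F.cross_entropy(student(unlabeled_x[mask]), pseudo_y[mask])
        loss.backward()
        optimizer.step()
```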
- MixKD: Towards Efficient Distillation of Large-scale Language Models [129.73786264834894]
We propose MixKD, a data-agnostic distillation framework, to endow the resulting model with stronger generalization ability.
We prove from a theoretical perspective that under reasonable conditions MixKD gives rise to a smaller gap between the generalization error and the empirical error.
Experiments under a limited-data setting and ablation studies further demonstrate the advantages of the proposed approach.
arXiv Detail & Related papers (2020-11-01T18:47:51Z)
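Roughly, MixKD applies mixup-style interpolation to the distillation inputs; a schematic version for continuous inputs such as embeddings (an assumption, since the paper targets language models) might be:

```python
import torch
import torch.nn.functional as F

def mixkd_style_loss(teacher, student, x, alpha=0.4, temperature=2.0):
    """Distill on mixup-interpolated inputs: mix random pairs of examples,
    then match the student's distribution to the teacher's on the mixture."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1 - lam) * x[perm]  # mixup in input/embedding space
    with torch.no_grad():
        t_probs = F.softmax(teacher(x_mix) / temperature, dim=1)
    s_log_probs = F.log_softmax(student(x_mix) / temperature, dim=1)
    return F.kl_div(s_log_probs, t_probs, reduction="batchmean") * temperature ** 2
```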
- EMPIR: Ensembles of Mixed Precision Deep Networks for Increased Robustness against Adversarial Attacks [18.241639570479563]
Deep Neural Networks (DNNs) are vulnerable to adversarial attacks in which small input perturbations can produce catastrophic misclassifications.
We propose EMPIR, ensembles of quantized DNN models with different numerical precisions, as a new approach to increase robustness against adversarial attacks.
Our results indicate that EMPIR boosts the average adversarial accuracies by 42.6%, 15.2% and 10.5% for the DNN models trained on the MNIST, CIFAR-10 and ImageNet datasets respectively.
arXiv Detail & Related papers (2020-04-21T17:17:09Z)
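EMPIR's members and combination rule are more elaborate than shown here; a crude sketch of the underlying idea, weight-quantized copies at different bit-widths whose predictions are combined (plain averaging stands in for EMPIR's actual rule):

```python
import copy
import torch

def quantize_weights(model, bits):
    """Uniform symmetric weight quantization to a given bit-width; a crude
    stand-in for how EMPIR's ensemble members differ in precision."""
    q = copy.deepcopy(model)
    for p in q.parameters():
        scale = p.detach().abs().max().clamp(min=1e-8) / (2 ** (bits - 1) - 1)
        p.data = torch.round(p.data / scale) * scale
    return q

def ensemble_predict(members, x):
    """Combine member outputs; averaging here, not EMPIR's exact mechanism."""
    return torch.stack([torch.softmax(m(x), dim=1) for m in members]).mean(dim=0)

# members = [quantize_weights(base_model, b) for b in (2, 4)] + [base_model]
```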
- Calibrating Deep Neural Networks using Focal Loss [77.92765139898906]
Miscalibration is a mismatch between a model's confidence and its correctness.
We show that focal loss allows us to learn models that are already very well calibrated.
We show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases.
arXiv Detail & Related papers (2020-02-21T17:35:50Z)
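For reference, a standard multi-class focal loss is short; this is the common formulation (gamma is the focusing parameter), not code from the paper:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Multi-class focal loss: cross-entropy down-weighted by (1 - p_t)^gamma,
    so confidently correct examples contribute little to the gradient."""
    log_pt = F.log_softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    return ((1.0 - pt) ** gamma * -log_pt).mean()
```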
This list is automatically generated from the titles and abstracts of the papers on this site.