Measuring and Mitigating Local Instability in Deep Neural Networks
- URL: http://arxiv.org/abs/2305.10625v2
- Date: Fri, 19 May 2023 01:45:10 GMT
- Title: Measuring and Mitigating Local Instability in Deep Neural Networks
- Authors: Arghya Datta, Subhrangshu Nandi, Jingcheng Xu, Greg Ver Steeg, He Xie,
Anoop Kumar, Aram Galstyan
- Abstract summary: We study how the predictions of a model change, even when it is retrained on the same data, as a consequence of stochasticity in the training process.
For Natural Language Understanding (NLU) tasks, we find instability in predictions for a significant fraction of queries.
We propose new data-centric methods that exploit our local stability estimates.
- Score: 23.342675028217762
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Neural Networks (DNNs) are becoming integral components of real world
services relied upon by millions of users. Unfortunately, architects of these
systems can find it difficult to ensure reliable performance as irrelevant
details like random initialization can unexpectedly change the outputs of a
trained system with potentially disastrous consequences. We formulate the model
stability problem by studying how the predictions of a model change, even when
it is retrained on the same data, as a consequence of stochasticity in the
training process. For Natural Language Understanding (NLU) tasks, we find
instability in predictions for a significant fraction of queries. We formulate
principled metrics, like per-sample "label entropy" across training runs or
within a single training run, to quantify this phenomenon. Intriguingly, we
find that unstable predictions do not appear at random, but rather appear to be
clustered in data-specific ways. We study data-agnostic regularization methods
to improve stability and propose new data-centric methods that exploit our
local stability estimates. We find that our localized data-specific mitigation
strategy dramatically outperforms data-agnostic methods, and comes within 90%
of the gold standard, achieved by ensembling, at a fraction of the
computational cost.
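The per-sample label entropy metric lends itself to a short illustration. The sketch below is a minimal reconstruction of the idea rather than the authors' released code (all names are mine): it counts how often each label is predicted for a query across several retrainings and returns the entropy of that empirical distribution, so a value of zero means every run agrees.

```python
# Minimal sketch (not the authors' code) of per-sample "label entropy":
# how much the predicted label for each example varies across retrainings.
import numpy as np

def label_entropy(pred_labels: np.ndarray, num_classes: int) -> np.ndarray:
    """pred_labels: (num_runs, num_samples) array of hard labels.
    Returns per-sample entropy in nats; 0 means every run agrees."""
    num_runs, num_samples = pred_labels.shape
    entropy = np.zeros(num_samples)
    for i in range(num_samples):
        counts = np.bincount(pred_labels[:, i], minlength=num_classes)
        p = counts / num_runs              # empirical label distribution
        p = p[p > 0]                       # avoid log(0)
        entropy[i] = -(p * np.log(p)).sum()
    return entropy

# Example: 5 retrainings, 4 queries; only the last query flips between classes.
preds = np.array([[0, 1, 2, 0],
                  [0, 1, 2, 1],
                  [0, 1, 2, 0],
                  [0, 1, 2, 1],
                  [0, 1, 2, 2]])
print(label_entropy(preds, num_classes=3))  # nonzero entropy only for column 3
```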
Related papers
- Uncertainty Calibration with Energy Based Instance-wise Scaling in the Wild Dataset [23.155946032377052]
We introduce a novel instance-wise calibration method based on an energy model.
Our method incorporates energy scores instead of softmax confidence scores, allowing for adaptive consideration of uncertainty.
In experiments, we show that the proposed method consistently maintains robust performance across the spectrum.
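The energy score that replaces softmax confidence is a standard quantity, a temperature-scaled negative log-sum-exp of the logits; the sketch below illustrates only that generic substitution and does not reproduce the paper's instance-wise scaling method.

```python
# Hedged sketch: energy score vs. softmax confidence for one logit vector.
# Only the generic definition E(x) = -T * logsumexp(z / T) is shown here.
import numpy as np
from scipy.special import logsumexp, softmax

def softmax_confidence(logits: np.ndarray) -> float:
    return float(softmax(logits).max())

def energy_score(logits: np.ndarray, temperature: float = 1.0) -> float:
    return float(-temperature * logsumexp(logits / temperature))

logits = np.array([2.0, 0.5, -1.0])
print(softmax_confidence(logits))  # ~0.79
print(energy_score(logits))        # lower (more negative) energy = more confident
```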
arXiv Detail & Related papers (2024-07-17T06:14:55Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
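Mixup is a standard augmentation (convex combinations of example pairs and their labels); the sketch below applies plain mixup to a single client's local batch as one plausible reading of "local mixup", and does not reproduce the DRFLM objective or its distributionally robust aggregation.

```python
# Hedged sketch of mixup on one client's local batch (an illustrative reading
# of "local mixup"; the DRFLM training procedure itself is not reproduced).
import numpy as np

def local_mixup(x: np.ndarray, y_onehot: np.ndarray, alpha: float = 0.2, rng=None):
    """x: (batch, features), y_onehot: (batch, classes). Returns a mixed batch."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient
    perm = rng.permutation(len(x))        # pair each example with another
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix

x = np.random.randn(8, 4)
y = np.eye(3)[np.random.randint(0, 3, size=8)]
x_mix, y_mix = local_mixup(x, y, alpha=0.4)
```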
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
- Resilient Neural Forecasting Systems [10.709321760368137]
Industrial machine learning systems face data challenges that are often under-explored in the academic literature.
In this paper, we discuss data challenges and solutions in the context of a Neural Forecasting application on labor planning.
We address changes in data distribution with a periodic retraining scheme and discuss the critical importance of model stability in this setting.
arXiv Detail & Related papers (2022-03-16T09:37:49Z)
- Distributionally Robust Semi-Supervised Learning Over Graphs [68.29280230284712]
Semi-supervised learning (SSL) over graph-structured data emerges in many network science applications.
To efficiently manage learning over graphs, variants of graph neural networks (GNNs) have been developed recently.
Despite their success in practice, most of existing methods are unable to handle graphs with uncertain nodal attributes.
Challenges also arise due to distributional uncertainties associated with data acquired by noisy measurements.
A distributionally robust learning framework is developed, where the objective is to train models that exhibit quantifiable robustness against perturbations.
arXiv Detail & Related papers (2021-10-20T14:23:54Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
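One simple way to picture raising a prediction's entropy towards the label prior is to interpolate the predicted distribution with the prior for inputs flagged as overconfident. The sketch below shows only that interpolation; how "unjustifiably overconfident" regions are identified is the paper's contribution and is stubbed out here with a placeholder mask.

```python
# Hedged sketch: pull an overconfident predictive distribution toward the
# label prior. The flagging criterion is only a placeholder mask here.
import numpy as np

def temper_toward_prior(probs: np.ndarray, prior: np.ndarray,
                        overconfident_mask: np.ndarray, strength: float = 0.5):
    """probs: (n, classes) predictions, prior: (classes,) label frequencies.
    For flagged rows, mix the prediction with the prior, raising its entropy."""
    out = probs.copy()
    out[overconfident_mask] = ((1 - strength) * probs[overconfident_mask]
                               + strength * prior)
    return out

probs = np.array([[0.98, 0.01, 0.01], [0.4, 0.35, 0.25]])
prior = np.array([0.5, 0.3, 0.2])            # empirical label prior
mask = np.array([True, False])               # placeholder overconfidence flag
print(temper_toward_prior(probs, prior, mask))
```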
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training with a novel loss function and centroid updating scheme, and match the accuracy of softmax models.
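The single-forward-pass rejection idea can be pictured as scoring an input by its kernel distance to learned per-class centroids. The sketch below shows a generic RBF-distance scorer with a rejection threshold; the paper's loss function and centroid updating scheme are not reproduced, and all names here are illustrative.

```python
# Hedged sketch: single-pass uncertainty from RBF distances to class centroids.
# Only the scoring/rejection step is illustrated; training is omitted.
import numpy as np

def rbf_class_scores(features: np.ndarray, centroids: np.ndarray,
                     length_scale: float = 1.0) -> np.ndarray:
    """features: (n, d), centroids: (classes, d). Returns (n, classes) scores."""
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * length_scale ** 2))

def predict_or_reject(features, centroids, threshold=0.5):
    scores = rbf_class_scores(features, centroids)
    best = scores.max(axis=1)                 # closeness to nearest centroid
    labels = scores.argmax(axis=1)
    return np.where(best >= threshold, labels, -1)   # -1 = reject as OOD

centroids = np.array([[0.0, 0.0], [3.0, 3.0]])
feats = np.array([[0.1, -0.1], [10.0, 10.0]])   # second point is far from both
print(predict_or_reject(feats, centroids))      # [0, -1]
```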
arXiv Detail & Related papers (2020-03-04T12:27:36Z)