Measuring and Mitigating Local Instability in Deep Neural Networks
- URL: http://arxiv.org/abs/2305.10625v2
- Date: Fri, 19 May 2023 01:45:10 GMT
- Title: Measuring and Mitigating Local Instability in Deep Neural Networks
- Authors: Arghya Datta, Subhrangshu Nandi, Jingcheng Xu, Greg Ver Steeg, He Xie,
Anoop Kumar, Aram Galstyan
- Abstract summary: We study how the predictions of a model change, even when it is retrained on the same data, as a consequence of stochasticity in the training process.
For Natural Language Understanding (NLU) tasks, we find instability in predictions for a significant fraction of queries.
We propose new data-centric methods that exploit our local stability estimates.
- Score: 23.342675028217762
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Neural Networks (DNNs) are becoming integral components of real world
services relied upon by millions of users. Unfortunately, architects of these
systems can find it difficult to ensure reliable performance as irrelevant
details like random initialization can unexpectedly change the outputs of a
trained system with potentially disastrous consequences. We formulate the model
stability problem by studying how the predictions of a model change, even when
it is retrained on the same data, as a consequence of stochasticity in the
training process. For Natural Language Understanding (NLU) tasks, we find
instability in predictions for a significant fraction of queries. We formulate
principled metrics, like per-sample "label entropy" across training runs or
within a single training run, to quantify this phenomenon. Intriguingly, we
find that unstable predictions do not appear at random, but rather appear to be
clustered in data-specific ways. We study data-agnostic regularization methods
to improve stability and propose new data-centric methods that exploit our
local stability estimates. We find that our localized data-specific mitigation
strategy dramatically outperforms data-agnostic methods, and comes within 90%
of the gold standard, achieved by ensembling, at a fraction of the
computational cost.
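The per-sample label entropy metric lends itself to a short illustration. The sketch below is a minimal reconstruction of the idea rather than the authors' released code (all names are mine): it counts how often each label is predicted for a query across several retrainings and returns the entropy of that empirical distribution, so a value of zero means every run agrees.

```python
# Minimal sketch (not the authors' code) of per-sample "label entropy":
# how much the predicted label for each example varies across retrainings.
import numpy as np

def label_entropy(pred_labels: np.ndarray, num_classes: int) -> np.ndarray:
    """pred_labels: (num_runs, num_samples) array of hard labels.
    Returns per-sample entropy in nats; 0 means every run agrees."""
    num_runs, num_samples = pred_labels.shape
    entropy = np.zeros(num_samples)
    for i in range(num_samples):
        counts = np.bincount(pred_labels[:, i], minlength=num_classes)
        p = counts / num_runs              # empirical label distribution
        p = p[p > 0]                       # avoid log(0)
        entropy[i] = -(p * np.log(p)).sum()
    return entropy

# Example: 5 retrainings, 4 queries; only the last query flips between classes.
preds = np.array([[0, 1, 2, 0],
                  [0, 1, 2, 1],
                  [0, 1, 2, 0],
                  [0, 1, 2, 1],
                  [0, 1, 2, 2]])
print(label_entropy(preds, num_classes=3))  # nonzero entropy only for column 3
```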
Related papers
- Uncertainty Calibration with Energy Based Instance-wise Scaling in the Wild Dataset [23.155946032377052]
We introduce a novel instance-wise calibration method based on an energy model.
Our method incorporates energy scores instead of softmax confidence scores, allowing for adaptive consideration of uncertainty.
In experiments, we show that the proposed method consistently maintains robust performance across the spectrum.
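The energy score that replaces softmax confidence is a standard quantity, a temperature-scaled negative log-sum-exp of the logits; the sketch below illustrates only that generic substitution and does not reproduce the paper's instance-wise scaling method.

```python
# Hedged sketch: energy score vs. softmax confidence for one logit vector.
# Only the generic definition E(x) = -T * logsumexp(z / T) is shown here.
import numpy as np
from scipy.special import logsumexp, softmax

def softmax_confidence(logits: np.ndarray) -> float:
    return float(softmax(logits).max())

def energy_score(logits: np.ndarray, temperature: float = 1.0) -> float:
    return float(-temperature * logsumexp(logits / temperature))

logits = np.array([2.0, 0.5, -1.0])
print(softmax_confidence(logits))  # ~0.79
print(energy_score(logits))        # lower (more negative) energy = more confident
```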
arXiv Detail & Related papers (2024-07-17T06:14:55Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
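Mixup is a standard augmentation (convex combinations of example pairs and their labels); the sketch below applies plain mixup to a single client's local batch as one plausible reading of "local mixup", and does not reproduce the DRFLM objective or its distributionally robust aggregation.

```python
# Hedged sketch of mixup on one client's local batch (an illustrative reading
# of "local mixup"; the DRFLM training procedure itself is not reproduced).
import numpy as np

def local_mixup(x: np.ndarray, y_onehot: np.ndarray, alpha: float = 0.2, rng=None):
    """x: (batch, features), y_onehot: (batch, classes). Returns a mixed batch."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient
    perm = rng.permutation(len(x))        # pair each example with another
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mix, y_mix

x = np.random.randn(8, 4)
y = np.eye(3)[np.random.randint(0, 3, size=8)]
x_mix, y_mix = local_mixup(x, y, alpha=0.4)
```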
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
- Resilient Neural Forecasting Systems [10.709321760368137]
Industrial machine learning systems face data challenges that are often under-explored in the academic literature.
In this paper, we discuss data challenges and solutions in the context of a Neural Forecasting application on labor planning.
We address changes in data distribution with a periodic retraining scheme and discuss the critical importance of model stability in this setting.
arXiv Detail & Related papers (2022-03-16T09:37:49Z)
- Distributionally Robust Semi-Supervised Learning Over Graphs [68.29280230284712]
Semi-supervised learning (SSL) over graph-structured data emerges in many network science applications.
To efficiently manage learning over graphs, variants of graph neural networks (GNNs) have been developed recently.
Despite their success in practice, most of existing methods are unable to handle graphs with uncertain nodal attributes.
Challenges also arise due to distributional uncertainties associated with data acquired by noisy measurements.
A distributionally robust learning framework is developed, where the objective is to train models that exhibit quantifiable robustness against perturbations.
arXiv Detail & Related papers (2021-10-20T14:23:54Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
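One simple way to picture raising a prediction's entropy towards the label prior is to interpolate the predicted distribution with the prior for inputs flagged as overconfident. The sketch below shows only that interpolation; how "unjustifiably overconfident" regions are identified is the paper's contribution and is stubbed out here with a placeholder mask.

```python
# Hedged sketch: pull an overconfident predictive distribution toward the
# label prior. The flagging criterion is only a placeholder mask here.
import numpy as np

def temper_toward_prior(probs: np.ndarray, prior: np.ndarray,
                        overconfident_mask: np.ndarray, strength: float = 0.5):
    """probs: (n, classes) predictions, prior: (classes,) label frequencies.
    For flagged rows, mix the prediction with the prior, raising its entropy."""
    out = probs.copy()
    out[overconfident_mask] = ((1 - strength) * probs[overconfident_mask]
                               + strength * prior)
    return out

probs = np.array([[0.98, 0.01, 0.01], [0.4, 0.35, 0.25]])
prior = np.array([0.5, 0.3, 0.2])            # empirical label prior
mask = np.array([True, False])               # placeholder overconfidence flag
print(temper_toward_prior(probs, prior, mask))
```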
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training with a novel loss function and centroid updating scheme, and match the accuracy of softmax models.
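The single-forward-pass rejection idea can be pictured as scoring an input by its kernel distance to learned per-class centroids. The sketch below shows a generic RBF-distance scorer with a rejection threshold; the paper's loss function and centroid updating scheme are not reproduced, and all names here are illustrative.

```python
# Hedged sketch: single-pass uncertainty from RBF distances to class centroids.
# Only the scoring/rejection step is illustrated; training is omitted.
import numpy as np

def rbf_class_scores(features: np.ndarray, centroids: np.ndarray,
                     length_scale: float = 1.0) -> np.ndarray:
    """features: (n, d), centroids: (classes, d). Returns (n, classes) scores."""
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * length_scale ** 2))

def predict_or_reject(features, centroids, threshold=0.5):
    scores = rbf_class_scores(features, centroids)
    best = scores.max(axis=1)                 # closeness to nearest centroid
    labels = scores.argmax(axis=1)
    return np.where(best >= threshold, labels, -1)   # -1 = reject as OOD

centroids = np.array([[0.0, 0.0], [3.0, 3.0]])
feats = np.array([[0.1, -0.1], [10.0, 10.0]])   # second point is far from both
print(predict_or_reject(feats, centroids))      # [0, -1]
```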
arXiv Detail & Related papers (2020-03-04T12:27:36Z)