Robust Nonparametric Hypothesis Testing to Understand Variability in
Training Neural Networks
- URL: http://arxiv.org/abs/2310.00541v1
- Date: Sun, 1 Oct 2023 01:44:35 GMT
- Title: Robust Nonparametric Hypothesis Testing to Understand Variability in
Training Neural Networks
- Authors: Sinjini Banerjee, Reilly Cannon, Tim Marrinan, Tony Chiang, Anand D.
Sarwate
- Abstract summary: We propose a new measure of closeness between classification models based on the output of the network before thresholding.
Our measure is based on a robust hypothesis-testing framework and can be adapted to other quantities derived from trained models.
- Score: 5.8490454659691355
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training a deep neural network (DNN) often involves stochastic optimization,
which means each run will produce a different model. Several works suggest this
variability is negligible when models have the same performance, which in the
case of classification is test accuracy. However, models with similar test
accuracy may not be computing the same function. We propose a new measure of
closeness between classification models based on the output of the network
before thresholding. Our measure is based on a robust hypothesis-testing
framework and can be adapted to other quantities derived from trained models.
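The abstract does not spell out the exact test statistic, so the following is only an illustration under assumptions (not the authors' procedure): compare the pre-threshold outputs (logits) of two independently trained models on a shared evaluation set with a nonparametric sign-flip permutation test on the per-example gap.

```python
import numpy as np

def logit_gap_permutation_test(logits_a, logits_b, n_perm=10_000, seed=0):
    """Two-sided sign-flip permutation test for H0: the two training runs
    produce exchangeable pre-threshold outputs on the same examples."""
    rng = np.random.default_rng(seed)
    gaps = np.asarray(logits_a) - np.asarray(logits_b)
    observed = np.abs(gaps.mean())
    # Under H0 the sign of each per-example gap is exchangeable,
    # so randomly flipping signs builds the null distribution.
    signs = rng.choice([-1.0, 1.0], size=(n_perm, gaps.size))
    null_stats = np.abs((signs * gaps).mean(axis=1))
    p_value = (1 + np.sum(null_stats >= observed)) / (n_perm + 1)
    return observed, p_value

# Toy usage with synthetic "logits" from two runs of the same architecture.
rng = np.random.default_rng(1)
run_a = rng.normal(size=500)
run_b = run_a + rng.normal(0.0, 0.3, size=500)   # nearly the same function
stat, p = logit_gap_permutation_test(run_a, run_b)
print(f"statistic={stat:.4f}, p-value={p:.4f}")
```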
Related papers
- Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network).
After training this network on a small base model using demonstrations, this network can be seamlessly integrated with other pre-trained models during inference.
We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
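Read literally, this amounts to composing a frozen pre-trained model with a small network that predicts a logit-level correction. The sketch below is only a shape-level illustration of that composition; the value network's architecture, inputs, and training are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ValueNetwork(nn.Module):
    """Hypothetical delta-logit model; the architecture is an assumption."""
    def __init__(self, hidden_dim: int, vocab_size: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab_size, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, vocab_size),
        )

    def forward(self, base_logits: torch.Tensor) -> torch.Tensor:
        return self.net(base_logits)          # predicted change in logits

@torch.no_grad()
def combined_logits(base_model, value_net, inputs):
    """Compose an arbitrary frozen pre-trained model with the value network."""
    base = base_model(inputs)                 # logits from the frozen model
    return base + value_net(base)             # post-training adjustment

# Toy usage with stand-in models (shapes only, for illustration).
vocab = 100
base_model = nn.Linear(32, vocab)             # stands in for a pre-trained head
value_net = ValueNetwork(hidden_dim=64, vocab_size=vocab)
x = torch.randn(4, 32)
print(combined_logits(base_model, value_net, x).shape)   # torch.Size([4, 100])
```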
arXiv Detail & Related papers (2024-10-28T13:48:43Z)
- Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important for forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
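As a toy picture of "multiple hypotheses from radial basis features" (not the paper's structured construction), the snippet below maps inputs through Gaussian RBF features and several linear heads, one hypothesis per head.

```python
import numpy as np

def rbf_features(x, centers, gamma=1.0):
    """Gaussian RBF features phi_j(x) = exp(-gamma * ||x - c_j||^2)."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def multiple_hypotheses(x, centers, weights, gamma=1.0):
    """Return K hypotheses per input, shape (n, K)."""
    phi = rbf_features(x, centers, gamma)          # (n, m) basis activations
    return phi @ weights                           # weights: (m, K)

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(8, 2))
centers = rng.uniform(-1, 1, size=(16, 2))
weights = rng.normal(size=(16, 3))                 # K = 3 hypotheses
print(multiple_hypotheses(x, centers, weights).shape)   # (8, 3)
```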
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
- Demystifying Randomly Initialized Networks for Evaluating Generative Models [28.8899914083501]
Evaluation of generative models is mostly based on the comparison between the estimated distribution and the ground truth distribution in a certain feature space.
To embed samples into informative features, previous works often use convolutional neural networks optimized for classification.
In this paper, we rigorously investigate the feature space of models with random weights in comparison to that of trained models.
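In that spirit, the sketch below embeds two sample sets with a CNN whose weights are left at their random initialization and compares the resulting feature distributions with a Fréchet distance; the architecture and metric choice here are illustrative assumptions, not the paper's setup.

```python
import numpy as np
import torch
import torch.nn as nn
from scipy.linalg import sqrtm

torch.manual_seed(0)
random_cnn = nn.Sequential(                      # random weights, never trained
    nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)

@torch.no_grad()
def embed(images):
    return random_cnn(images).numpy()

def frechet_distance(feat_a, feat_b):
    mu_a, mu_b = feat_a.mean(0), feat_b.mean(0)
    cov_a = np.cov(feat_a, rowvar=False)
    cov_b = np.cov(feat_b, rowvar=False)
    cov_mean = sqrtm(cov_a @ cov_b).real
    return float(((mu_a - mu_b) ** 2).sum() + np.trace(cov_a + cov_b - 2 * cov_mean))

real = torch.rand(64, 3, 32, 32)                 # stand-ins for real images
fake = torch.rand(64, 3, 32, 32)                 # stand-ins for generated images
print(frechet_distance(embed(real), embed(fake)))
```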
arXiv Detail & Related papers (2022-08-19T08:43:53Z)
- How to Combine Variational Bayesian Networks in Federated Learning [0.0]
Federated learning enables multiple data centers to train a central model collaboratively without exposing any confidential data.
While deterministic models can achieve high prediction accuracy, their lack of calibration and inability to quantify uncertainty are problematic for safety-critical applications.
We study the effects of various aggregation schemes for variational Bayesian neural networks.
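One concrete aggregation scheme that fits this description (chosen here for illustration, not necessarily the one the paper recommends) combines each client's mean-field Gaussian posterior as a product of Gaussians, i.e., precision-weighted averaging per weight:

```python
import numpy as np

def product_of_gaussians(means, variances):
    """Combine per-client Gaussians N(mu_i, sigma_i^2) for each weight.

    means, variances: arrays of shape (n_clients, n_weights).
    Returns the aggregated mean and variance, each of shape (n_weights,).
    """
    precisions = 1.0 / variances
    agg_var = 1.0 / precisions.sum(axis=0)
    agg_mean = agg_var * (precisions * means).sum(axis=0)
    return agg_mean, agg_var

# Three clients, five weights (toy numbers).
rng = np.random.default_rng(0)
mu = rng.normal(size=(3, 5))
var = rng.uniform(0.1, 1.0, size=(3, 5))
agg_mu, agg_var = product_of_gaussians(mu, var)
print(agg_mu.round(3), agg_var.round(3))
```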
arXiv Detail & Related papers (2022-06-22T07:53:12Z)
- On the Prediction Instability of Graph Neural Networks [2.3605348648054463]
Instability of trained models can affect reliability and trust in machine learning systems.
We systematically assess the prediction instability of node classification with state-of-the-art Graph Neural Networks (GNNs).
We find that up to one third of the incorrectly classified nodes differ across algorithm runs.
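A minimal way to quantify this kind of instability (a generic sketch, not the paper's exact protocol) is the disagreement rate between the predicted labels of two runs, optionally restricted to the nodes that at least one run gets wrong:

```python
import numpy as np

def disagreement_rate(preds_a, preds_b):
    """Fraction of nodes whose predicted class differs between two runs."""
    preds_a, preds_b = np.asarray(preds_a), np.asarray(preds_b)
    return float((preds_a != preds_b).mean())

def unstable_among_errors(preds_a, preds_b, labels):
    """Disagreement restricted to nodes that at least one run misclassifies."""
    preds_a, preds_b, labels = map(np.asarray, (preds_a, preds_b, labels))
    wrong = (preds_a != labels) | (preds_b != labels)
    return disagreement_rate(preds_a[wrong], preds_b[wrong])

# Toy usage: two synthetic runs that are each ~80% accurate.
rng = np.random.default_rng(0)
labels = rng.integers(0, 5, size=1000)
run_a = np.where(rng.random(1000) < 0.8, labels, rng.integers(0, 5, size=1000))
run_b = np.where(rng.random(1000) < 0.8, labels, rng.integers(0, 5, size=1000))
print(disagreement_rate(run_a, run_b), unstable_among_errors(run_a, run_b, labels))
```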
arXiv Detail & Related papers (2022-05-20T10:32:59Z)
- Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
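For context only, plain split conformal regression is sketched below; the paper's contribution is handling the feedback-induced shift between training and test data, which this basic recipe does not address.

```python
import numpy as np

def split_conformal_interval(residuals_cal, y_hat_test, alpha=0.1):
    """Held-out calibration residuals give a finite-sample quantile that
    widens a point prediction into a (1 - alpha) prediction interval."""
    n = len(residuals_cal)
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    q = np.quantile(np.abs(residuals_cal), min(q_level, 1.0), method="higher")
    return y_hat_test - q, y_hat_test + q

# Toy usage: residuals y_cal - model(x_cal) on a calibration split.
rng = np.random.default_rng(0)
cal_residuals = rng.normal(0, 1, size=200)
lo, hi = split_conformal_interval(cal_residuals, y_hat_test=3.2, alpha=0.1)
print(f"90% prediction interval: [{lo:.2f}, {hi:.2f}]")
```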
arXiv Detail & Related papers (2022-02-08T02:59:12Z)
- MEMO: Test Time Robustness via Adaptation and Augmentation [131.28104376280197]
We study the problem of test time robustification, i.e., using the test input to improve model robustness.
Recent prior works have proposed methods for test-time adaptation; however, each introduces additional assumptions.
We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable.
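A rough sketch of this style of test-time adaptation (the augmentation, optimizer, and step size below are all assumptions): average the model's predictions over augmented copies of a single test input and take one gradient step that makes that marginal prediction more confident.

```python
import torch
import torch.nn.functional as F

def adapt_on_single_input(model, x, augment, n_aug=8, lr=1e-3):
    """Minimize the entropy of the prediction averaged over augmentations."""
    model.train()
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    aug_batch = torch.stack([augment(x) for _ in range(n_aug)])  # (n_aug, ...)
    probs = F.softmax(model(aug_batch), dim=-1)
    marginal = probs.mean(dim=0)                     # averaged prediction
    entropy = -(marginal * torch.log(marginal + 1e-12)).sum()
    opt.zero_grad()
    entropy.backward()
    opt.step()
    model.eval()
    with torch.no_grad():
        return model(x.unsqueeze(0)).argmax(dim=-1)  # adapted prediction

# Toy usage: a linear "model" and an additive-noise "augmentation".
model = torch.nn.Linear(16, 10)
x = torch.randn(16)
print(adapt_on_single_input(model, x, augment=lambda t: t + 0.1 * torch.randn_like(t)))
```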
arXiv Detail & Related papers (2021-10-18T17:55:11Z)
- Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
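A heavily hedged sketch, not the paper's derivation: one simple reading of "relating unlabeled shifted inputs to model parameters" is to trade off predictive confidence on the shifted batch against staying close to the parameters favored by training (a crude Gaussian-anchor proxy for the training posterior).

```python
import torch
import torch.nn.functional as F

def bayesian_style_adaptation(model, unlabeled_x, steps=10, lr=1e-3, anchor_weight=1.0):
    """Adapt on an unlabeled shifted batch: entropy term + parameter anchor."""
    anchors = [p.detach().clone() for p in model.parameters()]
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        probs = F.softmax(model(unlabeled_x), dim=-1)
        entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=-1).mean()
        anchor = sum(((p - a) ** 2).sum() for p, a in zip(model.parameters(), anchors))
        loss = entropy + anchor_weight * anchor
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

# Toy usage: a linear classifier adapted on a shifted, unlabeled batch.
model = torch.nn.Linear(8, 3)
shifted_batch = torch.randn(32, 8) + 2.0
bayesian_style_adaptation(model, shifted_batch)
```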
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model with a previously proposed model based on an ensemble of simpler neural networks that detect firearms via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- Incremental Unsupervised Domain-Adversarial Training of Neural Networks [17.91571291302582]
In the context of supervised statistical learning, it is typically assumed that the training set comes from the same distribution from which the test samples are drawn.
Here we take a different avenue and approach the problem from an incremental point of view, where the model is adapted to the new domain iteratively.
Our results report a clear improvement with respect to the non-incremental case in several datasets, also outperforming other state-of-the-art domain adaptation algorithms.
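The incremental loop can be pictured as repeatedly pseudo-labeling the most confident target samples and re-fitting; the sketch below shows only that loop structure (the domain-adversarial update itself is abstracted away, and the scikit-learn-style interface is an assumption).

```python
import numpy as np

def incremental_adaptation(model, source_X, source_y, target_X, rounds=5, frac=0.2):
    """model must expose fit(X, y) and predict_proba(X) (scikit-learn style)."""
    X, y = np.asarray(source_X), np.asarray(source_y)
    target_X = np.asarray(target_X)
    pool = np.arange(len(target_X))                   # unlabeled target indices
    for _ in range(rounds):
        if pool.size == 0:
            break
        proba = model.fit(X, y).predict_proba(target_X[pool])
        order = np.argsort(-proba.max(axis=1))        # most confident first
        take = order[:max(1, int(frac * pool.size))]
        X = np.vstack([X, target_X[pool[take]]])      # grow the training pool
        y = np.concatenate([y, model.classes_[proba[take].argmax(axis=1)]])
        pool = np.delete(pool, take)
    return model.fit(X, y)

# e.g. incremental_adaptation(LogisticRegression(max_iter=500), Xs, ys, Xt)
```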
arXiv Detail & Related papers (2020-01-13T09:54:35Z)