Quantifying Overfitting: Evaluating Neural Network Performance through
Analysis of Null Space
- URL: http://arxiv.org/abs/2305.19424v1
- Date: Tue, 30 May 2023 21:31:24 GMT
- Title: Quantifying Overfitting: Evaluating Neural Network Performance through
Analysis of Null Space
- Authors: Hossein Rezaei, Mohammad Sabokrou
- Abstract summary: We analyze the null space of the last layer of neural networks to quantify overfitting without access to the training data or knowledge of the model's training accuracy.
Our work is the first attempt to quantify overfitting without access to the training data or any knowledge of the training samples.
- Score: 10.698553177585973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models that are overfitted or overtrained are more vulnerable
to knowledge leakage, which poses a risk to privacy. Suppose we download or
receive a model from a third-party collaborator without knowing its training
accuracy. How can we determine whether it has been overfitted or overtrained on its
training data? It is possible that the model was intentionally overtrained to
make it vulnerable during testing. While an overfitted or overtrained model may
perform well on testing data and even on some generalization tests, we cannot be
sure that it is not overfitted, and conducting a comprehensive generalization test
is expensive. The goal of this paper is to address these issues and to assess
privacy and generalization using only testing data. To achieve this, we analyze
the null space of the last layer of neural networks, which enables us to quantify
overfitting without access to the training data or knowledge of the model's
accuracy on those data. We evaluated our approach on various architectures and
datasets and observed a distinct pattern in the null-space angle when models are
overfitted. Furthermore, we show that models with poor generalization exhibit
specific characteristics in this space. Our work represents the first attempt to
quantify overfitting without access to the training data or any knowledge of the
training samples.
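The abstract does not spell out how the null-space angle is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation: take the last-layer weight matrix W, compute an orthonormal basis of its null space, and measure the angle between each test sample's penultimate-layer feature vector and that subspace. The function name `null_space_angles` and the use of SciPy's `null_space` are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code): angle between test features
# and the null space of the final linear layer's weight matrix.
import numpy as np
from scipy.linalg import null_space

def null_space_angles(W: np.ndarray, features: np.ndarray) -> np.ndarray:
    """Angle (degrees) between each feature vector and the null space of W.

    W        : (num_classes, feature_dim) last-layer weight matrix.
    features : (num_samples, feature_dim) penultimate-layer activations on test data.
    """
    N = null_space(W)                        # orthonormal basis of null(W), shape (feature_dim, k)
    if N.size == 0:                          # trivial null space: every vector is at 90 degrees
        return np.full(features.shape[0], 90.0)
    proj_norm = np.linalg.norm(features @ N, axis=1)       # norm of projection onto null(W)
    feat_norm = np.linalg.norm(features, axis=1) + 1e-12   # avoid division by zero
    cos_theta = np.clip(proj_norm / feat_norm, 0.0, 1.0)
    return np.degrees(np.arccos(cos_theta))  # small angle: feature lies mostly in null(W)
```

Under this reading, comparing the distribution of these angles across models (for example, an overfitted model against a well-generalizing one) would be the kind of distinct pattern the abstract refers to; the paper's actual statistic may be defined differently.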
Related papers
- Adaptive Pre-training Data Detection for Large Language Models via Surprising Tokens [1.2549198550400134]
Large language models (LLMs) are extensively used, but there are concerns regarding privacy, security, and copyright due to their opaque training data.
Current solutions to this problem leverage techniques explored in machine learning privacy, such as Membership Inference Attacks (MIAs).
We propose an adaptive pre-training data detection method which alleviates this reliance and effectively amplifies identification.
arXiv Detail & Related papers (2024-07-30T23:43:59Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Relearning Forgotten Knowledge: on Forgetting, Overfit and Training-Free Ensembles of DNNs [9.010643838773477]
We introduce a novel score for quantifying overfit, which monitors the forgetting rate of deep models on validation data.
We show that overfit can occur with and without a decrease in validation accuracy, and may be more common than previously appreciated.
We use our observations to construct a new ensemble method, based solely on the training history of a single network, which provides significant improvement without any additional cost in training time.
arXiv Detail & Related papers (2023-10-17T09:22:22Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
- Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, whose goal is to delete information about a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
arXiv Detail & Related papers (2023-01-27T07:53:50Z)
- Reconstructing Training Data from Model Gradient, Provably [68.21082086264555]
We reconstruct the training samples from a single gradient query at a randomly chosen parameter value.
As a provable attack that reveals sensitive training data, our findings suggest potential severe threats to privacy.
arXiv Detail & Related papers (2022-12-07T15:32:22Z)
- Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
arXiv Detail & Related papers (2022-02-08T02:59:12Z)
- Trade-offs between membership privacy & adversarially robust learning [13.37805637358556]
We identify settings where standard models will overfit to a larger extent in comparison to robust models.
The degree of overfitting naturally depends on the amount of data available for training.
arXiv Detail & Related papers (2020-06-08T14:20:12Z)
- Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data [46.63168507757103]
We provide a detailed meta-analysis of hundreds of publicly-available pretrained models.
We find that power-law-based metrics perform quantitatively better at discriminating among series of well-trained models.
arXiv Detail & Related papers (2020-02-17T00:01:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.