Measuring Overfitting in Convolutional Neural Networks using Adversarial
Perturbations and Label Noise
- URL: http://arxiv.org/abs/2209.13382v1
- Date: Tue, 27 Sep 2022 13:40:53 GMT
- Title: Measuring Overfitting in Convolutional Neural Networks using Adversarial
Perturbations and Label Noise
- Authors: Svetlana Pavlitskaya, Joël Oswald and J. Marius Zöllner
- Abstract summary: Overfitted neural networks tend to memorize noise in the training data rather than generalize to unseen data.
We introduce several anti-overfitting measures in architectures based on VGG and ResNet.
We assess the applicability of the proposed metrics by measuring the overfitting degree of several CNN architectures outside of our model pool.
- Score: 3.395452700023097
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although numerous methods to reduce the overfitting of convolutional neural
networks (CNNs) exist, it is still not clear how to confidently measure the
degree of overfitting. A metric reflecting the overfitting level would,
however, be extremely helpful for comparing different architectures and for
evaluating various techniques to tackle overfitting. Motivated by the fact
that overfitted neural networks tend to memorize noise in the training data
rather than generalize to unseen data, we examine how the training
accuracy changes in the presence of increasing data perturbations and study the
connection to overfitting. While previous work focused on label noise only, we
examine a spectrum of techniques to inject noise into the training data,
including adversarial perturbations and input corruptions. Based on this, we
define two new metrics that can confidently distinguish between correct and
overfitted models. For the evaluation, we derive a pool of models for which the
overfitting behavior is known beforehand. To test the effect of various
factors, we introduce several anti-overfitting measures in architectures based
on VGG and ResNet and study their impact, including regularization techniques,
training set size, and the number of parameters. Finally, we assess the
applicability of the proposed metrics by measuring the overfitting degree of
several CNN architectures outside of our model pool.
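As a rough illustration of the idea behind such metrics (not the exact definitions from the paper), the sketch below injects an increasing fraction of symmetric label noise into a training set and summarizes how quickly training accuracy degrades. The helper names (`corrupt_labels`, `noise_sensitivity`), the accuracy values, and the area-based summary are hypothetical; a real experiment would retrain the VGG/ResNet models at each noise level and could likewise inject adversarial perturbations or input corruptions instead of label noise.

```python
import numpy as np


def corrupt_labels(labels, noise_fraction, num_classes, rng):
    """Flip a random fraction of labels to a different, uniformly chosen class."""
    labels = labels.copy()
    n_noisy = int(round(noise_fraction * len(labels)))
    idx = rng.choice(len(labels), size=n_noisy, replace=False)
    # Adding an offset in [1, num_classes - 1] guarantees the new label differs.
    offsets = rng.integers(1, num_classes, size=n_noisy)
    labels[idx] = (labels[idx] + offsets) % num_classes
    return labels


def noise_sensitivity(noise_fractions, train_accuracies):
    """Area between a perfectly memorizing model (training accuracy stays at 1.0
    for every noise level) and the observed training-accuracy curve.
    Small values suggest the model memorizes the injected noise (overfitting);
    large values suggest training accuracy drops as the noise increases."""
    noise_fractions = np.asarray(noise_fractions, dtype=float)
    train_accuracies = np.asarray(train_accuracies, dtype=float)
    return float(np.trapz(1.0 - train_accuracies, noise_fractions))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 10, size=1000)
    noisy = corrupt_labels(labels, noise_fraction=0.3, num_classes=10, rng=rng)
    print("fraction of labels actually flipped:", np.mean(noisy != labels))

    # Hypothetical training accuracies, one value per retrained model.
    levels = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
    overfitted = [1.00, 0.99, 0.99, 0.98, 0.98, 0.97]   # keeps fitting the noise
    well_fitted = [0.98, 0.90, 0.81, 0.72, 0.63, 0.55]  # rejects the noise
    print("sensitivity (overfitted): ", noise_sensitivity(levels, overfitted))
    print("sensitivity (well-fitted):", noise_sensitivity(levels, well_fitted))
```

With these hypothetical numbers, the overfitted model yields a much smaller sensitivity value than the well-fitted one, which is the kind of separation between correct and overfitted models that the paper's metrics aim to provide.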
Related papers
- Uncertainty estimation via ensembles of deep learning models and dropout layers for seismic traces [27.619194576741673]
We develop Convolutional Neural Networks (CNNs) to classify seismic waveforms based on first-motion polarity.
We constructed ensembles of networks to estimate uncertainty.
We observe that the uncertainty estimation ability of the ensembles of networks can be enhanced using dropout layers.
arXiv Detail & Related papers (2024-10-08T15:22:15Z)
- Systematic Evaluation of Synthetic Data Augmentation for Multi-class NetFlow Traffic [2.5182419298876857]
Multi-class classification models can identify specific types of attacks, allowing for more targeted and effective incident responses.
Recent advances suggest that generative models can assist in data augmentation, claiming to offer superior solutions for imbalanced datasets.
Our experiments indicate that resampling methods for balancing training data do not reliably improve classification performance.
arXiv Detail & Related papers (2024-08-28T12:44:07Z)
- On the Condition Monitoring of Bolted Joints through Acoustic Emission and Deep Transfer Learning: Generalization, Ordinal Loss and Super-Convergence [0.12289361708127876]
This paper investigates the use of deep transfer learning based on convolutional neural networks (CNNs) to monitor bolted joints using acoustic emissions.
We evaluate the performance of our methodology using the ORION-AE benchmark, a structure composed of two thin beams connected by three bolts.
arXiv Detail & Related papers (2024-05-29T13:07:21Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the harmful effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many of the predictive signals in the data may stem from biases in data acquisition rather than from the task itself.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- Pre-training via Denoising for Molecular Property Prediction [53.409242538744444]
We describe a pre-training technique that utilizes large datasets of 3D molecular structures at equilibrium.
Inspired by recent advances in noise regularization, our pre-training objective is based on denoising.
arXiv Detail & Related papers (2022-05-31T22:28:34Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
arXiv Detail & Related papers (2022-03-25T19:57:19Z)
- On the benefits of robust models in modulation recognition [53.391095789289736]
Deep Neural Networks (DNNs) using convolutional layers are state-of-the-art in many tasks in communications.
In other domains, like image classification, DNNs have been shown to be vulnerable to adversarial perturbations.
We propose a novel framework to test the robustness of current state-of-the-art models.
arXiv Detail & Related papers (2021-03-27T19:58:06Z)
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize the classifier's exposure to the attribute space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- Solving Inverse Problems With Deep Neural Networks -- Robustness Included? [3.867363075280544]
Recent works have pointed out instabilities of deep neural networks for several image reconstruction tasks.
In analogy to adversarial attacks in classification, it was shown that slight distortions in the input domain may cause severe artifacts.
This article sheds new light on this concern, by conducting an extensive study of the robustness of deep-learning-based algorithms for solving underdetermined inverse problems.
arXiv Detail & Related papers (2020-11-09T09:33:07Z)
- Ramifications of Approximate Posterior Inference for Bayesian Deep Learning in Adversarial and Out-of-Distribution Settings [7.476901945542385]
We show that Bayesian deep learning models, on certain occasions, marginally outperform conventional neural networks.
Preliminary investigations indicate the potential inherent role of bias due to choices of initialisation, architecture or activation functions.
arXiv Detail & Related papers (2020-09-03T16:58:15Z)
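The entry above reports results for approximate posterior inference rather than explaining it; as background, here is a minimal Monte Carlo dropout sketch, one common way to approximate a Bayesian predictive distribution by keeping dropout active at test time and averaging several stochastic forward passes. The tiny architecture and sample counts are arbitrary placeholders, and this is not necessarily the inference scheme used in that paper.

```python
import torch
import torch.nn as nn

# Purely illustrative classifier with a dropout layer; not a model from the paper.
model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(64, 3),
)


def mc_dropout_predict(model, x, num_samples=50):
    """Average softmax outputs over several stochastic forward passes with
    dropout kept active (Monte Carlo dropout)."""
    model.train()  # train() keeps the Dropout layer stochastic at inference time
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(num_samples)]
        )
    return probs.mean(dim=0), probs.std(dim=0)  # predictive mean and spread


x = torch.randn(4, 20)                     # a batch of dummy inputs
mean, std = mc_dropout_predict(model, x)
print(mean.argmax(dim=-1))                 # predicted classes
print(std.max(dim=-1).values)              # per-sample uncertainty proxy
```

The spread of the sampled predictions (`std`) is the kind of uncertainty estimate that such studies typically examine under adversarial or out-of-distribution inputs.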
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.