FrequencyLowCut Pooling -- Plug & Play against Catastrophic Overfitting
- URL: http://arxiv.org/abs/2204.00491v1
- Date: Fri, 1 Apr 2022 14:51:28 GMT
- Title: FrequencyLowCut Pooling -- Plug & Play against Catastrophic Overfitting
- Authors: Julia Grabinski, Steffen Jung, Janis Keuper and Margret Keuper
- Abstract summary: This paper introduces an aliasing-free down-sampling operation that can easily be plugged into any CNN architecture.
Our experiments show that, in combination with simple and fast FGSM adversarial training, our hyper-parameter-free operator significantly improves model robustness.
- Score: 12.062691258844628
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the last years, Convolutional Neural Networks (CNNs) have been the
dominating neural architecture in a wide range of computer vision tasks. From
an image and signal processing point of view, this success might be a bit
surprising, as the inherent spatial pyramid design of most CNNs apparently
violates basic signal processing laws, i.e., the Sampling Theorem, in their
down-sampling operations. However, since poor sampling appeared not to affect
model accuracy, this issue has been broadly neglected until model robustness
started to receive more attention. Recent work [17] in the context of
adversarial attacks and distribution shifts showed that there is a
strong correlation between the vulnerability of CNNs and aliasing artifacts
induced by poor down-sampling operations. This paper builds on these findings
and introduces an aliasing-free down-sampling operation that can easily be
plugged into any CNN architecture: FrequencyLowCut pooling. Our experiments
show that, in combination with simple and fast FGSM adversarial training, our
hyper-parameter-free operator significantly improves model robustness and
avoids catastrophic overfitting.
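For a concrete picture of the operator, the sketch below implements an aliasing-free 2x downsampling in the frequency domain (discard everything above the new Nyquist limit, then return to the spatial domain at half resolution) and pairs it with one step of plain FGSM adversarial training as mentioned in the abstract. This is a minimal PyTorch sketch in the spirit of FrequencyLowCut pooling, not the authors' reference implementation; function names and the epsilon value are illustrative.

```python
import torch
import torch.fft
import torch.nn.functional as F


def flc_pool(x: torch.Tensor) -> torch.Tensor:
    """Aliasing-free 2x downsampling: keep only the low-frequency part of the
    spectrum before reducing the spatial resolution (sketch, not the paper's code)."""
    _, _, h, w = x.shape
    freq = torch.fft.fftshift(torch.fft.fft2(x, norm="ortho"), dim=(-2, -1))
    # Crop the central (low-frequency) quarter of the spectrum -> half resolution.
    low = freq[..., h // 4: h // 4 + h // 2, w // 4: w // 4 + w // 2]
    return torch.fft.ifft2(torch.fft.ifftshift(low, dim=(-2, -1)), norm="ortho").real


def fgsm_train_step(model, x, y, optimizer, eps=8 / 255):
    """One step of simple FGSM adversarial training (illustrative hyper-parameters)."""
    x_adv = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    x_adv = (x + eps * grad.sign()).clamp(0, 1).detach()
    optimizer.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    optimizer.step()


# flc_pool can stand in for a strided convolution or max-pooling inside a CNN:
x = torch.randn(8, 64, 32, 32)
print(flc_pool(x).shape)  # torch.Size([8, 64, 16, 16])
```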
Related papers
- Transferability of Convolutional Neural Networks in Stationary Learning
Tasks [96.00428692404354]
We introduce a novel framework for efficient training of convolutional neural networks (CNNs) for large-scale spatial problems.
We show that a CNN trained on small windows of such signals achieves nearly the same performance on much larger windows without retraining.
Our results show that the CNN is able to tackle problems with many hundreds of agents after being trained with fewer than ten.
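The property this entry relies on can be illustrated with a tiny fully convolutional network: because there is no fixed-size flatten-plus-linear head, the same weights apply to inputs of any spatial size. The architecture below is purely illustrative and not taken from the paper.

```python
import torch
import torch.nn as nn

# Fully convolutional: every layer is size-agnostic, so weights trained on small
# windows can be evaluated on arbitrarily large ones without retraining.
net = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=1),   # per-location prediction head
)

small = torch.randn(4, 1, 32, 32)      # training-sized windows
large = torch.randn(1, 1, 512, 512)    # much larger signal at evaluation time
print(net(small).shape)                # torch.Size([4, 1, 32, 32])
print(net(large).shape)                # torch.Size([1, 1, 512, 512])
```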
arXiv Detail & Related papers (2023-07-21T13:51:45Z)
- Fix your downsampling ASAP! Be natively more robust via Aliasing and
Spectral Artifact free Pooling [11.72025865314187]
Convolutional neural networks encode images through a sequence of convolutions, normalizations and non-linearities as well as downsampling operations.
Previous work showed that even slight mistakes during sampling, which lead to aliasing, contribute directly to the networks' lack of robustness.
We propose aliasing and spectral artifact-free pooling, ASAP for short.
arXiv Detail & Related papers (2023-07-19T07:47:23Z)
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
- On the effectiveness of partial variance reduction in federated learning
with heterogeneous data [27.527995694042506]
We show that the diversity of the final classification layers across clients impedes the performance of the FedAvg algorithm.
Motivated by this, we propose to correct the model by applying variance reduction only to the final layers.
We demonstrate that this significantly outperforms existing benchmarks at a similar or lower communication cost.
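A hedged sketch of the idea in this entry: apply a control-variate (SCAFFOLD-style) correction only to the final classification layer during local client updates, while all other layers follow plain FedAvg. All names (client_update, c_global, c_local, head_prefix) are illustrative, not taken from the paper.

```python
import copy
import torch
import torch.nn.functional as F

def client_update(global_model, loader, c_global, c_local, lr=0.01, head_prefix="fc"):
    """One client's local training pass with partial variance reduction (sketch)."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for x, y in loader:
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        # Variance reduction only on the final layer's gradients.
        for name, p in model.named_parameters():
            if name.startswith(head_prefix) and p.grad is not None:
                p.grad.add_(c_global[name] - c_local[name])
        opt.step()
    return model.state_dict()  # the server averages client states as in FedAvg
```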
arXiv Detail & Related papers (2022-12-05T11:56:35Z)
- Towards Practical Control of Singular Values of Convolutional Layers [65.25070864775793]
Convolutional neural networks (CNNs) are easy to train, but their essential properties, such as generalization error and adversarial robustness, are hard to control.
Recent research demonstrated that singular values of convolutional layers significantly affect such elusive properties.
We offer a principled approach to alleviating constraints of the prior art at the expense of an insignificant reduction in layer expressivity.
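As background for this entry: under circular padding, the singular values of a convolutional layer can be computed exactly from the FFT of its kernel, following the characterization of Sedghi et al. (2019). The sketch below shows that computation, which control methods of this kind typically build on; it is not the cited paper's proposed method.

```python
import numpy as np

def conv_singular_values(kernel: np.ndarray, n: int) -> np.ndarray:
    """Exact singular values of a 2D circular convolution layer.
    kernel: (c_out, c_in, k, k); n: spatial input resolution."""
    # FFT of each kernel slice, zero-padded to the input resolution.
    transfer = np.fft.fft2(kernel, s=(n, n), axes=(-2, -1))   # (c_out, c_in, n, n)
    # At each frequency the layer acts as a (c_out x c_in) matrix; the union of
    # the singular values of these matrices is the layer's spectrum.
    per_freq = transfer.transpose(2, 3, 0, 1)                 # (n, n, c_out, c_in)
    return np.linalg.svd(per_freq, compute_uv=False).ravel()

w = np.random.randn(8, 3, 3, 3) / 9.0        # a random 3x3 conv, 3 -> 8 channels
print(conv_singular_values(w, 32).max())     # spectral norm of the layer
```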
arXiv Detail & Related papers (2022-11-24T19:09:44Z)
- Benign Overfitting in Two-layer Convolutional Neural Networks [90.75603889605043]
We study the benign overfitting phenomenon in training a two-layer convolutional neural network (CNN).
We show that when the signal-to-noise ratio satisfies a certain condition, a two-layer CNN trained by gradient descent can achieve arbitrarily small training and test loss.
On the other hand, when this condition does not hold, overfitting becomes harmful and the obtained CNN can only achieve constant level test loss.
arXiv Detail & Related papers (2022-02-14T07:45:51Z)
- Neural Architecture Dilation for Adversarial Robustness [56.18555072877193]
A shortcoming of convolutional neural networks is that they are vulnerable to adversarial attacks.
This paper aims to improve the adversarial robustness of the backbone CNNs that have a satisfactory accuracy.
With minimal computational overhead, the dilated architecture is expected to preserve the standard (clean) performance of the backbone CNN.
arXiv Detail & Related papers (2021-08-16T03:58:00Z)
- Mitigating Performance Saturation in Neural Marked Point Processes:
Architectures and Loss Functions [50.674773358075015]
We propose a simple graph-based network structure called GCHP, which utilizes only graph convolutional layers.
We show that GCHP can significantly reduce training time, and that a likelihood-ratio loss with inter-arrival time probability assumptions can greatly improve model performance.
arXiv Detail & Related papers (2021-07-07T16:59:14Z)
- How Convolutional Neural Networks Deal with Aliasing [0.0]
In two experiments, we study how CNNs deal with aliasing.
In the first, we assess the CNN's capability of distinguishing oscillations at the input, showing that the redundancies in the intermediate channels play an important role in succeeding at the task.
In the second, we show that an image classifier CNN, while in principle capable of implementing anti-aliasing filters, does not prevent aliasing from taking place in its intermediate layers.
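The aliasing in question is easy to reproduce in isolation. The toy sketch below, with illustrative values not taken from the paper, shows that naive stride-2 subsampling folds a near-Nyquist oscillation onto a spurious low frequency, whereas low-pass filtering before subsampling strongly attenuates it.

```python
import numpy as np

n = 64
t = np.arange(n)
signal = np.cos(2 * np.pi * 0.45 * t)          # 0.45 cycles/sample, near Nyquist

naive = signal[::2]                             # subsample without any filtering
lowpass = np.convolve(signal, [0.25, 0.5, 0.25], mode="same")  # crude anti-alias blur
filtered = lowpass[::2]

def peak(x):
    """Magnitude of the strongest non-DC frequency component."""
    return np.abs(np.fft.rfft(x))[1:].max() / len(x)

print("naive:   ", peak(naive))     # large spurious component near 0.1 cycles/sample
print("filtered:", peak(filtered))  # strongly attenuated by the pre-filter
```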
arXiv Detail & Related papers (2021-02-15T18:52:47Z)
- Generating Black-Box Adversarial Examples in Sparse Domain [2.879036956042183]
A black-box adversarial attack is one in which the attacker has no knowledge of the model or the training dataset.
We propose a novel approach to generating black-box attacks in the sparse domain, where the most important information of an image can be observed.
arXiv Detail & Related papers (2021-01-22T20:45:33Z)
- When to Use Convolutional Neural Networks for Inverse Problems [40.60063929073102]
We show how a convolutional neural network can be viewed as an approximate solution to a convolutional sparse coding problem.
We argue that for some types of inverse problems the CNN approximation breaks down, leading to poor performance.
Specifically we identify JPEG artifact reduction and non-rigid trajectory reconstruction as challenging inverse problems for CNNs.
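The viewpoint this entry refers to can be made concrete: one ISTA iteration for convolutional sparse coding (minimize 0.5*||x - D z||^2 + lam*||z||_1 over the codes z) is a convolution followed by a soft-thresholding non-linearity, i.e., it has the shape of a CNN layer. The sketch below is a minimal illustration under that standard correspondence; the dictionary, step size, and lam are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def ista_step(z, x, d, lam=0.1, step=0.5):
    """One ISTA step for convolutional sparse coding (sketch)."""
    # Gradient of the data term: correlate the residual with the dictionary d.
    residual = x - F.conv_transpose2d(z, d, padding=1)   # D z reconstructs x
    grad = -F.conv2d(residual, d, padding=1)             # D^T (D z - x)
    z_new = z - step * grad
    # Proximal step: soft-thresholding (a shifted, two-sided ReLU).
    return torch.sign(z_new) * torch.clamp(z_new.abs() - step * lam, min=0)

x = torch.randn(1, 1, 32, 32)          # observed signal
d = torch.randn(8, 1, 3, 3) * 0.1      # convolutional dictionary with 8 atoms
z = torch.zeros(1, 8, 32, 32)          # sparse codes
for _ in range(10):
    z = ista_step(z, x, d)
```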
arXiv Detail & Related papers (2020-03-30T21:08:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.