On Robustness and Transferability of Convolutional Neural Networks
- URL: http://arxiv.org/abs/2007.08558v2
- Date: Tue, 23 Mar 2021 16:31:47 GMT
- Title: On Robustness and Transferability of Convolutional Neural Networks
- Authors: Josip Djolonga, Jessica Yung, Michael Tschannen, Rob Romijnders, Lucas
Beyer, Alexander Kolesnikov, Joan Puigcerver, Matthias Minderer, Alexander
D'Amour, Dan Moldovan, Sylvain Gelly, Neil Houlsby, Xiaohua Zhai, Mario Lucic
- Abstract summary: Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set and model sizes significantly improves the distributional shift robustness.
- Score: 147.71743081671508
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern deep convolutional networks (CNNs) are often criticized for not
generalizing under distributional shifts. However, several recent breakthroughs
in transfer learning suggest that these networks can cope with severe
distribution shifts and successfully adapt to new tasks from a few training
examples. In this work we study the interplay between out-of-distribution and
transfer performance of modern image classification CNNs for the first time and
investigate the impact of the pre-training data size, the model scale, and the
data preprocessing pipeline. We find that increasing both the training set and
model sizes significantly improves the distributional shift robustness.
Furthermore, we show that, perhaps surprisingly, simple changes in the
preprocessing such as modifying the image resolution can significantly mitigate
robustness issues in some cases. Finally, we outline the shortcomings of
existing robustness evaluation datasets and introduce a synthetic dataset,
SI-Score, which we use for a systematic analysis across factors of variation
common in visual data, such as object size and position.
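To make the preprocessing point concrete, below is a minimal sketch of the kind of resolution probe the abstract describes: a fixed, pretrained classifier is evaluated at several input resolutions on a distribution-shifted test set. The choice of ResNet-50, the particular resolutions, and the ImageFolder-style dataset layout are illustrative assumptions, not the authors' exact experimental setup.

```python
import torch
import torchvision.transforms as T
from torch.utils.data import DataLoader
from torchvision import datasets, models

# Hypothetical setup: an ImageNet-pretrained ResNet-50 evaluated on a
# distribution-shifted test set stored in ImageFolder layout.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2).eval()

def accuracy_at_resolution(data_dir: str, resolution: int, batch_size: int = 64) -> float:
    """Top-1 accuracy when test images are resized and center-cropped to `resolution`."""
    transform = T.Compose([
        T.Resize(int(resolution * 256 / 224)),  # resize shorter side, keep the usual crop ratio
        T.CenterCrop(resolution),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    loader = DataLoader(datasets.ImageFolder(data_dir, transform), batch_size=batch_size)
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total

# Probe how accuracy under distribution shift varies with evaluation resolution.
for res in (128, 224, 320, 448):
    print(res, accuracy_at_resolution("path/to/shifted_test_set", res))
```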
Related papers
- Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network).
Once this network has been trained on a small base model using demonstrations, it can be seamlessly integrated with other pre-trained models during inference.
We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
arXiv Detail & Related papers (2024-10-28T13:48:43Z)
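As a rough illustration of the logits-level composition described in the entry above, the sketch below adds a small value network's output to a frozen pretrained model's logits at inference time. The additive rule, the module shapes, and the vocabulary size are assumptions made for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

VOCAB = 32000  # assumed vocabulary size shared by both networks

class TinyValueNet(nn.Module):
    """Hypothetical stand-in for the value network: token ids -> logit corrections."""
    def __init__(self, vocab: int = VOCAB, dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.head(self.embed(token_ids))

@torch.no_grad()
def post_trained_logits(frozen_model: nn.Module, value_net: nn.Module,
                        token_ids: torch.Tensor) -> torch.Tensor:
    # The pretrained model stays frozen; only its output logits are shifted by
    # the correction produced by the value network learned during post-training.
    return frozen_model(token_ids) + value_net(token_ids)
```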
- An Enhanced Encoder-Decoder Network Architecture for Reducing Information Loss in Image Semantic Segmentation [6.596361762662328]
We introduce an innovative encoder-decoder network structure enhanced with residual connections.
Our approach employs a multi-residual connection strategy designed to preserve the intricate details across various image scales more effectively.
To enhance the convergence rate of network training and mitigate sample imbalance issues, we have devised a modified cross-entropy loss function.
arXiv Detail & Related papers (2024-05-26T05:15:53Z)
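Below is a minimal sketch of an encoder-decoder with residual (skip) connections across scales, in the spirit of the multi-residual strategy summarized in the entry above; the depth, channel counts, and number of classes are illustrative and not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in: int, c_out: int) -> nn.Sequential:
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class ResidualEncoderDecoder(nn.Module):
    def __init__(self, in_ch: int = 3, num_classes: int = 21):
        super().__init__()
        self.enc1 = conv_block(in_ch, 32)
        self.enc2 = conv_block(32, 64)
        self.dec2 = conv_block(64, 32)
        self.dec1 = conv_block(32, 32)
        self.head = nn.Conv2d(32, num_classes, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        e1 = self.enc1(x)                          # full resolution
        e2 = self.enc2(F.max_pool2d(e1, 2))        # 1/2 resolution
        d2 = self.dec2(F.interpolate(e2, scale_factor=2, mode="bilinear",
                                     align_corners=False))
        d1 = self.dec1(d2 + e1)                    # residual connection keeps fine detail
        return self.head(d1)
```

A class-weighted loss such as `nn.CrossEntropyLoss(weight=...)` is one common way to address the sample-imbalance issue the summary mentions; the paper's modified cross-entropy may differ in its details.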
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- Entropy-based Guidance of Deep Neural Networks for Accelerated Convergence and Improved Performance [0.8749675983608172]
We derive new mathematical results to measure the changes in entropy as fully-connected and convolutional neural networks process data.
By measuring how the entropy changes as networks process data, patterns critical to a well-performing network can be visualized and identified.
Experiments in image compression, image classification, and image segmentation on benchmark datasets demonstrate that these losses guide neural networks to learn rich latent data representations in fewer dimensions.
arXiv Detail & Related papers (2023-08-28T23:33:07Z)
- Comprehensive Analysis of Network Robustness Evaluation Based on Convolutional Neural Networks with Spatial Pyramid Pooling [4.366824280429597]
Connectivity robustness, a crucial aspect for understanding, optimizing, and repairing complex networks, has traditionally been evaluated through simulations.
We address these challenges by designing a convolutional neural network (CNN) model with spatial pyramid pooling networks (SPP-net).
We show that the proposed CNN model consistently achieves accurate evaluations of both attack curves and robustness values across all removal scenarios.
arXiv Detail & Related papers (2023-08-10T09:54:22Z)
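To illustrate the fixed-length representation that spatial pyramid pooling provides, the sketch below pools a convolutional feature map at several grid sizes and concatenates the results, so network images of different sizes can feed a single regressor that predicts a robustness value. The pyramid levels and the tiny CNN around the SPP layer are illustrative choices, not the paper's model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialPyramidPooling(nn.Module):
    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        self.levels = levels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, H, W) with arbitrary H and W.
        pooled = [F.adaptive_max_pool2d(x, level).flatten(start_dim=1)
                  for level in self.levels]
        return torch.cat(pooled, dim=1)  # length: channels * sum(level ** 2)

# Example: map an adjacency-matrix-style image of a network to a scalar
# robustness value, independently of the input size.
model = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    SpatialPyramidPooling((1, 2, 4)),
    nn.Linear(32 * (1 + 4 + 16), 1),
)
print(model(torch.rand(2, 1, 100, 100)).shape)  # torch.Size([2, 1])
```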
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
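A brief sketch of the train-small / evaluate-large idea from the entry above: a fully convolutional network has no size-dependent layers, so a model fitted on small windows can be applied directly to much larger signals. The layer sizes below are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Fully convolutional: no flattening or Linear layer, so the input size is free.
fcn = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 1),  # per-location prediction
)

small_windows = torch.rand(16, 1, 32, 32)    # training-time crops
large_signal = torch.rand(1, 1, 1024, 1024)  # evaluation-time input

print(fcn(small_windows).shape)  # torch.Size([16, 1, 32, 32])
print(fcn(large_signal).shape)   # torch.Size([1, 1, 1024, 1024])
```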
- The learning phases in NN: From Fitting the Majority to Fitting a Few [2.5991265608180396]
We analyze a layer's ability to reconstruct the input, as well as its prediction performance, based on the evolution of its parameters during training.
We also assess the behavior using common datasets and architectures from computer vision such as ResNet and VGG.
arXiv Detail & Related papers (2022-02-16T19:11:42Z)
- How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets.
In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset.
We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z)
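As a simplified instance of the sparse-transfer setting in the entry above, the sketch below applies one-shot global magnitude pruning to an ImageNet-pretrained ResNet-50 and swaps in a new head for a hypothetical 10-class downstream task; the paper itself compares more sophisticated sparsification methods and fine-tuning regimes.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

# Prune the smallest-magnitude 90% of all conv/linear weights globally.
to_prune = [(m, "weight") for m in model.modules()
            if isinstance(m, (nn.Conv2d, nn.Linear))]
prune.global_unstructured(to_prune, pruning_method=prune.L1Unstructured, amount=0.9)

# Replace the classification head for a hypothetical 10-class downstream task,
# then fine-tune on the downstream data (training loop omitted).
model.fc = nn.Linear(model.fc.in_features, 10)
```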
- Analyzing Overfitting under Class Imbalance in Neural Networks for Image Segmentation [19.259574003403998]
In image segmentation, neural networks may overfit to the foreground samples from small structures.
In this study, we provide new insights on the problem of overfitting under class imbalance by inspecting the network behavior.
arXiv Detail & Related papers (2021-02-20T14:57:58Z)
- Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification [53.735029033681435]
Transfer learning is a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains.
In this work, we demonstrate that adversarially-trained models transfer better than non-adversarially-trained models.
arXiv Detail & Related papers (2020-07-11T22:48:42Z)
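The sketch below shows one simple form of the adversarial training behind this result, using an FGSM perturbation on each batch; the adversarially trained backbone would then be fine-tuned on the target domain as usual. The attack, the epsilon, and the assumption of unnormalized [0, 1] images are illustrative choices, not necessarily those of the paper.

```python
import torch
import torch.nn.functional as F

def fgsm_training_step(model, images, labels, optimizer, epsilon=4 / 255):
    # Build adversarial examples with the fast gradient sign method (FGSM),
    # assuming images are in the [0, 1] range.
    images = images.clone().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    grad = torch.autograd.grad(loss, images)[0]
    adv_images = (images + epsilon * grad.sign()).clamp(0, 1).detach()

    # Standard supervised update, but on the perturbed inputs.
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(adv_images), labels)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```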