How Well Do Sparse Imagenet Models Transfer?
- URL: http://arxiv.org/abs/2111.13445v1
- Date: Fri, 26 Nov 2021 11:58:51 GMT
- Title: How Well Do Sparse Imagenet Models Transfer?
- Authors: Eugenia Iofinova and Alexandra Peste and Mark Kurtz and Dan Alistarh
- Abstract summary: Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets.
In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset.
We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
- Score: 75.98123173154605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning is a classic paradigm by which models pretrained on large
"upstream" datasets are adapted to yield good results on "downstream,"
specialized datasets. Generally, it is understood that more accurate models on
the "upstream" dataset will provide better transfer accuracy "downstream". In
this work, we perform an in-depth investigation of this phenomenon in the
context of convolutional neural networks (CNNs) trained on the ImageNet
dataset, which have been pruned - that is, compressed by sparsifying their
connections. Specifically, we consider transfer using unstructured pruned
models obtained by applying several state-of-the-art pruning methods, including
magnitude-based, second-order, re-growth and regularization approaches, in the
context of twelve standard transfer tasks. In a nutshell, our study shows that
sparse models can match or even outperform the transfer performance of dense
models, even at high sparsities, and, while doing so, can lead to significant
inference and even training speedups. At the same time, we observe and analyze
significant differences in the behaviour of different pruning methods.
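
For a concrete picture of this recipe, the snippet below is a minimal, hypothetical PyTorch sketch: apply unstructured magnitude (L1) pruning to an ImageNet-pretrained backbone and then fine-tune it on a downstream task. It is not the authors' pipeline (which compares several pruning methods applied during upstream training); the model choice, 90% sparsity level, downstream class count, and hyperparameters are illustrative assumptions.

```python
# Illustrative sketch only (assumed names and hyperparameters), not the paper's code.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
import torchvision

# "Upstream" model: an ImageNet-pretrained ResNet-50.
model = torchvision.models.resnet50(weights="IMAGENET1K_V1")

# Unstructured magnitude (L1) pruning of all convolutional weights to 90% sparsity.
# The paper also studies second-order, re-growth, and regularization-based pruning;
# magnitude pruning is simply the easiest variant to show here.
conv_params = [(m, "weight") for m in model.modules() if isinstance(m, nn.Conv2d)]
prune.global_unstructured(conv_params,
                          pruning_method=prune.L1Unstructured,
                          amount=0.9)

# "Downstream" transfer: swap the classification head and fine-tune.
num_downstream_classes = 102  # placeholder, e.g. a small specialized dataset
model.fc = nn.Linear(model.fc.in_features, num_downstream_classes)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def finetune_epoch(loader):
    """One fine-tuning epoch; `loader` yields (images, labels) batches."""
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()   # pruning masks keep the removed connections at zero
        optimizer.step()

# After fine-tuning, fold the masks into the weight tensors so the sparsity is explicit:
# for module, name in conv_params:
#     prune.remove(module, name)
```

Note that the inference and training speedups mentioned above additionally require a runtime, library, or hardware path that can exploit unstructured sparsity; a dense execution path does not get faster just because most weights are zero.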
Related papers
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Semantic-Fused Multi-Granularity Cross-City Traffic Prediction [17.020546413647708]
We propose a Semantic-Fused Multi-Granularity Transfer Learning model to achieve knowledge transfer across cities with fused semantics at different granularities.
In detail, we design a semantic fusion module to fuse various semantics while conserving static spatial dependencies.
We conduct extensive experiments on six real-world datasets to verify the effectiveness of our STL model.
arXiv Detail & Related papers (2023-02-23T04:26:34Z)
- Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning: simultaneously training a single model on multiple "upstream" and "downstream" tasks (a rough illustrative sketch appears after this list).
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance.
arXiv Detail & Related papers (2022-07-08T10:25:47Z)
- On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set size and the model size significantly improves robustness to distributional shift.
arXiv Detail & Related papers (2020-07-16T18:39:04Z)
- Do Adversarially Robust ImageNet Models Transfer Better? [102.09335596483695]
Adversarially robust models often perform better than their standard-trained counterparts when used for transfer learning.
Our results are consistent with (and in fact, add to) recent hypotheses stating that robustness leads to improved feature representations.
arXiv Detail & Related papers (2020-07-16T17:42:40Z)
- Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification [53.735029033681435]
Transfer learning is a powerful methodology for adapting pre-trained deep neural networks on image recognition tasks to new domains.
In this work, we demonstrate that adversarially-trained models transfer better than non-adversarially-trained models.
arXiv Detail & Related papers (2020-07-11T22:48:42Z)
- Conditional Mutual information-based Contrastive Loss for Financial Time Series Forecasting [12.0855096102517]
We present a representation learning framework for financial time series forecasting.
In this paper, we propose to first learn compact representations from time series data, then use the learned representations to train a simpler model for predicting time series movements.
arXiv Detail & Related papers (2020-02-18T15:24:33Z)
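
As a side note on the co-finetuning entry above ("Beyond Transfer Learning"), the sketch below illustrates the general idea under assumed names and sizes: one shared backbone, one classification head per task, and losses from an "upstream" and a "downstream" batch combined in every optimization step. It is not the cited paper's implementation.

```python
# Rough sketch of co-finetuning (shared backbone, one head per task).
# Not the cited paper's code; loaders, class counts, and hyperparameters are placeholders.
import torch
import torch.nn as nn
import torchvision

backbone = torchvision.models.resnet50(weights="IMAGENET1K_V1")
feat_dim = backbone.fc.in_features
backbone.fc = nn.Identity()  # keep only the shared feature extractor

heads = nn.ModuleDict({
    "upstream": nn.Linear(feat_dim, 1000),    # e.g. the large pretraining dataset
    "downstream": nn.Linear(feat_dim, 102),   # e.g. the small target dataset
})

optimizer = torch.optim.SGD(
    list(backbone.parameters()) + list(heads.parameters()), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def co_finetune_step(upstream_batch, downstream_batch):
    """One step on a mixed batch: both losses update the shared backbone."""
    optimizer.zero_grad()
    total_loss = 0.0
    for task, (images, labels) in (("upstream", upstream_batch),
                                   ("downstream", downstream_batch)):
        total_loss = total_loss + criterion(heads[task](backbone(images)), labels)
    total_loss.backward()
    optimizer.step()

# Training loop over both loaders in lockstep (placeholder DataLoaders):
# for up_batch, down_batch in zip(upstream_loader, downstream_loader):
#     co_finetune_step(up_batch, down_batch)
```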