Playing Lottery Tickets in Style Transfer Models
- URL: http://arxiv.org/abs/2203.13802v1
- Date: Fri, 25 Mar 2022 17:43:18 GMT
- Title: Playing Lottery Tickets in Style Transfer Models
- Authors: Meihao Kong, Jing Huo, Wenbin Li, Jing Wu, Yu-Kun Lai, Yang Gao
- Abstract summary: Style transfer has achieved great success and attracted wide attention from both academic and industrial communities.
However, the dependence on a large VGG-based autoencoder gives existing style transfer models high parameter complexity.
In this work, we perform the first empirical study to verify whether sparse trainable matching subnetworks (lottery tickets) also exist in style transfer models.
- Score: 57.55795986289975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Style transfer has achieved great success and attracted a wide range of
attention from both academic and industrial communities due to its flexible
application scenarios. However, the dependence on a large VGG-based
autoencoder gives existing style transfer models high parameter
complexity, which limits their application on resource-constrained devices.
Unfortunately, the compression of style transfer models has been little explored.
In parallel, study on the lottery ticket hypothesis (LTH) has shown great
potential in finding extremely sparse matching subnetworks which can achieve on
par or even better performance than original full networks when trained in
isolation. In this work, we perform the first empirical study to verify whether
such trainable networks also exist in style transfer models. From a wide range
of style transfer methods, we choose two of the most popular style transfer
models as the main testbeds, i.e., AdaIN and SANet, representing approaches of
global and local transformation based style transfer respectively. Through
extensive experiments and comprehensive analysis, we draw the following main
conclusions. (1) Compared with fixing the VGG encoder, style transfer models can
benefit more from training the whole network together. (2) Using iterative
magnitude pruning, we find the sparsest matching subnetworks at 89.2% sparsity in
AdaIN and 73.7% in SANet, which suggests that style transfer models can play
lottery tickets too. (3) The feature transformation module should also be pruned to
get a sparser model without affecting the existence and quality of matching
subnetworks. (4) Besides AdaIN and SANet, other models such as LST, MANet,
AdaAttN and MCCNet can also play lottery tickets, which shows that LTH can be
generalized to various style transfer models.
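For reference, AdaIN's global feature transformation, the channel-wise statistic matching used in the first testbed, follows the published formula AdaIN(x, y) = σ(y)·(x − μ(x))/σ(x) + μ(y). The following is a minimal NumPy sketch of that formula, not the authors' released code; the `eps` stabilizer is an implementation assumption:

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization: align the channel-wise mean and
    standard deviation of the content features to those of the style
    features. content, style: arrays of shape (channels, height, width)."""
    c_mean = content.mean(axis=(1, 2), keepdims=True)
    c_std = content.std(axis=(1, 2), keepdims=True) + eps  # avoid divide-by-zero
    s_mean = style.mean(axis=(1, 2), keepdims=True)
    s_std = style.std(axis=(1, 2), keepdims=True) + eps
    # Normalize content statistics, then re-scale/shift to style statistics.
    return s_std * (content - c_mean) / c_std + s_mean
```

After this transformation the output's per-channel mean and standard deviation match those of the style features, which is why AdaIN is described as a global (whole-feature-map) transformation, in contrast to SANet's local attention-based one.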
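The iterative magnitude pruning (IMP) procedure used to search for matching subnetworks can be sketched as follows. This is a minimal NumPy illustration of the standard LTH recipe (train, prune the smallest-magnitude surviving weights, rewind survivors to their initial values, repeat), not the authors' implementation; `train_fn` is a hypothetical placeholder for a real training loop:

```python
import numpy as np

def imp_find_mask(init_weights, train_fn, prune_frac=0.2, rounds=5):
    """Iterative magnitude pruning with weight rewinding.

    Each round: train from the (masked) initialization, prune the
    prune_frac fraction of smallest-magnitude surviving weights, then
    rewind the survivors to their initial values for the next round.
    Returns the final boolean mask; the "winning ticket" is
    init_weights * mask, retrained in isolation.
    """
    mask = np.ones_like(init_weights, dtype=bool)
    for _ in range(rounds):
        trained = train_fn(init_weights * mask) * mask   # train the masked net
        alive = np.abs(trained[mask])                    # surviving magnitudes
        k = int(prune_frac * alive.size)                 # how many to prune
        if k == 0:
            break
        threshold = np.partition(alive, k - 1)[k - 1]    # k-th smallest survivor
        mask &= np.abs(trained) > threshold              # drop weights at/below it
    return mask
```

Because pruning removes a fixed fraction of the *remaining* weights each round, sparsity compounds geometrically, which is how the paper reaches extreme sparsities such as 89.2% in AdaIN over successive rounds.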
Related papers
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of
General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- FISTNet: FusIon of STyle-path generative Networks for Facial Style Transfer [15.308837341075135]
StyleGAN methods have the tendency of overfitting that results in the introduction of artifacts in the facial images.
We propose a FusIon of STyles (FIST) network for facial images that leverages pre-trained multipath style transfer networks.
arXiv Detail & Related papers (2023-07-18T07:20:31Z)
- Master: Meta Style Transformer for Controllable Zero-Shot and Few-Shot Artistic Style Transfer [83.1333306079676]
In this paper, we devise a novel Transformer model, termed Master, specifically for style transfer.
In the proposed model, different Transformer layers share a common group of parameters, which (1) reduces the total number of parameters, (2) leads to more robust training convergence, and (3) makes it easy to control the degree of stylization.
Experiments demonstrate the superiority of Master under both zero-shot and few-shot style transfer settings.
arXiv Detail & Related papers (2023-04-24T04:46:39Z)
- A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning [84.8813842101747]
Unified Contrastive Arbitrary Style Transfer (UCAST) is a novel style representation learning and transfer framework.
We present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature.
Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
arXiv Detail & Related papers (2023-03-09T04:35:00Z)
- On Optimizing the Communication of Model Parallelism [79.33873698640662]
We study a novel and important communication pattern in large-scale model-parallel deep learning (DL): cross-mesh resharding.
In cross-mesh resharding, a sharded tensor needs to be sent from a source device mesh to a destination device mesh.
We propose two contributions to address cross-mesh resharding: an efficient broadcast-based communication system, and an "overlapping-friendly" pipeline schedule.
arXiv Detail & Related papers (2022-11-10T03:56:48Z)
- How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets.
In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset.
We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all listed papers) and is not responsible for any consequences of its use.