Controlling Neural Style Transfer with Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2310.00405v1
- Date: Sat, 30 Sep 2023 15:01:02 GMT
- Title: Controlling Neural Style Transfer with Deep Reinforcement Learning
- Authors: Chengming Feng, Jing Hu, Xin Wang, Shu Hu, Bin Zhu, Xi Wu, Hongtu Zhu, and Siwei Lyu
- Abstract summary: We propose the first deep Reinforcement Learning based architecture that splits one-step style transfer into a step-wise process.
Our method tends to preserve more details and structures of the content image in early steps, and synthesize more style patterns in later steps.
- Score: 55.480819498109746
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Controlling the degree of stylization in Neural Style Transfer (NST) is tricky, since it usually requires hand-engineering of hyper-parameters. In this paper, we propose the first deep Reinforcement Learning (RL) based architecture that splits one-step style transfer into a step-wise process for the NST task. Our RL-based method tends to preserve more details and structures of the content image in early steps and to synthesize more style patterns in later steps, which makes the degree of stylization easy for users to control. Additionally, because our RL-based model performs the stylization progressively, it is lightweight and has lower computational complexity than existing one-step Deep Learning (DL) based models. Experimental results demonstrate the effectiveness and robustness of our method.
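As a rough illustration of the step-wise idea (a minimal sketch, not the authors' RL-trained architecture), the snippet below applies a small residual stylization network repeatedly, so the number of steps acts as the user-facing knob for stylization strength; the network, step size, and step count are all assumptions for illustration.

```python
# Minimal sketch of step-wise stylization: a lightweight network is applied
# repeatedly, so the number of steps controls how strongly the style appears.
# The network below is a placeholder, not the paper's RL-trained model.
import torch
import torch.nn as nn

class TinyStylizer(nn.Module):
    """One incremental stylization step (hypothetical stand-in)."""
    def __init__(self, channels: int = 3, width: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, channels, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual update keeps early steps close to the content image.
        return x + 0.1 * self.body(x)

def stylize(content: torch.Tensor, num_steps: int, step_net: TinyStylizer) -> torch.Tensor:
    """More steps -> stronger stylization; fewer steps -> more content preserved."""
    x = content
    for _ in range(num_steps):
        x = step_net(x)  # in the paper, an RL agent drives the step-wise process
    return x.clamp(0.0, 1.0)

# Usage: a 256x256 RGB image stylized lightly (3 steps) vs. heavily (10 steps).
img = torch.rand(1, 3, 256, 256)
net = TinyStylizer()
light, heavy = stylize(img, 3, net), stylize(img, 10, net)
```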
Related papers
- LESA: Learnable LLM Layer Scaling-Up [57.0510934286449]
Training Large Language Models (LLMs) from scratch requires immense computational resources, making it prohibitively expensive.
Model scaling-up offers a promising solution by leveraging the parameters of smaller models to create larger ones.
We propose LESA, a novel learnable method for depth scaling-up.
arXiv Detail & Related papers (2025-02-19T14:58:48Z) - StyleRWKV: High-Quality and High-Efficiency Style Transfer with RWKV-like Architecture [29.178246094092202]
Style transfer aims to generate a new image preserving the content but with the artistic representation of the style source.
Most existing methods are based on Transformers or diffusion models; however, they suffer from quadratic computational complexity and high inference time.
We present StyleRWKV, a novel framework that achieves high-quality style transfer with limited memory usage and linear time complexity.
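For intuition about why an RWKV-like design avoids quadratic attention cost, here is a toy linear-time token-mixing recurrence (a generic decayed weighted average, not StyleRWKV's actual layer); the decay value and shapes are assumptions.

```python
# Toy linear-time token mixing: each position aggregates past values with an
# exponential decay, so cost is O(T) per channel instead of O(T^2) attention.
# This only illustrates the complexity argument, not StyleRWKV itself.
import numpy as np

def decayed_mix(keys: np.ndarray, values: np.ndarray, decay: float = 0.9) -> np.ndarray:
    """keys, values: (T, D). Returns a (T, D) mix computed in a single pass."""
    T, D = values.shape
    num = np.zeros(D)            # running weighted sum of values
    den = np.zeros(D)            # running sum of weights
    out = np.empty((T, D))
    for t in range(T):
        w = np.exp(keys[t])      # positive weight per channel
        num = decay * num + w * values[t]
        den = decay * den + w
        out[t] = num / (den + 1e-8)
    return out

# Usage: 1024 "tokens" (e.g., flattened image patches) mixed in linear time.
k, v = np.random.randn(1024, 64), np.random.randn(1024, 64)
mixed = decayed_mix(k, v)
```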
arXiv Detail & Related papers (2024-12-27T09:01:15Z) - STAR: Synthesis of Tailored Architectures [61.080157488857516]
We propose a new approach for the synthesis of tailored architectures (STAR).
Our approach combines a novel search space based on the theory of linear input-varying systems with a hierarchical numerical encoding into architecture genomes. STAR genomes are automatically refined and recombined with gradient-free, evolutionary algorithms to optimize for multiple model quality and efficiency metrics.
Using STAR, we optimize large populations of new architectures, leveraging diverse computational units and interconnection patterns, improving over highly-optimized Transformers and striped hybrid models on the frontier of quality, parameter size, and inference cache for autoregressive language modeling.
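The search loop itself is conceptually simple; the sketch below shows a generic gradient-free evolutionary refinement of integer "genomes" under a multi-objective score. The genome encoding, mutation rule, and scoring function are placeholders, not STAR's actual search space.

```python
# Generic evolutionary loop over integer-encoded architecture "genomes":
# mutate/recombine, score on quality and efficiency proxies, keep the best.
# Encoding, mutation, and scoring here are illustrative placeholders.
import random

def random_genome(length: int = 16, vocab: int = 8) -> list:
    return [random.randrange(vocab) for _ in range(length)]

def mutate(g: list, vocab: int = 8, rate: float = 0.1) -> list:
    return [random.randrange(vocab) if random.random() < rate else x for x in g]

def crossover(a: list, b: list) -> list:
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def score(g: list) -> float:
    # Placeholder multi-objective score: reward unit diversity as a quality
    # proxy and penalize a "parameter cost" proxied by the genome sum.
    return len(set(g)) - 0.05 * sum(g)

def evolve(pop_size: int = 32, generations: int = 50) -> list:
    pop = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 4]
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=score)

best = evolve()
```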
arXiv Detail & Related papers (2024-11-26T18:42:42Z) - Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning [62.984693936073974]
Value-based reinforcement learning can learn effective policies for a wide range of multi-turn problems.
Current value-based RL methods have proven particularly challenging to scale to the setting of large language models.
We propose a novel offline RL algorithm that addresses these drawbacks, casting Q-learning as a modified supervised fine-tuning problem.
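As a rough picture of "Q-learning cast as a supervised problem" (a generic offline sketch, not Q-SFT's actual objective), one can build Bellman targets from logged transitions and fit the model's per-action outputs to them with an ordinary supervised loss; the network, shapes, and loss below are assumptions for illustration.

```python
# Generic offline Q-learning written as supervised training: compute Bellman
# targets from logged (s, a, r, s') transitions, then regress the per-action
# head toward them. Illustrative only; Q-SFT's actual formulation differs.
import torch
import torch.nn as nn

class QHead(nn.Module):
    """Tiny stand-in for a model's per-action value head."""
    def __init__(self, state_dim: int = 32, num_actions: int = 8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, num_actions))
    def forward(self, s):
        return self.net(s)

def offline_q_step(model, target_model, batch, optimizer, gamma: float = 0.99):
    s, a, r, s_next, done = batch
    with torch.no_grad():
        target = r + gamma * (1 - done) * target_model(s_next).max(dim=1).values
    q = model(s).gather(1, a.unsqueeze(1)).squeeze(1)
    loss = nn.functional.mse_loss(q, target)  # supervised-style fit to targets
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

# Usage with a random batch of 64 logged transitions.
model, target_model = QHead(), QHead()
target_model.load_state_dict(model.state_dict())
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
batch = (torch.randn(64, 32), torch.randint(0, 8, (64,)),
         torch.randn(64), torch.randn(64, 32), torch.zeros(64))
offline_q_step(model, target_model, batch, opt)
```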
arXiv Detail & Related papers (2024-11-07T21:36:52Z) - Advancing Neural Network Performance through Emergence-Promoting Initialization Scheme [0.0]
Emergence in machine learning refers to the spontaneous appearance of capabilities that arise from the scale and structure of training data.
We introduce a novel yet straightforward neural network initialization scheme that aims at achieving greater potential for emergence.
We demonstrate substantial improvements in both model accuracy and training speed, with and without batch normalization.
arXiv Detail & Related papers (2024-07-26T18:56:47Z) - HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced
Diffusion Models [84.12784265734238]
The goal of Arbitrary Style Transfer (AST) is to inject the artistic features of a style reference into a given image/video.
We propose HiCAST, which is capable of explicitly customizing the stylization results according to various sources of semantic clues.
A novel learning objective is leveraged for video diffusion model training, which significantly improves cross-frame temporal consistency.
arXiv Detail & Related papers (2024-01-11T12:26:23Z) - WSAM: Visual Explanations from Style Augmentation as Adversarial
Attacker and Their Influence in Image Classification [2.282270386262498]
This paper presents a style augmentation algorithm that uses noise-based sampling, together with improved randomization of a general linear transformation, for style transfer.
All models not only show strong robustness to image stylization but also outperform all previous methods, surpassing state-of-the-art performance on the STL-10 dataset.
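To make the noise-based sampling concrete, here is a minimal sketch that perturbs a style embedding with Gaussian noise passed through an assumed linear map before blending it back in; the embedding size, the matrix, and the mixing weight are illustrative, not the paper's implementation.

```python
# Sketch of noise-based style augmentation: draw Gaussian noise, map it through
# a linear transform, and blend it with the original style embedding.
# Shapes, the matrix A, and alpha are assumptions for illustration.
import numpy as np

def augment_style(style_embedding, A, alpha=0.5, rng=None):
    """style_embedding: (D,). A: (D, D) linear transform applied to the noise."""
    rng = rng or np.random.default_rng()
    noise = rng.standard_normal(style_embedding.shape[0])
    random_style = A @ noise                 # randomized style direction
    return (1.0 - alpha) * style_embedding + alpha * random_style

# Usage: a 100-d embedding, identity transform, moderate augmentation strength.
emb = np.random.default_rng(0).standard_normal(100)
augmented = augment_style(emb, np.eye(100), alpha=0.3)
```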
arXiv Detail & Related papers (2023-08-29T02:50:36Z) - Deep Active Learning with Structured Neural Depth Search [18.180995603975422]
Active-iNAS trains several models and selects the model with the best generalization performance for querying the subsequent samples after each active learning cycle.
We propose a novel active learning strategy based on a method called structured variational inference (SVI), or structured neural depth search (SNDS).
At the same time, we theoretically demonstrate that the current VI-based methods based on the mean-field assumption could lead to poor performance.
arXiv Detail & Related papers (2023-06-05T12:00:12Z) - A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive
Learning [84.8813842101747]
Unified Contrastive Arbitrary Style Transfer (UCAST) is a novel style representation learning and transfer framework.
We present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature.
Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
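To make the "input-dependent temperature" concrete, here is a generic InfoNCE-style contrastive loss in which the temperature is predicted from the input features rather than fixed; the temperature head, its bounds, and the feature shapes are assumptions, not UCAST's implementation.

```python
# Generic contrastive (InfoNCE-style) loss with an input-dependent temperature:
# a small head predicts a per-sample temperature from the features instead of
# using a fixed scalar. Details are illustrative, not UCAST's actual scheme.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TempHead(nn.Module):
    """Predicts a bounded positive temperature from a feature vector."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.fc = nn.Linear(dim, 1)
    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # Sigmoid keeps the temperature in a sensible range, e.g. (0.05, 0.55).
        return 0.05 + 0.5 * torch.sigmoid(self.fc(feats)).squeeze(-1)

def contrastive_loss(anchor: torch.Tensor, positive: torch.Tensor,
                     temp_head: TempHead) -> torch.Tensor:
    """anchor, positive: (B, D) style features; other batch items act as negatives."""
    a = F.normalize(anchor, dim=1)
    p = F.normalize(positive, dim=1)
    logits = a @ p.t()                         # (B, B) cosine similarities
    tau = temp_head(anchor).unsqueeze(1)       # (B, 1) per-sample temperature
    labels = torch.arange(a.size(0))           # positives sit on the diagonal
    return F.cross_entropy(logits / tau, labels)

# Usage on random features.
head = TempHead()
loss = contrastive_loss(torch.randn(16, 128), torch.randn(16, 128), head)
```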
arXiv Detail & Related papers (2023-03-09T04:35:00Z) - Deep Convolutional Transform Learning -- Extended version [31.034188573071898]
This work introduces a new unsupervised representation learning technique called Deep Convolutional Transform Learning (DCTL).
By stacking convolutional transforms, our approach is able to learn a set of independent kernels at different layers.
The features extracted in an unsupervised manner can then be used to perform machine learning tasks, such as classification and clustering.
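To give a flavor of transform learning in convolutional form (a simplified sketch, not the paper's exact objective), the loss below fits learned filters so their responses match sparse codes, with a penalty that discourages redundant filters; the weights and penalty form are assumptions.

```python
# Simplified convolutional transform learning objective: learn filters whose
# responses match nonnegative sparse codes, plus a penalty that discourages
# near-duplicate filters. Weights and the penalty form are illustrative only.
import torch
import torch.nn.functional as F

def dctl_like_loss(x, filters, codes, sparsity=0.01, diversity=0.1):
    """x: (B, C, H, W); filters: (K, C, k, k); codes: (B, K, H, W) nonnegative."""
    responses = F.conv2d(x, filters, padding=filters.shape[-1] // 2)
    fit = F.mse_loss(responses, codes)                     # transform consistency
    sparse = sparsity * codes.abs().mean()                 # sparse feature maps
    flat = F.normalize(filters.flatten(1), dim=1)
    gram = flat @ flat.t()                                 # filter similarity matrix
    off_diag = gram - torch.eye(gram.size(0))
    diverse = diversity * (off_diag ** 2).mean()           # push filters apart
    return fit + sparse + diverse

# Usage: alternately update filters (gradient step) and codes (e.g. thresholding).
x = torch.rand(4, 3, 32, 32)
filters = torch.randn(16, 3, 3, 3, requires_grad=True)
codes = F.relu(F.conv2d(x, filters.detach(), padding=1))   # crude code initialization
loss = dctl_like_loss(x, filters, codes)
loss.backward()
```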
arXiv Detail & Related papers (2020-10-02T14:03:19Z)