AesFA: An Aesthetic Feature-Aware Arbitrary Neural Style Transfer
- URL: http://arxiv.org/abs/2312.05928v3
- Date: Thu, 22 Feb 2024 18:57:44 GMT
- Title: AesFA: An Aesthetic Feature-Aware Arbitrary Neural Style Transfer
- Authors: Joonwoo Kwon, Sooyoung Kim, Yuewei Lin, Shinjae Yoo, Jiook Cha
- Abstract summary: This work proposes a lightweight but effective model, AesFA -- Aesthetic Feature-Aware NST.
The primary idea is to decompose the image via its frequencies to better disentangle aesthetic styles from the reference image.
To improve the network's ability to extract more distinct representations, this work introduces a new aesthetic feature contrastive loss.
- Score: 6.518925259025401
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural style transfer (NST) has evolved significantly in recent years. Yet,
despite its rapid progress and advancement, existing NST methods either
struggle to transfer aesthetic information from a style effectively or suffer
from high computational costs and inefficiencies in feature disentanglement due
to using pre-trained models. This work proposes a lightweight but effective
model, AesFA -- Aesthetic Feature-Aware NST. The primary idea is to decompose
the image via its frequencies to better disentangle aesthetic styles from the
reference image while training the entire model in an end-to-end manner to
exclude pre-trained models at inference completely. To improve the network's
ability to extract more distinct representations and further enhance the
stylization quality, this work introduces a new aesthetic feature contrastive
loss. Extensive experiments and ablations show the approach not only
outperforms recent NST methods in terms of stylization quality, but also
achieves faster inference. Codes are available at
https://github.com/Sooyyoungg/AesFA.
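The abstract names two ingredients: decomposing the image by frequency to disentangle aesthetic style, and an aesthetic feature contrastive loss. The paper's actual operators are defined in the linked repository; the sketch below is only an illustration that assumes a simple Gaussian low-/high-frequency split and a generic InfoNCE-style contrastive term, with placeholder names such as `split_frequencies` and `contrastive_loss` that are not the paper's API.

```python
# Illustrative sketch only. AesFA decomposes the image by frequency and trains with an
# aesthetic feature contrastive loss; the real operators are in the official repo
# (https://github.com/Sooyyoungg/AesFA). The Gaussian low/high split and the generic
# InfoNCE-style loss below are simplifying assumptions, not the paper's implementation.
import torch
import torch.nn.functional as F


def gaussian_kernel(size: int = 11, sigma: float = 3.0) -> torch.Tensor:
    """2D Gaussian kernel used here as a simple low-pass filter."""
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    kernel = torch.outer(g, g)
    return kernel / kernel.sum()


def split_frequencies(img: torch.Tensor, size: int = 11, sigma: float = 3.0):
    """Split a (B, C, H, W) image into low- and high-frequency components."""
    c = img.size(1)
    k = gaussian_kernel(size, sigma).to(img).view(1, 1, size, size).repeat(c, 1, 1, 1)
    low = F.conv2d(img, k, padding=size // 2, groups=c)   # blurred image = low frequencies
    high = img - low                                       # residual = high frequencies
    return low, high


def contrastive_loss(anchor: torch.Tensor, positive: torch.Tensor,
                     negatives: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """Generic InfoNCE-style term: pull the stylization's features toward its style's
    features (positive) and away from features of other styles (negatives)."""
    a = F.normalize(anchor, dim=-1)                        # (B, D)
    p = F.normalize(positive, dim=-1)                      # (B, D)
    n = F.normalize(negatives, dim=-1)                     # (K, D)
    logits = torch.cat([(a * p).sum(-1, keepdim=True), a @ n.T], dim=-1) / tau
    labels = torch.zeros(a.size(0), dtype=torch.long, device=a.device)
    return F.cross_entropy(logits, labels)


# Tiny usage example with random placeholders standing in for real features.
style = torch.rand(1, 3, 256, 256)
low, high = split_frequencies(style)
feat_out, feat_style = torch.rand(4, 128), torch.rand(4, 128)  # stylization / its style
feat_other = torch.rand(16, 128)                               # other styles in the batch
loss = contrastive_loss(feat_out, feat_style, feat_other)
```

In a full model, the low- and high-frequency branches would feed separate encoding paths and the contrastive term would be added to the usual content and style objectives; those details follow the paper and its code, not this sketch.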
Related papers
- Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think [72.48325960659822]
One main bottleneck in training large-scale diffusion models for generation lies in effectively learning useful internal representations.
We study this by introducing a straightforward regularization called REPresentation Alignment (REPA), which aligns the projections of noisy input hidden states in denoising networks with clean image representations obtained from external, pretrained visual encoders.
The results are striking: our simple strategy yields significant improvements in both training efficiency and generation quality when applied to popular diffusion and flow-based transformers, such as DiTs and SiTs.
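(An illustrative sketch of this alignment idea appears after the related-papers list below.)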
arXiv Detail & Related papers (2024-10-09T14:34:53Z) - Intra-task Mutual Attention based Vision Transformer for Few-Shot Learning [12.5354658533836]
Humans possess a remarkable ability to accurately classify new, unseen images after being exposed to only a few examples.
For artificial neural network models, determining the most relevant features for distinguishing between two images with limited samples presents a challenge.
We propose an intra-task mutual attention method for few-shot learning that involves splitting the support and query samples into patches.
arXiv Detail & Related papers (2024-05-06T02:02:57Z) - Controlling Neural Style Transfer with Deep Reinforcement Learning [55.480819498109746]
We propose the first deep Reinforcement Learning based architecture that splits one-step style transfer into a step-wise process.
Our method tends to preserve more details and structures of the content image in early steps, and synthesize more style patterns in later steps.
arXiv Detail & Related papers (2023-09-30T15:01:02Z) - WSAM: Visual Explanations from Style Augmentation as Adversarial
Attacker and Their Influence in Image Classification [2.282270386262498]
This paper outlines a style augmentation algorithm that uses noise-based sampling with additive perturbations to improve the randomization of a general linear transformation for style transfer.
All models not only show strong robustness to image stylization but also outperform previous methods, surpassing the state-of-the-art performance on the STL-10 dataset.
arXiv Detail & Related papers (2023-08-29T02:50:36Z) - DIFF-NST: Diffusion Interleaving For deFormable Neural Style Transfer [27.39248034592382]
We propose using a new class of models to perform style transfer while enabling deformable style transfer.
We show how leveraging the priors of these models can expose new artistic controls at inference time.
arXiv Detail & Related papers (2023-07-09T12:13:43Z) - Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via
Feature Distillation [42.37533586611174]
Masked image modeling (MIM) learns representations with remarkably good fine-tuning performances.
In this paper, we show that the inferior fine-tuning performance of pre-training approaches can be significantly improved by a simple post-processing.
arXiv Detail & Related papers (2022-05-27T17:59:36Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Learning Rich Nearest Neighbor Representations from Self-supervised
Ensembles [60.97922557957857]
We provide a framework to perform self-supervised model ensembling via a novel method of learning representations directly through gradient descent at inference time.
This technique improves representation quality, as measured by k-nearest neighbors, both on the in-domain dataset and in the transfer setting.
arXiv Detail & Related papers (2021-10-19T22:24:57Z) - Powerpropagation: A sparsity inducing weight reparameterisation [65.85142037667065]
We introduce Powerpropagation, a new weight parameterisation for neural networks that leads to inherently sparse models.
Models trained in this manner exhibit similar performance, but have a distribution with markedly higher density at zero, allowing more parameters to be pruned safely.
Here, we combine Powerpropagation with a traditional weight-pruning technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing superior performance on the ImageNet benchmark.
arXiv Detail & Related papers (2021-10-01T10:03:57Z) - Unleashing the Power of Contrastive Self-Supervised Visual Models via
Contrast-Regularized Fine-Tuning [94.35586521144117]
We investigate whether applying contrastive learning to fine-tuning would bring further benefits.
We propose Contrast-regularized tuning (Core-tuning), a novel approach for fine-tuning contrastive self-supervised visual models.
arXiv Detail & Related papers (2021-02-12T16:31:24Z)
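The REPA entry above describes aligning projections of the denoising network's hidden states (computed from noisy inputs) with features of the clean image from a frozen, pretrained visual encoder. A minimal illustration of that alignment term is sketched below; the MLP projection head, the cosine-similarity objective, and all dimensions are assumptions for illustration, not REPA's exact configuration.

```python
# Illustrative sketch of the representation-alignment idea in the REPA entry above:
# project the denoiser's hidden states (from the noisy input) and pull them toward a
# frozen pretrained encoder's features of the clean image. The MLP head, the cosine
# objective, and all sizes are assumptions for illustration, not REPA's exact setup.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AlignmentHead(nn.Module):
    """Small projection from denoiser hidden states to the target encoder's feature space."""

    def __init__(self, hidden_dim: int, target_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.SiLU(),
            nn.Linear(hidden_dim, target_dim),
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.proj(h)


def alignment_loss(hidden: torch.Tensor, target: torch.Tensor, head: AlignmentHead) -> torch.Tensor:
    """Negative cosine similarity, per token, between projected hidden states and
    frozen encoder features of the clean image; both inputs are (B, N, D*)."""
    z = F.normalize(head(hidden), dim=-1)
    t = F.normalize(target.detach(), dim=-1)  # frozen encoder: no gradient flows back
    return -(z * t).sum(-1).mean()


# Example with placeholder shapes (2 images, 196 tokens each).
head = AlignmentHead(hidden_dim=1152, target_dim=768)
hidden = torch.rand(2, 196, 1152)   # denoiser hidden states for the noisy input
target = torch.rand(2, 196, 768)    # pretrained-encoder features of the clean image
reg = alignment_loss(hidden, target, head)  # added, with a weight, to the denoising loss
```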
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.