Paint Transformer: Feed Forward Neural Painting with Stroke Prediction
- URL: http://arxiv.org/abs/2108.03798v2
- Date: Wed, 11 Aug 2021 13:09:55 GMT
- Title: Paint Transformer: Feed Forward Neural Painting with Stroke Prediction
- Authors: Songhua Liu, Tianwei Lin, Dongliang He, Fu Li, Ruifeng Deng, Xin Li, Errui Ding, Hao Wang
- Abstract summary: We propose a novel Transformer-based framework, dubbed Paint Transformer, to predict the parameters of a stroke set with a feed forward network.
This way, our model can generate a set of strokes in parallel and obtain the final painting of size 512 × 512 in near real time.
Experiments demonstrate that our method achieves better painting performance than previous ones with cheaper training and inference costs.
- Score: 36.457204758975074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural painting refers to the procedure of producing a series of strokes for
a given image and non-photo-realistically recreating it using neural networks.
While reinforcement learning (RL) based agents can generate a stroke sequence
step by step for this task, it is not easy to train a stable RL agent. On the
other hand, stroke optimization methods search for a set of stroke parameters
iteratively in a large search space; such low efficiency significantly limits
their prevalence and practicality. Different from previous methods, in this
paper, we formulate the task as a set prediction problem and propose a novel
Transformer-based framework, dubbed Paint Transformer, to predict the
parameters of a stroke set with a feed forward network. This way, our model can
generate a set of strokes in parallel and obtain the final painting of size
512 × 512 in near real time. More importantly, since there is no dataset available
for training the Paint Transformer, we devise a self-training pipeline such
that it can be trained without any off-the-shelf dataset while still achieving
excellent generalization capability. Experiments demonstrate that our method
achieves better painting performance than previous ones with cheaper training
and inference costs. Code and models are available.
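The abstract states that the model is trained without any off-the-shelf dataset but does not spell out the pipeline. One plausible reading, sketched below purely for illustration, is that training pairs are synthesized from randomly sampled strokes: some strokes form the current canvas, and further strokes painted on top form the target, so the ground-truth stroke set is known by construction. Everything here is an assumption rather than the authors' implementation: the rectangular grayscale stroke model, the function names `random_stroke`, `render`, and `make_training_sample`, and all parameter ranges are hypothetical.

```python
import random


def random_stroke(rng):
    """Sample a hypothetical axis-aligned rectangular stroke.

    Coordinates and sizes are fractions of the canvas side; color is a gray level.
    """
    return {
        "cx": rng.random(),
        "cy": rng.random(),
        "w": rng.uniform(0.1, 0.5),
        "h": rng.uniform(0.1, 0.5),
        "color": rng.random(),
    }


def render(canvas, stroke):
    """Return a copy of the square canvas with the stroke painted over it."""
    size = len(canvas)
    out = [row[:] for row in canvas]
    x0 = max(0, int((stroke["cx"] - stroke["w"] / 2) * size))
    x1 = min(size, int((stroke["cx"] + stroke["w"] / 2) * size) + 1)
    y0 = max(0, int((stroke["cy"] - stroke["h"] / 2) * size))
    y1 = min(size, int((stroke["cy"] + stroke["h"] / 2) * size) + 1)
    for y in range(y0, y1):
        for x in range(x0, x1):
            out[y][x] = stroke["color"]
    return out


def make_training_sample(size=16, n_background=4, n_foreground=4, seed=0):
    """Synthesize one (canvas, target, strokes) sample with no external data.

    The background strokes build the current canvas; the foreground strokes,
    painted on top, build the target. A stroke predictor would be trained to
    recover the foreground strokes from the (canvas, target) pair.
    """
    rng = random.Random(seed)
    canvas = [[0.0] * size for _ in range(size)]
    for _ in range(n_background):
        canvas = render(canvas, random_stroke(rng))
    foreground = [random_stroke(rng) for _ in range(n_foreground)]
    target = canvas
    for stroke in foreground:
        target = render(target, stroke)
    return canvas, target, foreground
```

Calling `make_training_sample(seed=1)` returns a canvas, a target, and the list of foreground strokes that separates them. Because the stroke set is unordered, a set prediction loss would typically match predicted strokes to these ground-truth strokes before comparing them; that matching step is omitted here.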
Related papers
- AttentionPainter: An Efficient and Adaptive Stroke Predictor for Scene Painting [82.54770866332456]
Stroke-based Rendering (SBR) aims to decompose an input image into a sequence of parameterized strokes, which can be rendered into a painting that resembles the input image.
We propose AttentionPainter, an efficient and adaptive model for single-step neural painting.
arXiv Detail & Related papers (2024-10-21T18:36:45Z)
- MambaPainter: Neural Stroke-Based Rendering in a Single Step [3.18005110016691]
Stroke-based rendering aims to reconstruct an input image into an oil painting style by predicting brush stroke sequences.
We propose MambaPainter, capable of predicting a sequence of over 100 brush strokes in a single inference step, resulting in rapid translation.
arXiv Detail & Related papers (2024-10-16T13:02:45Z)
- WavePaint: Resource-efficient Token-mixer for Self-supervised Inpainting [2.3014300466616078]
This paper diverges from vision transformers by using a computationally-efficient WaveMix-based fully convolutional architecture -- WavePaint.
It uses a 2D-discrete wavelet transform (DWT) for spatial and multi-resolution token-mixing along with convolutional layers.
Our model even outperforms current GAN-based architectures on the CelebA-HQ dataset without using an adversarially trainable discriminator.
arXiv Detail & Related papers (2023-07-01T18:41:34Z)
- Accelerating Multiframe Blind Deconvolution via Deep Learning [0.0]
Ground-based solar image restoration is a computationally expensive procedure.
We propose a new method to accelerate the restoration based on algorithm unrolling.
We show that both methods significantly reduce the restoration time compared to the standard optimization procedure.
arXiv Detail & Related papers (2023-06-21T07:53:00Z)
- FastMIM: Expediting Masked Image Modeling Pre-training for Vision [65.47756720190155]
FastMIM is a framework for pre-training vision backbones with low-resolution input images.
It reconstructs Histograms of Oriented Gradients (HOG) features instead of the original RGB values of the input images.
It can achieve 83.8%/84.1% top-1 accuracy on ImageNet-1K with ViT-B/Swin-B as backbones.
arXiv Detail & Related papers (2022-12-13T14:09:32Z)
- Learning Prior Feature and Attention Enhanced Image Inpainting [63.21231753407192]
This paper incorporates the pre-training based Masked AutoEncoder (MAE) into the inpainting model.
We propose to use attention priors from MAE to make the inpainting model learn more long-distance dependencies between masked and unmasked regions.
arXiv Detail & Related papers (2022-08-03T04:32:53Z)
- Improving Deep Learning Interpretability by Saliency Guided Training [36.782919916001624]
Saliency methods have been widely used to highlight important input features in model predictions.
Most existing methods use backpropagation on a modified gradient function to generate saliency maps.
We introduce a saliency guided training procedure for neural networks to reduce noisy gradients used in predictions.
arXiv Detail & Related papers (2021-11-29T06:05:23Z)
- Restormer: Efficient Transformer for High-Resolution Image Restoration [118.9617735769827]
Convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data.
Transformers have shown significant performance gains on natural language and high-level vision tasks.
Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks.
arXiv Detail & Related papers (2021-11-18T18:59:10Z)
- NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation [139.8037697822064]
We present a non-parametric structured latent variable model for image generation, called NP-DRAW.
It sequentially draws on a latent canvas in a part-by-part fashion and then decodes the image from the canvas.
arXiv Detail & Related papers (2021-06-25T05:17:55Z)