Paying U-Attention to Textures: Multi-Stage Hourglass Vision Transformer
for Universal Texture Synthesis
- URL: http://arxiv.org/abs/2202.11703v1
- Date: Wed, 23 Feb 2022 18:58:56 GMT
- Authors: Shouchang Guo, Valentin Deschaintre, Douglas Noll, Arthur Roullier
- Abstract summary: We present a novel U-Attention vision Transformer for universal texture synthesis.
We exploit the natural long-range dependencies enabled by the attention mechanism to allow our approach to synthesize diverse textures.
- Score: 3.441021278275805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel U-Attention vision Transformer for universal texture
synthesis. We exploit the natural long-range dependencies enabled by the
attention mechanism to allow our approach to synthesize diverse textures while
preserving their structures in a single inference. We propose a multi-stage
hourglass backbone that attends to the global structure and performs patch
mapping at varying scales in a coarse-to-fine-to-coarse stream. Further
completed by skip connection and convolution designs that propagate and fuse
information at different scales, our U-Attention architecture unifies attention
to microstructures, mesostructures and macrostructures, and progressively
refines synthesis results at successive stages. We show that our method
achieves stronger 2$\times$ synthesis than previous work on both stochastic and
structured textures while generalizing to unseen textures without fine-tuning.
Ablation studies demonstrate the effectiveness of each component of our
architecture.
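The coarse-to-fine-to-coarse hourglass described above can be sketched as self-attention applied over patch tokens at progressively finer, then coarser, scales. The following NumPy sketch is an illustrative assumption about the overall data flow (single-head attention, no learned weights, no skip connections or convolutions), not the authors' implementation:

```python
import numpy as np

def self_attention(tokens):
    """Single-head scaled dot-product self-attention over token rows."""
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)         # (N, N) pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ tokens                         # (N, d) attended tokens

def to_patches(img, p):
    """Split an (H, W) image into flattened non-overlapping p x p patches."""
    h, w = img.shape
    patches = img.reshape(h // p, p, w // p, p).swapaxes(1, 2)
    return patches.reshape(-1, p * p)

def from_patches(patches, h, w, p):
    """Inverse of to_patches."""
    patches = patches.reshape(h // p, w // p, p, p).swapaxes(1, 2)
    return patches.reshape(h, w)

def hourglass_pass(img, patch_sizes=(16, 8, 4, 8, 16)):
    """Coarse-to-fine-to-coarse stream: attend over patch tokens at each scale.
    Large patches give few global tokens (macrostructure); small patches give
    many local tokens (microstructure)."""
    out = img
    for p in patch_sizes:
        tokens = to_patches(out, p)
        out = from_patches(self_attention(tokens), *img.shape, p)
    return out

rng = np.random.default_rng(0)
texture = rng.standard_normal((32, 32))
result = hourglass_pass(texture)
print(result.shape)  # (32, 32)
```

Each stage mixes information across all patches at its scale, so structure captured at the coarse stages can condition the finer-scale refinements, which is the intuition behind the multi-stage design.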
Related papers
- Learning Correlation Structures for Vision Transformers [93.22434535223587]
We introduce a new attention mechanism, dubbed structural self-attention (StructSA).
We generate attention maps by recognizing space-time structures of key-query correlations via convolution.
This effectively leverages rich structural patterns in images and videos such as scene layouts, object motion, and inter-object relations.
arXiv Detail & Related papers (2024-04-05T07:13:28Z) - Generating Non-Stationary Textures using Self-Rectification [70.91414475376698]
This paper addresses the challenge of example-based non-stationary texture synthesis.
We introduce a novel two-step approach wherein users first modify a reference texture using standard image editing tools.
Our proposed method, termed "self-rectification", automatically refines this target into a coherent, seamless texture.
arXiv Detail & Related papers (2024-01-05T15:07:05Z) - Pyramid Texture Filtering [86.15126028139736]
We present a simple but effective technique to smooth out textures while preserving the prominent structures.
Our method is built upon a key observation -- the coarsest level in a Gaussian pyramid often naturally eliminates textures and summarizes the main image structures.
We show that our approach is effective to separate structure from texture of different scales, local contrasts, and forms, without degrading structures or introducing visual artifacts.
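The key observation above can be reproduced with a plain Gaussian pyramid: repeated blur-and-downsample averages fine texture away while coarse structure survives at the coarsest level. A minimal sketch (binomial kernel and test pattern are illustrative assumptions, not the paper's method):

```python
import numpy as np

def gaussian_blur(img):
    """Separable 5-tap binomial (approximate Gaussian) blur with edge padding."""
    k = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    padded = np.pad(img, 2, mode="edge")
    tmp = sum(k[i] * padded[:, i:i + img.shape[1]] for i in range(5))  # horizontal
    return sum(k[i] * tmp[i:i + img.shape[0], :] for i in range(5))    # vertical

def gaussian_pyramid(img, levels):
    """Repeatedly blur and 2x downsample; the last level is the coarsest."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(gaussian_blur(pyr[-1])[::2, ::2])
    return pyr

# Fine texture (checkerboard) superimposed on coarse structure (gradient).
h = w = 64
structure = np.linspace(0.0, 1.0, w)[None, :].repeat(h, axis=0)
texture = 0.2 * ((np.indices((h, w)).sum(axis=0) % 2) - 0.5)
pyr = gaussian_pyramid(structure + texture, levels=5)
coarsest = pyr[-1]  # 4x4: checkerboard averaged away, gradient retained
print(coarsest.shape)
```

At the coarsest level the alternating checkerboard cancels under the blur kernel, while the left-to-right gradient is still monotonic, matching the observation the paper builds on.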
arXiv Detail & Related papers (2023-05-11T02:05:30Z) - A geometrically aware auto-encoder for multi-texture synthesis [1.2891210250935146]
We propose an auto-encoder architecture for multi-texture synthesis.
Images are embedded in a compact and geometrically consistent latent space.
Texture synthesis and related tasks can be performed directly from these latent codes.
arXiv Detail & Related papers (2023-02-03T09:28:39Z) - Towards Universal Texture Synthesis by Combining Texton Broadcasting
with Noise Injection in StyleGAN-2 [11.67779950826776]
We present a new approach for universal texture synthesis that incorporates a multi-scale texton broadcasting module into the StyleGAN-2 framework.
The texton broadcasting module introduces an inductive bias, enabling generation of a broader range of textures, from those with regular structures to completely stochastic ones.
arXiv Detail & Related papers (2022-03-08T17:44:35Z) - Texture Reformer: Towards Fast and Universal Interactive Texture
Transfer [16.41438144343516]
The texture reformer is a neural-based framework for interactive texture transfer with user-specified guidance.
We introduce a novel learning-free view-specific texture reformation (VSTR) operation with a new semantic map guidance strategy.
The experimental results on a variety of application scenarios demonstrate the effectiveness and superiority of our framework.
arXiv Detail & Related papers (2021-12-06T05:20:43Z) - Dynamic Texture Synthesis by Incorporating Long-range Spatial and
Temporal Correlations [27.247382497265214]
We introduce a new loss term, called the Shifted Gram loss, to capture the structural and long-range correlation of the reference texture video.
We also introduce a frame sampling strategy to exploit long-period motion across multiple frames.
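A Gram matrix captures channel-wise feature correlations at the same spatial position; a "shifted" variant correlates a feature map with a spatially displaced copy of itself, adding long-range structure that the plain Gram matrix discards. The sketch below is a generic illustration of that idea in NumPy (the shift set and normalization are assumptions, not the paper's exact formulation):

```python
import numpy as np

def shifted_gram(f, dy, dx):
    """Correlate a (C, H, W) feature map with itself shifted by (dy, dx).
    (0, 0) recovers the standard Gram matrix of channel correlations."""
    c, h, w = f.shape
    a = f[:, :h - dy, :w - dx].reshape(c, -1)
    b = f[:, dy:, dx:].reshape(c, -1)
    return a @ b.T / a.shape[1]

def shifted_gram_loss(f_syn, f_ref, shifts=((0, 0), (0, 8), (8, 0))):
    """Mean squared Gram-matrix difference summed over a set of spatial shifts."""
    return sum(np.mean((shifted_gram(f_syn, dy, dx)
                        - shifted_gram(f_ref, dy, dx)) ** 2)
               for dy, dx in shifts)

rng = np.random.default_rng(0)
ref = rng.standard_normal((8, 32, 32))
print(shifted_gram_loss(ref, ref))  # 0.0 for identical features
```

Minimizing such a loss pushes the synthesized features to match the reference's correlations not only locally but also at the chosen spatial offsets.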
arXiv Detail & Related papers (2021-04-13T05:04:51Z) - Region-adaptive Texture Enhancement for Detailed Person Image Synthesis [86.69934638569815]
RATE-Net is a novel framework for synthesizing person images with sharp texture details.
The proposed framework leverages an additional texture enhancing module to extract appearance information from the source image.
Experiments conducted on DeepFashion benchmark dataset have demonstrated the superiority of our framework compared with existing networks.
arXiv Detail & Related papers (2020-05-26T02:33:21Z) - Co-occurrence Based Texture Synthesis [25.4878061402506]
We propose a fully convolutional generative adversarial network, conditioned locally on co-occurrence statistics, to generate arbitrarily large images.
We show that our solution offers a stable, intuitive and interpretable latent representation for texture synthesis.
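Co-occurrence statistics count how often pairs of quantized pixel values appear at a fixed spatial offset; conditioning a generator on them ties synthesis to local texture structure. A minimal, generic computation of such a matrix (the quantization and offset below are illustrative assumptions, not the paper's exact conditioning signal):

```python
import numpy as np

def cooccurrence(img, levels=4, offset=(0, 1)):
    """Normalized co-occurrence matrix of quantized gray levels for one offset."""
    q = np.clip((img * levels).astype(int), 0, levels - 1)  # quantize [0, 1) values
    dy, dx = offset
    h, w = q.shape
    a = q[:h - dy, :w - dx].ravel()   # reference pixels
    b = q[dy:, dx:].ravel()           # their offset neighbors
    mat = np.zeros((levels, levels))
    np.add.at(mat, (a, b), 1.0)       # count each (value, neighbor-value) pair
    return mat / mat.sum()            # normalize to a joint distribution

rng = np.random.default_rng(0)
img = rng.random((32, 32))
m = cooccurrence(img)
print(m.shape, round(m.sum(), 6))  # (4, 4) 1.0
```

Because the matrix is a joint distribution over value pairs, it can be computed per image crop and fed to the generator as a local conditioning signal.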
arXiv Detail & Related papers (2020-05-17T08:01:44Z) - Towards Analysis-friendly Face Representation with Scalable Feature and
Texture Compression [113.30411004622508]
We show that a universal and collaborative visual information representation can be achieved in a hierarchical way.
Based on the strong generative capability of deep neural networks, the gap between the base feature layer and the enhancement layer is further filled with feature-level texture reconstruction.
To improve the efficiency of the proposed framework, the base layer neural network is trained in a multi-task manner.
arXiv Detail & Related papers (2020-04-21T14:32:49Z) - Hierarchy Composition GAN for High-fidelity Image Synthesis [57.32311953820988]
This paper presents an innovative Hierarchical Composition GAN (HIC-GAN).
HIC-GAN incorporates image synthesis in geometry and appearance domains into an end-to-end trainable network.
Experiments on scene text image synthesis, portrait editing and indoor rendering tasks show that the proposed HIC-GAN achieves superior synthesis performance qualitatively and quantitatively.
arXiv Detail & Related papers (2019-05-12T11:11:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated information and is not responsible for any consequences of its use.