Neural Texture Extraction and Distribution for Controllable Person Image
Synthesis
- URL: http://arxiv.org/abs/2204.06160v1
- Date: Wed, 13 Apr 2022 03:51:07 GMT
- Authors: Yurui Ren, Xiaoqing Fan, Ge Li, Shan Liu, Thomas H. Li
- Score: 46.570170624026595
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We deal with the controllable person image synthesis task which aims to
re-render a human from a reference image with explicit control over body pose
and appearance. Observing that person images are highly structured, we propose
to generate desired images by extracting and distributing semantic entities of
reference images. To achieve this goal, a neural texture extraction and
distribution operation based on double attention is described. This operation
first extracts semantic neural textures from reference feature maps. Then, it
distributes the extracted neural textures according to the spatial
distributions learned from target poses. Our model is trained to predict human
images in arbitrary poses, which encourages it to extract disentangled and
expressive neural textures representing the appearance of different semantic
entities. The disentangled representation further enables explicit appearance
control. Neural textures of different reference images can be fused to control
the appearance of areas of interest. Experimental comparisons show the
superiority of the proposed model. Code is available at
https://github.com/RenYurui/Neural-Texture-Extraction-Distribution.
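The extraction-and-distribution operation described in the abstract amounts to two attention steps: learned queries attend over the reference feature map to pool per-entity neural textures, and target-pose features then attend over those textures to place them spatially. A minimal NumPy sketch follows; the shapes, the learned query matrix, and the single-head dot-product formulation are illustrative assumptions, not the paper's exact architecture:

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def extract_textures(ref_feats, queries):
    """Pool one neural texture per semantic entity from reference features.

    ref_feats: (hw, c) flattened reference feature map
    queries:   (k, c)  learned per-entity queries (hypothetical parameters)
    returns:   (k, c)  extracted neural textures
    """
    attn = softmax(queries @ ref_feats.T, axis=1)  # (k, hw): each entity attends over space
    return attn @ ref_feats

def distribute_textures(textures, pose_feats):
    """Spread the extracted textures according to target-pose features.

    textures:   (k, c)  neural textures from extract_textures
    pose_feats: (hw, c) features derived from the target pose
    returns:    (hw, c) re-assembled feature map in the target pose
    """
    attn = softmax(pose_feats @ textures.T, axis=1)  # (hw, k): per-location mix of entities
    return attn @ textures
```

Because each entity's texture is pooled independently, textures extracted from different reference images could in principle be mixed before distribution, which is the mechanism behind the explicit appearance control the abstract mentions.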
Related papers
- Label-free Neural Semantic Image Synthesis [12.194020204848492]
We introduce the concept of neural semantic image synthesis, which uses neural layouts extracted from pre-trained foundation models as conditioning.
We experimentally show that images synthesized via neural semantic image synthesis achieve similar or superior pixel-level alignment of semantic classes.
We show that images generated by neural layout conditioning can effectively augment real data for training various perception tasks.
arXiv Detail & Related papers (2024-07-01T20:30:23Z)
- Seeing in Words: Learning to Classify through Language Bottlenecks
Humans can explain their predictions using succinct and intuitive descriptions.
We show that a vision model whose feature representations are text can effectively classify ImageNet images.
arXiv Detail & Related papers (2023-06-29T00:24:42Z)
- Novel View Synthesis of Humans using Differentiable Rendering [50.57718384229912]
We present a new approach for synthesizing novel views of people in new poses.
Our synthesis makes use of diffuse Gaussian primitives that represent the underlying skeletal structure of a human.
Rendering these primitives results in a high-dimensional latent image, which is then transformed into an RGB image by a decoder network.
arXiv Detail & Related papers (2023-03-28T10:48:33Z)
- Neural Novel Actor: Learning a Generalized Animatable Neural Representation for Human Actors [98.24047528960406]
We propose a new method for learning a generalized animatable neural representation from a sparse set of multi-view imagery of multiple persons.
The learned representation can be used to synthesize novel view images of an arbitrary person from a sparse set of cameras, and further animate them with the user's pose control.
arXiv Detail & Related papers (2022-08-25T07:36:46Z)
- Neural Photofit: Gaze-based Mental Image Reconstruction [25.67771238116104]
We propose a novel method that leverages human fixations to visually decode the image a person has in mind into a photofit (facial composite).
Our method combines three neural networks: An encoder, a scoring network, and a decoder.
We show that our method significantly outperforms a mean baseline predictor and report on a human study that shows that we can decode photofits that are visually plausible and close to the observer's mental image.
arXiv Detail & Related papers (2021-08-17T09:11:32Z)
- Neural Actor: Neural Free-view Synthesis of Human Actors with Pose Control [80.79820002330457]
We propose a new method for high-quality synthesis of humans from arbitrary viewpoints and under arbitrary controllable poses.
Our method achieves better quality than state-of-the-art methods on playback as well as novel pose synthesis, and can even generalize well to new poses that starkly differ from the training poses.
arXiv Detail & Related papers (2021-06-03T17:40:48Z)
- Neural Re-Rendering of Humans from a Single Image [80.53438609047896]
We propose a new method for neural re-rendering of a human under a novel user-defined pose and viewpoint.
Our algorithm represents body pose and shape as a parametric mesh which can be reconstructed from a single image.
arXiv Detail & Related papers (2021-01-11T18:53:47Z)
- CONFIG: Controllable Neural Face Image Generation [10.443563719622645]
ConfigNet is a neural face model that allows for controlling individual aspects of output images in meaningful ways.
Our novel method uses synthetic data to factorize the latent space into elements that correspond to the inputs of a traditional rendering pipeline.
arXiv Detail & Related papers (2020-05-06T09:19:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.