Tensor-based Subspace Factorization for StyleGAN
- URL: http://arxiv.org/abs/2111.04554v1
- Date: Mon, 8 Nov 2021 15:11:39 GMT
- Title: Tensor-based Subspace Factorization for StyleGAN
- Authors: René Haas, Stella Graßhof and Sami Sebastian Brandt
- Abstract summary: $\tau$GAN is a tensor-based method for modeling the latent space of generative models.
We validate our approach on StyleGAN trained on FFHQ using BU-3DFE as a structured facial expression database.
- Score: 1.1470070927586016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose $\tau$GAN, a tensor-based method for modeling the
latent space of generative models. The objective is to identify semantic
directions in latent space. To this end, we propose to fit a multilinear tensor
model on a structured facial expression database, which is initially embedded
into latent space. We validate our approach on StyleGAN trained on FFHQ using
BU-3DFE as a structured facial expression database. We show how the parameters
of the multilinear tensor model can be approximated by Alternating Least
Squares. Further, we introduce a stacked style-separated tensor model, defined
as an ensemble of style-specific models to integrate our approach with the
extended latent space of StyleGAN. We show that taking the individual styles of
the extended latent space into account leads to higher model flexibility and
lower reconstruction error. Finally, we perform several experiments comparing
our approach to prior work on both GANs and multilinear models. Concretely, we
analyze the expression subspace and find that the expression trajectories meet
at an apathetic face that is consistent with earlier work. We also show that by
changing the pose of a person, the generated image from our approach is closer
to the ground truth than results from two competing approaches.
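The abstract names its algorithmic ingredients, a multilinear tensor model approximated by Alternating Least Squares, without giving code. Below is a minimal NumPy sketch of one standard reading of that step: orthogonal ALS (HOOI) for a 3-way Tucker-style model. The tensor layout (identity × expression × latent dimension), the chosen ranks, and the random stand-in for BU-3DFE faces embedded into StyleGAN's latent space are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_mult(T, M, mode):
    """Mode-n product T x_n M: multiply M into axis `mode` of T."""
    rest = [s for i, s in enumerate(T.shape) if i != mode]
    out = (M @ unfold(T, mode)).reshape([M.shape[0]] + rest)
    return np.moveaxis(out, 0, mode)

def fit_multilinear(T, ranks, n_iter=25):
    """Orthogonal ALS (HOOI) for a 3-way Tucker-style tensor model."""
    # initialize each factor with a truncated HOSVD of the unfolding
    factors = [np.linalg.svd(unfold(T, n), full_matrices=False)[0][:, :r]
               for n, r in enumerate(ranks)]
    for _ in range(n_iter):
        for n in range(3):
            # project out every mode except n, then refresh factor n
            Y = T
            for k in range(3):
                if k != n:
                    Y = mode_mult(Y, factors[k].T, k)
            U = np.linalg.svd(unfold(Y, n), full_matrices=False)[0]
            factors[n] = U[:, :ranks[n]]
    core = T
    for k in range(3):
        core = mode_mult(core, factors[k].T, k)
    return core, factors

# toy stand-in for BU-3DFE scans embedded into latent space (hypothetical sizes):
# 100 identities x 7 expressions x 512 latent dimensions
rng = np.random.default_rng(0)
Z = rng.standard_normal((100, 7, 512))
core, (U_id, U_expr, U_lat) = fit_multilinear(Z, ranks=(30, 7, 64))
Z_hat = mode_mult(mode_mult(mode_mult(core, U_id, 0), U_expr, 1), U_lat, 2)
print("relative reconstruction error:",
      np.linalg.norm(Z - Z_hat) / np.linalg.norm(Z))
```

A stacked style-separated variant in the spirit of the abstract would fit one such model per style vector of the extended latent space and treat the collection as an ensemble.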
Related papers
- From Semantics to Hierarchy: A Hybrid Euclidean-Tangent-Hyperbolic Space Model for Temporal Knowledge Graph Reasoning [1.1372536310854844]
Temporal knowledge graph (TKG) reasoning predicts future events based on historical data.
Existing Euclidean models excel at capturing semantics but struggle with hierarchy.
We propose a novel hybrid geometric space approach that leverages the strengths of both Euclidean and hyperbolic models.
arXiv Detail & Related papers (2024-08-30T10:33:08Z)
- FUSE-ing Language Models: Zero-Shot Adapter Discovery for Prompt Optimization Across Tokenizers [55.2480439325792]
We propose FUSE, an approach to approximating an adapter layer that maps from one model's textual embedding space to another, even across different tokenizers.
We show the efficacy of our approach via multi-objective optimization over vision-language and causal language models for image captioning and sentiment-based image captioning.
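For intuition only: the simplest adapter between two embedding spaces is a linear map fit by least squares on paired examples. FUSE itself is more involved (it works zero-shot and across tokenizers), so the sketch below is a hedged toy baseline; the shapes and the synthetic paired data are invented.

```python
import numpy as np

# hypothetical paired embeddings of the same texts under two models
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 768))                 # source embedding space
W_true = rng.standard_normal((768, 1024))
Y = X @ W_true + 0.01 * rng.standard_normal((1000, 1024))  # target space

# least-squares linear adapter: argmin_W ||X W - Y||_F
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
print("fit residual:", np.linalg.norm(X @ W - Y) / np.linalg.norm(Y))
```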
arXiv Detail & Related papers (2024-08-09T02:16:37Z)
- HiCAST: Highly Customized Arbitrary Style Transfer with Adapter Enhanced Diffusion Models [84.12784265734238]
The goal of Arbitrary Style Transfer (AST) is to inject the artistic features of a style reference into a given image/video.
We propose HiCAST, which is capable of explicitly customizing the stylization results according to various sources of semantic clues.
A novel learning objective is leveraged for video diffusion model training, which significantly improves cross-frame temporal consistency.
arXiv Detail & Related papers (2024-01-11T12:26:23Z)
- Dynamic Point Fields [30.029872787758705]
We present a dynamic point field model that combines the representational benefits of explicit point-based graphics with implicit deformation networks.
We show the advantages of our dynamic point field framework in terms of its representational power, learning efficiency, and robustness to out-of-distribution novel poses.
arXiv Detail & Related papers (2023-04-05T17:52:37Z)
- VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space provides an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
- On the Transformation of Latent Space in Fine-Tuned NLP Models [21.364053591693175]
We study the evolution of latent space in fine-tuned NLP models.
We discover latent concepts in the representational space using hierarchical clustering.
We compare pre-trained and fine-tuned models across three models and three downstream tasks.
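As a rough illustration of the clustering step rather than the paper's pipeline: given representation vectors from a model, flat "latent concepts" can be read off a hierarchical clustering tree. The random stand-in data and the cluster count below are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# hypothetical stand-in for contextual token representations
rng = np.random.default_rng(0)
reps = rng.standard_normal((500, 768))

# Ward-linkage hierarchical clustering, cut into flat "concept" clusters
tree = linkage(reps, method="ward")
concepts = fcluster(tree, t=20, criterion="maxclust")   # 20 concepts
print("cluster sizes:", np.bincount(concepts)[1:])
```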
arXiv Detail & Related papers (2022-10-23T10:59:19Z)
- Counting Phases and Faces Using Bayesian Thermodynamic Integration [77.34726150561087]
We introduce a new approach to reconstructing the thermodynamic functions and phase boundaries in two-parameter statistical mechanics systems.
We use the proposed approach to accurately reconstruct the partition functions and phase diagrams of the Ising model and the exactly solvable non-equilibrium TASEP.
arXiv Detail & Related papers (2022-05-18T17:11:23Z)
- Latent Space Model for Higher-order Networks and Generalized Tensor Decomposition [18.07071669486882]
We introduce a unified framework, formulated as general latent space models, to study complex higher-order network interactions.
We formulate the relationship between the latent positions and the observed data via a generalized multilinear kernel as the link function.
We demonstrate the effectiveness of our method on synthetic data.
arXiv Detail & Related papers (2021-06-30T13:11:17Z)
- Bridge the Gap Between Model-based and Model-free Human Reconstruction [10.818838437018682]
We present an end-to-end neural network that simultaneously predicts the pixel-aligned implicit surface and the explicit mesh model built by graph convolutional neural network.
Experiments on the DeepHuman dataset show that our approach is effective.
arXiv Detail & Related papers (2021-06-11T11:13:42Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
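That decomposition admits a compact sketch: if A is the weight matrix that first projects the latent code, candidate semantic directions are the leading eigenvectors of A^T A, i.e. the unit vectors that A stretches the most. The random matrix below is a stand-in for a real pre-trained weight, so this illustrates the recipe, not the released implementation.

```python
import numpy as np

def closed_form_directions(A, k=5):
    """Top-k unit-norm latent directions maximizing ||A n||:
    the leading eigenvectors of A^T A."""
    vals, vecs = np.linalg.eigh(A.T @ A)   # eigenvalues in ascending order
    return vecs[:, ::-1][:, :k].T          # rows = directions, strongest first

rng = np.random.default_rng(0)
A = rng.standard_normal((1024, 512))       # stand-in for a first-layer weight
dirs = closed_form_directions(A, k=5)

# editing: move a latent code along the strongest direction
z = rng.standard_normal(512)
z_edited = z + 3.0 * dirs[0]
```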
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
- CoSE: Compositional Stroke Embeddings [52.529172734044664]
We present a generative model for complex free-form structures such as stroke-based drawing tasks.
Our approach is suitable for interactive use cases such as auto-completing diagrams.
arXiv Detail & Related papers (2020-06-17T15:22:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.