Is Disentanglement enough? On Latent Representations for Controllable
Music Generation
- URL: http://arxiv.org/abs/2108.01450v1
- Date: Sun, 1 Aug 2021 18:37:43 GMT
- Authors: Ashis Pati, Alexander Lerch
- Abstract summary: In the absence of a strong generative decoder, disentanglement does not necessarily imply controllability.
The structure of the latent space with respect to the VAE-decoder plays an important role in boosting the ability of a generative model to manipulate different attributes.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Improving controllability or the ability to manipulate one or more attributes
of the generated data has become a topic of interest in the context of deep
generative models of music. Recent attempts in this direction have relied on
learning disentangled representations from data such that the underlying
factors of variation are well separated. In this paper, we focus on the
relationship between disentanglement and controllability by conducting a
systematic study using different supervised disentanglement learning algorithms
based on the Variational Auto-Encoder (VAE) architecture. Our experiments show
that a high degree of disentanglement can be achieved by using different forms
of supervision to train a strong discriminative encoder. However, in the
absence of a strong generative decoder, disentanglement does not necessarily
imply controllability. The structure of the latent space with respect to the
VAE-decoder plays an important role in boosting the ability of a generative
model to manipulate different attributes. To this end, we also propose methods
and metrics to help evaluate the quality of a latent space with respect to the
afforded degree of controllability.
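The kind of latent-space controllability the abstract discusses can be probed with a simple traversal check: vary one latent dimension, decode, and measure how monotonically a target attribute of the output responds. The sketch below is a hypothetical illustration, not the paper's proposed metric; the function names, the toy linear decoder, and the use of rank correlation as the score are all assumptions.

```python
import numpy as np

def traversal_monotonicity(decode, attribute, dim, z_base, values):
    """Traverse one latent dimension and score how monotonically the
    decoded attribute responds (a hypothetical controllability probe)."""
    attrs = []
    for v in values:
        z = z_base.copy()
        z[dim] = v
        attrs.append(attribute(decode(z)))
    # Spearman-style rank correlation between traversal values and attributes
    ranks_v = np.argsort(np.argsort(values))
    ranks_a = np.argsort(np.argsort(attrs))
    rv = ranks_v - ranks_v.mean()
    ra = ranks_a - ranks_a.mean()
    return float(rv @ ra / np.sqrt((rv @ rv) * (ra @ ra)))

# Toy linear "decoder" in which dimension 0 controls the attribute directly,
# mimicking a perfectly disentangled and controllable latent dimension.
W = np.eye(4)
decode = lambda z: W @ z
attribute = lambda x: x[0]   # e.g. note density of the decoded music
z0 = np.zeros(4)
score = traversal_monotonicity(decode, attribute, dim=0,
                               z_base=z0, values=np.linspace(-2, 2, 9))
# A score near 1.0 indicates a monotone, controllable dimension.
```

A weak decoder that ignores a latent dimension would yield a near-zero score even if an encoder-side disentanglement metric is high, which is the gap between disentanglement and controllability the abstract points to.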
Related papers
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
- Learning Action-based Representations Using Invariance [18.1941237781348]
We introduce action-bisimulation encoding, which learns a multi-step controllability metric that discounts distant state features that are irrelevant for control.
We demonstrate that action-bisimulation pretraining on reward-free, uniformly random data improves sample efficiency in several environments.
arXiv Detail & Related papers (2024-03-25T02:17:54Z)
- Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z)
- Transformer-based Conditional Variational Autoencoder for Controllable Story Generation [39.577220559911055]
We investigate large-scale latent variable models (LVMs) for neural story generation with objectives in two threads: generation effectiveness and controllability.
We advocate to revive latent variable modeling, essentially the power of representation learning, in the era of Transformers.
Specifically, we integrate latent representation vectors with a Transformer-based pre-trained architecture to build a conditional variational autoencoder (CVAE).
arXiv Detail & Related papers (2021-01-04T08:31:11Z)
- Unsupervised Controllable Generation with Self-Training [90.04287577605723]
Controllable generation with GANs remains a challenging research problem.
We propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.
Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder.
arXiv Detail & Related papers (2020-07-17T21:50:35Z)
- Learning perturbation sets for robust machine learning [97.6757418136662]
We use a conditional generator that defines the perturbation set over a constrained region of the latent space.
We measure the quality of our learned perturbation sets both quantitatively and qualitatively.
We leverage our learned perturbation sets to train models which are empirically and certifiably robust to adversarial image corruptions and adversarial lighting variations.
arXiv Detail & Related papers (2020-07-16T16:39:54Z)
- Guided Variational Autoencoder for Disentanglement Learning [79.02010588207416]
We propose an algorithm, guided variational autoencoder (Guided-VAE), that is able to learn a controllable generative model by performing latent representation disentanglement learning.
We design an unsupervised strategy and a supervised strategy in Guided-VAE and observe enhanced modeling and controlling capability over the vanilla VAE.
arXiv Detail & Related papers (2020-04-02T20:49:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.