A Contextual Latent Space Model: Subsequence Modulation in Melodic
Sequence
- URL: http://arxiv.org/abs/2111.11703v1
- Date: Tue, 23 Nov 2021 07:51:39 GMT
- Title: A Contextual Latent Space Model: Subsequence Modulation in Melodic
Sequence
- Authors: Taketo Akama
- Abstract summary: Some generative models for sequences such as music and text allow us to edit only subsequences, given surrounding context sequences.
We propose a contextual latent space model (CLSM) so that users can explore subsequence generation with a sense of direction in the generation space.
A context-informed prior and decoder constitute the generative model of CLSM, and a context position-informed encoder is the inference model.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Some generative models for sequences such as music and text allow us to edit
only subsequences, given surrounding context sequences, which plays an
important part in steering generation interactively. However, editing
subsequences mainly involves randomly resampling subsequences from a possible
generation space. We propose a contextual latent space model (CLSM) in order
for users to be able to explore subsequence generation with a sense of
direction in the generation space, e.g., interpolation, as well as exploring
variations -- semantically similar possible subsequences. A context-informed
prior and decoder constitute the generative model of CLSM, and a context
position-informed encoder is the inference model. In experiments, we use a
monophonic symbolic music dataset, demonstrating that our contextual latent
space is smoother in interpolation than baselines, and the quality of generated
samples is superior to baseline models. The generation examples are available
online.
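As a rough sketch of what "exploring with a sense of direction" means here, the snippet below linearly interpolates between two latent codes; the toy vectors and the `interpolate` helper are illustrative assumptions, not the paper's implementation — in CLSM the codes would come from the context position-informed encoder and each point on the path would be decoded by the context-informed decoder.

```python
import numpy as np

# Hypothetical latent codes for two candidate subsequences.
z_a = np.array([0.0, 1.0, -0.5])
z_b = np.array([1.0, -1.0, 0.5])

def interpolate(z_start, z_end, steps):
    """Return a linear path of `steps` points through the latent space."""
    alphas = np.linspace(0.0, 1.0, steps)
    return [(1 - a) * z_start + a * z_end for a in alphas]

# Endpoints of the path coincide with the two original codes; the
# intermediate points are candidates for decoding into subsequences.
path = interpolate(z_a, z_b, 5)
```

A smoother latent space, as claimed in the abstract, means decoding adjacent points on such a path yields gradually changing subsequences rather than abrupt jumps.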
Related papers
- SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking [60.109453252858806]
A maximum-likelihood (MLE) objective does not match a downstream use-case of autoregressively generating high-quality sequences.
We formulate sequence generation as an imitation learning (IL) problem.
This allows us to minimize a variety of divergences between the distribution of sequences generated by an autoregressive model and sequences from a dataset.
Our resulting method, SequenceMatch, can be implemented without adversarial training or architectural changes.
arXiv Detail & Related papers (2023-06-08T17:59:58Z)
- SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers [50.90457644954857]
In this work, we apply diffusion models to approach sequence-to-sequence text generation.
We propose SeqDiffuSeq, a text diffusion model for sequence-to-sequence generation.
Experiment results illustrate the good performance on sequence-to-sequence generation in terms of text quality and inference time.
arXiv Detail & Related papers (2022-12-20T15:16:24Z)
- DiffusER: Discrete Diffusion via Edit-based Reconstruction [88.62707047517914]
DiffusER is an edit-based generative model for text based on denoising diffusion models.
It can rival autoregressive models on several tasks spanning machine translation, summarization, and style transfer.
It can also perform other varieties of generation that standard autoregressive models are not well-suited for.
arXiv Detail & Related papers (2022-10-30T16:55:23Z)
- Calibrating Sequence likelihood Improves Conditional Language Generation [39.35161650538767]
Conditional language models are predominantly trained with maximum likelihood estimation (MLE).
While MLE trained models assign high probability to plausible sequences given the context, the model probabilities often do not accurately rank-order generated sequences by quality.
We introduce sequence likelihood calibration (SLiC), in which the likelihood of model-generated sequences is calibrated to better align with reference sequences in the model's latent space.
arXiv Detail & Related papers (2022-09-30T19:16:16Z)
- G2P-DDM: Generating Sign Pose Sequence from Gloss Sequence with Discrete Diffusion Model [8.047896755805981]
The Sign Language Production (SLP) task aims to automatically translate spoken languages into sign sequences.
We present a novel solution by converting the continuous pose space generation problem into a discrete sequence generation problem.
Our results show that our model outperforms state-of-the-art G2P models on the public SLP evaluation benchmark.
arXiv Detail & Related papers (2022-08-19T03:49:13Z)
- Structured Reordering for Modeling Latent Alignments in Sequence Transduction [86.94309120789396]
We present an efficient dynamic programming algorithm performing exact marginal inference of separable permutations.
The resulting seq2seq model exhibits better systematic generalization than standard models on synthetic problems and NLP tasks.
arXiv Detail & Related papers (2021-06-06T21:53:54Z)
- Conditional Hybrid GAN for Sequence Generation [56.67961004064029]
We propose a novel conditional hybrid GAN (C-Hybrid-GAN) for context-conditioned discrete-valued sequence generation.
We exploit the Gumbel-Softmax technique to approximate the distribution of discrete-valued sequences.
We demonstrate that the proposed C-Hybrid-GAN outperforms the existing methods in context-conditioned discrete-valued sequence generation.
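The Gumbel-Softmax technique mentioned above can be sketched in a few lines: it produces a differentiable, temperature-controlled approximation to sampling from a categorical distribution over tokens. The numpy implementation and the example logits below are illustrative assumptions, not the paper's code.

```python
import numpy as np

def gumbel_softmax(logits, temperature, rng):
    """Relaxed categorical sample: softmax over Gumbel-perturbed logits.

    As temperature -> 0 the output approaches a one-hot token choice;
    larger temperatures give smoother, more uniform vectors.
    """
    u = rng.uniform(size=logits.shape)
    gumbel = -np.log(-np.log(u + 1e-20) + 1e-20)  # Gumbel(0, 1) noise
    y = (logits + gumbel) / temperature
    y = y - y.max()                               # numerical stability
    e = np.exp(y)
    return e / e.sum()

rng = np.random.default_rng(0)
# A point on the probability simplex over a 3-token vocabulary.
probs = gumbel_softmax(np.array([2.0, 0.5, 0.1]), temperature=0.5, rng=rng)
```

Because the output is a continuous vector rather than a hard token index, gradients can flow through the sampling step, which is what makes the relaxation useful for GAN training on discrete sequences.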
arXiv Detail & Related papers (2020-09-18T03:52:55Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When paired with a strong autoregressive decoder, VAEs tend to ignore their latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
- Vector Quantized Contrastive Predictive Coding for Template-based Music Generation [0.0]
We propose a flexible method for generating variations of discrete sequences in which tokens can be grouped into basic units.
We show how these compressed representations can be used to generate variations of a template sequence by using an appropriate attention pattern in the Transformer architecture.
arXiv Detail & Related papers (2020-04-21T15:58:17Z)
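The vector-quantization step underlying compressed representations like those above amounts to a nearest-neighbour codebook lookup. The toy codebook and `quantize` helper below are assumptions for illustration, not the paper's learned model, where the codebook is trained jointly with the encoder.

```python
import numpy as np

# Toy codebook of 4 codes in a 2-D latent space.
codebook = np.array([[0.0, 0.0],
                     [1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])

def quantize(z):
    """Snap a continuous latent vector to its nearest codebook entry (L2)."""
    dists = np.linalg.norm(codebook - z, axis=1)
    idx = int(np.argmin(dists))
    return idx, codebook[idx]

# A latent near [1, 0] quantizes to codebook entry 1.
idx, code = quantize(np.array([0.9, 0.1]))
```

Replacing each continuous latent with a discrete code index is what turns a template sequence into a compact symbolic representation that downstream attention patterns can operate on.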
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.