Semi-supervised Neural Chord Estimation Based on a Variational
Autoencoder with Latent Chord Labels and Features
- URL: http://arxiv.org/abs/2005.07091v2
- Date: Tue, 8 Sep 2020 04:31:08 GMT
- Title: Semi-supervised Neural Chord Estimation Based on a Variational
Autoencoder with Latent Chord Labels and Features
- Authors: Yiming Wu, Tristan Carsault, Eita Nakamura, Kazuyoshi Yoshii
- Abstract summary: This paper describes a statistically principled semi-supervised method of automatic chord estimation.
It can make effective use of music signals regardless of the availability of chord annotations.
- Score: 18.498244371257304
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes a statistically principled semi-supervised method of
automatic chord estimation (ACE) that can make effective use of music signals
regardless of the availability of chord annotations. The typical approach to
ACE is to train a deep classification model (neural chord estimator) in a
supervised manner by using only annotated music signals. In this discriminative
approach, prior knowledge about chord label sequences (model output) has
scarcely been taken into account. In contrast, we propose a unified generative
and discriminative approach in the framework of amortized variational
inference. More specifically, we formulate a deep generative model that
represents the generative process of chroma vectors (observed variables) from
discrete labels and continuous features (latent variables), which are assumed
to follow a Markov model favoring self-transitions and a standard Gaussian
distribution, respectively. Given chroma vectors as observed data, the
posterior distributions of the latent labels and features are computed
approximately by using deep classification and recognition models,
respectively. These three models form a variational autoencoder and can be
trained jointly in a semi-supervised manner. The experimental results show that
the regularization of the classification model based on the Markov prior of
chord labels and the generative model of chroma vectors improved the
performance of ACE even under the supervised condition. Semi-supervised
learning using additional non-annotated data can further improve the
performance.
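As a concrete illustration of the model just described, here is a minimal PyTorch sketch of the three networks: a classification model q(s|x) over chord labels, a recognition model q(z|x) over continuous features, and a generative model p(x|s,z) over chroma vectors, tied together by a per-frame ELBO. The layer sizes, the Gumbel-softmax relaxation of the discrete labels, and the self-transition probability are illustrative assumptions, not the paper's reported configuration.

```python
# Minimal sketch of the described VAE; dimensions and relaxations are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_CHROMA, N_CHORDS, N_LATENT = 12, 25, 16  # assumed sizes

class ChordVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.classifier = nn.Sequential(                      # q(s|x), chord labels
            nn.Linear(N_CHROMA, 64), nn.ReLU(), nn.Linear(64, N_CHORDS))
        self.recognizer = nn.Sequential(                      # q(z|x), features
            nn.Linear(N_CHROMA, 64), nn.ReLU(), nn.Linear(64, 2 * N_LATENT))
        self.decoder = nn.Sequential(                         # p(x|s, z), chroma
            nn.Linear(N_CHORDS + N_LATENT, 64), nn.ReLU(), nn.Linear(64, N_CHROMA))
        self.self_trans = 0.9  # Markov prior favoring self-transitions (assumed)

    def elbo(self, x):  # x: (T, N_CHROMA) chroma sequence
        logits = self.classifier(x)
        s = F.gumbel_softmax(logits, tau=0.5)                 # relaxed labels
        mu, logvar = self.recognizer(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        # Gaussian log-likelihood of the chroma vectors, up to constants:
        recon = -F.mse_loss(self.decoder(torch.cat([s, z], dim=-1)), x,
                            reduction="sum")
        # KL(q(z|x) || N(0, I)) for the standard Gaussian prior on features:
        kl_z = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum()
        # Expected log-probability of label transitions under the Markov prior
        # (the entropy term of q(s|x) is omitted for brevity):
        prior = self.self_trans * s[:-1] + (1 - self.self_trans) / N_CHORDS
        markov = (s[1:] * torch.log(prior + 1e-8)).sum()
        return recon - kl_z + markov
```

For frames with chord annotations, a cross-entropy term on `logits` against the ground-truth labels would be added on top of the ELBO, which is what lets one objective cover both the supervised and semi-supervised settings.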
Related papers
- Exploring Beyond Logits: Hierarchical Dynamic Labeling Based on Embeddings for Semi-Supervised Classification [49.09505771145326]
We propose a Hierarchical Dynamic Labeling (HDL) algorithm that does not depend on model predictions and utilizes image embeddings to generate sample labels.
Our approach has the potential to change the paradigm of pseudo-label generation in semi-supervised learning.
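As a rough illustration of labeling from embeddings rather than from classifier logits, the sketch below uses a plain k-nearest-neighbor vote over labeled embeddings; this is a generic stand-in, not the authors' hierarchical dynamic labeling algorithm, and all names and the choice of k are assumptions.

```python
# Generic embedding-based pseudo-labeling via a k-NN vote over labeled data.
import numpy as np

def knn_pseudo_labels(labeled_emb, labels, unlabeled_emb, k=5):
    """Assign each unlabeled embedding the majority label of its k nearest
    labeled neighbors under cosine similarity. labels: np.ndarray of ints."""
    a = labeled_emb / np.linalg.norm(labeled_emb, axis=1, keepdims=True)
    b = unlabeled_emb / np.linalg.norm(unlabeled_emb, axis=1, keepdims=True)
    sims = b @ a.T                             # (n_unlabeled, n_labeled)
    nn_idx = np.argsort(-sims, axis=1)[:, :k]  # indices of k nearest neighbors
    nn_labels = labels[nn_idx]                 # (n_unlabeled, k)
    return np.array([np.bincount(row).argmax() for row in nn_labels])
```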
arXiv Detail & Related papers (2024-04-26T06:00:27Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
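For intuition about the forward noising process, here is a minimal simulation of a continuous-time Markov chain (Markov jump process) over discrete states; the uniform jump rate and uniform target distribution are assumptions for illustration, not the paper's parameterization.

```python
# Forward simulation of a simple continuous-time Markov chain: wait an
# exponential time, then jump to a uniformly random other state.
import numpy as np

def simulate_ctmc(x0, n_states, rate=1.0, t_end=1.0, rng=None):
    rng = rng or np.random.default_rng()
    x, t = x0, 0.0
    while True:
        t += rng.exponential(1.0 / rate)  # waiting time until the next jump
        if t > t_end:
            return x                      # state at time t_end
        x = rng.choice([s for s in range(n_states) if s != x])
```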
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Leveraging Instance Features for Label Aggregation in Programmatic Weak
Supervision [75.1860418333995]
Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to synthesize training labels efficiently.
The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources, abstracted as labeling functions (LFs).
Existing statistical label models typically rely only on the outputs of the LFs, ignoring instance features when modeling the underlying generative process.
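For context, the feature-agnostic aggregation criticized here can be as simple as a majority vote over LF outputs. A minimal sketch, with -1 marking abstentions (names and conventions are illustrative, not from the paper):

```python
# Majority-vote label model over labeling-function (LF) outputs.
import numpy as np

ABSTAIN = -1

def majority_vote(lf_matrix, n_classes):
    """lf_matrix: (n_samples, n_lfs) array of class ids or ABSTAIN."""
    out = np.empty(lf_matrix.shape[0], dtype=int)
    for i, row in enumerate(lf_matrix):
        votes = row[row != ABSTAIN]
        out[i] = (np.bincount(votes, minlength=n_classes).argmax()
                  if votes.size else ABSTAIN)  # abstain if every LF abstained
    return out
```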
arXiv Detail & Related papers (2022-10-06T07:28:53Z) - VAESim: A probabilistic approach for self-supervised prototype discovery [0.23624125155742057]
We propose an architecture for image stratification based on a conditional variational autoencoder.
We use a continuous latent space to represent the continuum of disorders and find clusters during training, which can then be used for image/patient stratification.
We demonstrate that our method outperforms a standard VAE baseline in terms of kNN accuracy on a classification task.
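The kNN evaluation mentioned above can be sketched as follows; fitting a scikit-learn classifier on latent codes is an implementation assumption, not necessarily the paper's exact protocol.

```python
# k-NN classification accuracy on latent codes as a latent-space quality probe.
from sklearn.neighbors import KNeighborsClassifier

def knn_latent_accuracy(z_train, y_train, z_test, y_test, k=5):
    clf = KNeighborsClassifier(n_neighbors=k).fit(z_train, y_train)
    return clf.score(z_test, y_test)  # fraction of test codes classified correctly
```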
arXiv Detail & Related papers (2022-09-25T17:55:31Z) - Gaussian Mixture Variational Autoencoder with Contrastive Learning for
Multi-Label Classification [27.043136219527767]
We propose a novel contrastive learning boosted multi-label prediction model.
By using contrastive learning in the supervised setting, we can exploit label information effectively.
We show that the learnt embeddings provide insights into the interpretation of label-label interactions.
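As an illustration of how label information enters a contrastive objective, here is a sketch of the supervised contrastive loss of Khosla et al. in its single-label form; the paper targets the multi-label setting, so treat this only as the underlying mechanism.

```python
# Supervised contrastive loss: pull together embeddings that share a label.
import torch
import torch.nn.functional as F

def supcon_loss(emb, labels, tau=0.1):
    """emb: (N, D) embeddings; labels: (N,) integer class labels."""
    z = F.normalize(emb, dim=1)
    sim = z @ z.T / tau                                   # pairwise similarities
    mask_self = torch.eye(len(z), dtype=torch.bool)
    sim = sim.masked_fill(mask_self, float("-inf"))       # exclude self-pairs
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)   # log-softmax per anchor
    pos = (labels[:, None] == labels[None, :]) & ~mask_self
    # Average log-probability over each anchor's positives:
    return -((log_prob * pos).sum(1) / pos.sum(1).clamp(min=1)).mean()
```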
arXiv Detail & Related papers (2021-12-02T04:23:34Z) - Consistency Regularization for Variational Auto-Encoders [14.423556966548544]
Variational auto-encoders (VAEs) are a powerful approach to unsupervised learning.
We propose a regularization method to enforce consistency in VAEs.
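A minimal sketch of one way to enforce such consistency: penalize the KL divergence between the approximate posteriors of an input and an augmented view of it. The closed-form diagonal-Gaussian KL and the encoder interface are assumptions, not necessarily the paper's exact regularizer.

```python
# Consistency term: KL between posteriors of an input and its augmentation.
import torch

def gaussian_kl(mu1, logvar1, mu2, logvar2):
    """KL(N(mu1, var1) || N(mu2, var2)) for diagonal Gaussians, summed over dims."""
    var1, var2 = logvar1.exp(), logvar2.exp()
    return 0.5 * (logvar2 - logvar1 + (var1 + (mu1 - mu2).pow(2)) / var2 - 1).sum(-1)

def consistency_loss(encoder, x, augment):
    mu1, logvar1 = encoder(x)            # encoder assumed to return (mu, logvar)
    mu2, logvar2 = encoder(augment(x))   # posterior of the augmented view
    return gaussian_kl(mu1, logvar1, mu2, logvar2).mean()
```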
arXiv Detail & Related papers (2021-05-31T10:26:32Z) - Equivalence of Segmental and Neural Transducer Modeling: A Proof of
Concept [56.46135010588918]
We prove that the widely used classes of RNN-Transducer models and segmental models (direct HMMs) are equivalent.
It is shown that blank probabilities translate into segment length probabilities and vice versa.
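A schematic reading of that correspondence, with b_tau denoting the blank probability at frame tau: a run of blanks before the next label emission behaves like a (generalized geometric) segment-length distribution. This is an assumed simplification, not the paper's derivation.

```latex
p(\ell = k \mid \text{segment starts at } t)
  = \Bigl( \prod_{\tau = t}^{t + k - 1} b_\tau \Bigr) \bigl( 1 - b_{t + k} \bigr)
```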
arXiv Detail & Related papers (2021-04-13T11:20:48Z) - Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour for the learned representations, and also the consequences of fixing it by introducing a notion of self-consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
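A minimal sketch of a self-consistency penalty in this spirit: re-encode the decoder's reconstruction and pull its posterior toward that of the original input. Using the squared distance between posterior means is an illustrative simplification.

```python
# Self-consistency: the reconstruction should encode back to the same posterior.
import torch

def self_consistency_loss(encoder, decoder, x):
    mu, logvar = encoder(x)                               # q(z|x)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterized sample
    x_hat = decoder(z)                                    # reconstruction
    mu_hat, _ = encoder(x_hat)                            # re-encode it
    return (mu - mu_hat).pow(2).sum(-1).mean()
```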
arXiv Detail & Related papers (2020-12-07T14:16:14Z) - Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
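A sketch of this scoring scheme in the monoT5 style: prompt the model with the query and document, then compare the probabilities of generating the target words "true" and "false" at the first decoding step. The prompt template and checkpoint name are assumptions.

```python
# Seq2seq relevance scoring: P("true") vs. P("false") as the relevance label.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tok = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base").eval()

def relevance_score(query, doc):
    prompt = f"Query: {query} Document: {doc} Relevant:"
    inputs = tok(prompt, return_tensors="pt", truncation=True)
    start = torch.tensor([[model.config.decoder_start_token_id]])
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=start).logits[0, 0]
    true_id, false_id = tok.encode("true")[0], tok.encode("false")[0]
    return torch.softmax(logits[[true_id, false_id]], dim=0)[0].item()
```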
arXiv Detail & Related papers (2020-03-14T22:29:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.