Serenade: A Model for Human-in-the-loop Automatic Chord Estimation
- URL: http://arxiv.org/abs/2310.11165v1
- Date: Tue, 17 Oct 2023 11:31:29 GMT
- Title: Serenade: A Model for Human-in-the-loop Automatic Chord Estimation
- Authors: Hendrik Vincent Koops, Gianluca Micchi, Ilaria Manco, Elio Quinton
- Abstract summary: We evaluate our model on a dataset of popular music and show that a human-in-the-loop approach improves harmonic analysis performance over a model-only approach.
- Score: 1.6385815610837167
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational harmony analysis is important for MIR tasks such as automatic
segmentation, corpus analysis and automatic chord label estimation. However,
recent research into the ambiguous nature of musical harmony, causing limited
inter-rater agreement, has made apparent that there is a glass ceiling for
common metrics such as accuracy. Commonly, these issues are addressed either in
the training data itself by creating majority-rule annotations or during the
training phase by learning soft targets. We propose a novel alternative
approach in which a human and an autoregressive model together co-create a
harmonic annotation for an audio track. After automatically generating harmony
predictions, a human sparsely annotates parts with low model confidence and the
model then adjusts its predictions following human guidance. We evaluate our
model on a dataset of popular music and we show that, with this
human-in-the-loop approach, harmonic analysis performance improves over a
model-only approach. The human contribution is amplified by the second,
constrained prediction of the model.
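The loop described in the abstract (predict, ask a human about low-confidence parts, then predict again under those constraints) can be sketched on toy data. This is a minimal illustration, not the Serenade model: the chord vocabulary, the per-frame probabilities, the helper names `least_confident` and `decode`, and the `stay_bonus` parameter are all assumptions made up for this example, and a simple Viterbi smoother stands in for the paper's autoregressive model.

```python
import math

# Hypothetical toy vocabulary and per-frame chord probabilities (assumed data).
CHORDS = ["C:maj", "A:min", "G:maj"]
probs = [
    [0.80, 0.10, 0.10],
    [0.45, 0.50, 0.05],
    [0.48, 0.47, 0.05],  # the least confident frame
    [0.45, 0.50, 0.05],
    [0.80, 0.10, 0.10],
]


def least_confident(probs, k):
    """Return the indices of the k frames whose top probability is lowest."""
    order = sorted(range(len(probs)), key=lambda t: max(probs[t]))
    return sorted(order[:k])


def decode(probs, stay_bonus=1.0, fixed=None):
    """Viterbi decoding that rewards staying on the same chord.

    `fixed` maps a frame index to a chord index given by a human.
    Those frames are clamped, and the smoothing lets each human
    label influence neighbouring frames as well.
    """
    fixed = fixed or {}
    n, m = len(probs), len(probs[0])

    def emit(t, c):
        # A clamped frame allows only the human-provided chord.
        if t in fixed:
            return 0.0 if c == fixed[t] else -math.inf
        return math.log(probs[t][c])

    def trans(p, c):
        return stay_bonus if p == c else 0.0

    score = [emit(0, c) for c in range(m)]
    back = []
    for t in range(1, n):
        new_score, ptr = [], []
        for c in range(m):
            best = max(range(m), key=lambda p: score[p] + trans(p, c))
            new_score.append(score[best] + trans(best, c) + emit(t, c))
            ptr.append(best)
        score = new_score
        back.append(ptr)

    # Trace the best path backwards from the best final state.
    c = max(range(m), key=lambda s: score[s])
    path = [c]
    for ptr in reversed(back):
        c = ptr[c]
        path.append(c)
    return path[::-1]


first_pass = decode(probs)                      # model-only prediction
query = least_confident(probs, 1)               # frames to show the human
human_labels = {query[0]: CHORDS.index("A:min")}  # assumed human answer
second_pass = decode(probs, fixed=human_labels)  # constrained prediction

print([CHORDS[c] for c in first_pass])
print([CHORDS[c] for c in second_pass])
```

In this toy setup the model-only pass labels every frame C:maj, while the single human label on frame 2 also flips frames 1 and 3 to A:min in the second pass. That is one simple way to picture how a sparse annotation can be amplified by a constrained re-prediction.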
Related papers
- Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation [3.8570045844185237]
We present Stem-JEPA, a novel Joint-Embedding Predictive Architecture (JEPA) trained on a multi-track dataset.
Our model comprises two networks: an encoder and a predictor, which are jointly trained to predict the embeddings of compatible stems.
We evaluate our model's performance on a retrieval task on the MUSDB18 dataset, testing its ability to find the missing stem from a mix.
arXiv Detail & Related papers (2024-08-05T14:34:40Z)
- Automatic Equalization for Individual Instrument Tracks Using Convolutional Neural Networks [2.5944208050492183]
We propose a novel approach for the automatic equalization of individual musical instrument tracks.
Our method begins by identifying the instrument present within a source recording in order to choose its corresponding ideal spectrum as a target.
We build upon a differentiable parametric equalizer matching neural network, demonstrating improvements relative to previously established state-of-the-art.
arXiv Detail & Related papers (2024-07-23T17:55:25Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Are Neural Topic Models Broken? [81.15470302729638]
We study the relationship between automated and human evaluation of topic models.
We find that neural topic models fare worse in both respects compared to an established classical method.
arXiv Detail & Related papers (2022-10-28T14:38:50Z)
- Is Automated Topic Model Evaluation Broken?: The Incoherence of Coherence [62.826466543958624]
We look at the standardization gap and the validation gap in topic model evaluation.
Recent models relying on neural components surpass classical topic models according to these metrics.
We use automatic coherence along with the two most widely accepted human judgment tasks, namely, topic rating and word intrusion.
arXiv Detail & Related papers (2021-07-05T17:58:52Z)
- Test-time Collective Prediction [73.74982509510961]
Multiple parties in machine learning want to jointly make predictions on future test points.
Agents wish to benefit from the collective expertise of the full set of agents, but may not be willing to release their data or model parameters.
We explore a decentralized mechanism to make collective predictions at test time, leveraging each agent's pre-trained model.
arXiv Detail & Related papers (2021-06-22T18:29:58Z)
- Generative Temporal Difference Learning for Infinite-Horizon Prediction [101.59882753763888]
We introduce the $\gamma$-model, a predictive model of environment dynamics with an infinite probabilistic horizon.
We discuss how its training reflects an inescapable tradeoff between training-time and testing-time compounding errors.
arXiv Detail & Related papers (2020-10-27T17:54:12Z)
- Generating Music with a Self-Correcting Non-Chronological Autoregressive Model [6.289267097017553]
We describe a novel approach for generating music using a self-correcting, non-chronological, autoregressive model.
We represent music as a sequence of edit events, each of which denotes either the addition or removal of a note.
During inference, we generate one edit event at a time using direct ancestral sampling.
arXiv Detail & Related papers (2020-08-18T20:36:47Z)
- A Spectral Energy Distance for Parallel Speech Synthesis [29.14723501889278]
Speech synthesis is an important practical generative modeling problem.
We propose a new learning method that allows us to train highly parallel models of speech.
arXiv Detail & Related papers (2020-08-03T19:56:04Z)
- Forecasting Sequential Data using Consistent Koopman Autoencoders [52.209416711500005]
A new class of physics-based methods related to Koopman theory has been introduced, offering an alternative for processing nonlinear dynamical systems.
We propose a novel Consistent Koopman Autoencoder model which, unlike the majority of existing work, leverages the forward and backward dynamics.
Key to our approach is a new analysis which explores the interplay between consistent dynamics and their associated Koopman operators.
arXiv Detail & Related papers (2020-03-04T18:24:30Z)
- Automatic Melody Harmonization with Triad Chords: A Comparative Study [24.95868747256647]
We present a comparative study evaluating and comparing the performance of a set of canonical approaches to this task.
The evaluation is conducted on a dataset of 9,226 melody/chord pairs we newly collect for this study, considering up to 48 triad chords.
arXiv Detail & Related papers (2020-01-08T03:47:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.