Self-Similarity-Based and Novelty-based loss for music structure
analysis
- URL: http://arxiv.org/abs/2309.02243v1
- Date: Tue, 5 Sep 2023 13:49:29 GMT
- Title: Self-Similarity-Based and Novelty-based loss for music structure
analysis
- Authors: Geoffroy Peeters
- Abstract summary: We propose a supervised approach for the task of music boundary detection.
In our approach we simultaneously learn features and convolution kernels.
We demonstrate that relative feature learning, through self-attention, is beneficial for the task of MSA.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Music Structure Analysis (MSA) is the task of identifying the musical
segments that compose a music track and, possibly, labeling them based on their
similarity. In this paper, we propose a supervised approach for the task of
music boundary detection. In our approach, we simultaneously learn features and
convolution kernels. For this, we jointly optimize (i) a loss based on the
Self-Similarity-Matrix (SSM) obtained with the learned features, denoted by
SSM-loss, and (ii) a loss based on the novelty score obtained by applying the
learned kernels to the estimated SSM, denoted by novelty-loss. We also
demonstrate that relative feature learning, through self-attention, is
beneficial for the task of MSA. Finally, we compare the performance of our
approach to that of previously proposed approaches on the standard RWC-Pop
dataset and on various subsets of SALAMI.
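The two quantities named in the abstract, an SSM computed from frame features and a novelty score obtained by sliding a kernel along its diagonal, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. All function names are hypothetical, cosine similarity is an assumed choice, the kernel is a fixed checkerboard kernel standing in for the learned kernels, and the mean-squared-error loss is only a placeholder for the SSM-loss, whose exact form the abstract does not give.

```python
import numpy as np


def self_similarity_matrix(features):
    """Cosine self-similarity of a (frames, dims) array, rescaled to [0, 1]."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return 0.5 * (unit @ unit.T + 1.0)


def checkerboard_kernel(half_width):
    """Fixed kernel (assumption): +1 on within-segment blocks, -1 across."""
    signs = np.concatenate([-np.ones(half_width), np.ones(half_width)])
    return np.outer(signs, signs)


def novelty_curve(ssm, kernel):
    """Correlate the kernel with windows centred on the SSM main diagonal."""
    half = kernel.shape[0] // 2
    # Edge padding repeats the first/last frame so the track ends do not
    # look like boundaries.
    padded = np.pad(ssm, half, mode="edge")
    n = ssm.shape[0]
    return np.array([
        np.sum(padded[t:t + 2 * half, t:t + 2 * half] * kernel)
        for t in range(n)
    ])


def ssm_loss(ssm, segment_labels):
    """Placeholder loss (assumption): MSE against a label-derived binary SSM."""
    labels = np.asarray(segment_labels)
    target = (labels[:, None] == labels[None, :]).astype(float)
    return float(np.mean((ssm - target) ** 2))


# Toy track: 8 frames of one segment followed by 8 frames of another.
features = np.vstack([np.tile([1.0, 0.0], (8, 1)), np.tile([0.0, 1.0], (8, 1))])
ssm = self_similarity_matrix(features)
novelty = novelty_curve(ssm, checkerboard_kernel(4))
print("estimated boundary frame:", int(np.argmax(novelty)))
print("placeholder SSM loss:", ssm_loss(ssm, [0] * 8 + [1] * 8))
```

On this toy input the novelty curve peaks at the frame where the second segment begins. In a trainable version, both the feature encoder and the kernel would be parameters updated by the joint loss rather than fixed as they are here.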
Related papers
- An Information Compensation Framework for Zero-Shot Skeleton-based Action Recognition [49.45660055499103]
Zero-shot human skeleton-based action recognition aims to construct a model that can recognize actions outside the categories seen during training.
Previous research has focused on aligning sequences' visual and semantic spatial distributions.
We introduce a new loss function sampling method to obtain a tight and robust representation.
arXiv Detail & Related papers (2024-06-02T06:53:01Z)
- "It's a Match!" -- A Benchmark of Task Affinity Scores for Joint Learning [74.14961250042629]
While the promises of Multi-Task Learning (MTL) are attractive, characterizing the conditions of its success is still an open problem in Deep Learning.
Estimating task affinity for joint learning is a key endeavor.
Recent work suggests that the training conditions themselves have a significant impact on the outcomes of MTL.
Yet, the literature is lacking a benchmark to assess the effectiveness of task affinity estimation techniques.
arXiv Detail & Related papers (2023-01-07T15:16:35Z)
- SSM-Net: feature learning for Music Structure Analysis using a Self-Similarity-Matrix based loss [7.599399338954308]
We train a deep encoder to learn features such that the Self-Similarity-Matrix (SSM) resulting from those approximates a ground-truth SSM.
We successfully demonstrate the use of this training paradigm, evaluated with the Area Under the ROC Curve (AUC), on the RWC-Pop dataset.
arXiv Detail & Related papers (2022-11-15T13:48:11Z)
- Joint Embedding Self-Supervised Learning in the Kernel Regime [21.80241600638596]
Self-supervised learning (SSL) produces useful representations of data without access to any labels for classifying the data.
We extend this framework to incorporate algorithms based on kernel methods where embeddings are constructed by linear maps acting on the feature space of a kernel.
We analyze our kernel model on small datasets to identify common features of self-supervised learning algorithms and gain theoretical insights into their performance on downstream tasks.
arXiv Detail & Related papers (2022-09-29T15:53:19Z)
- SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance [88.0355290619761]
This work focuses on the separation of unknown musical instruments.
We propose the Separation-with-Consistency (SeCo) framework, which can accomplish the separation on unknown categories.
Our framework exhibits strong adaptation ability on the novel musical categories and outperforms the baseline methods by a significant margin.
arXiv Detail & Related papers (2022-03-25T09:42:11Z)
- Barwise Compression Schemes for Audio-Based Music Structure Analysis [4.39160562548524]
Music Structure Analysis (MSA) consists of segmenting a music piece into several distinct sections.
We approach MSA within a compression framework, under the hypothesis that the structure is more easily revealed by a simplified representation of the original content of the song.
In our experiments, several unsupervised compression schemes achieve a level of performance comparable to that of state-of-the-art supervised methods.
arXiv Detail & Related papers (2022-02-10T12:23:57Z)
- MAML is a Noisy Contrastive Learner [72.04430033118426]
Model-agnostic meta-learning (MAML) is one of the most popular and widely-adopted meta-learning algorithms nowadays.
We provide a new perspective to the working mechanism of MAML and discover that: MAML is analogous to a meta-learner using a supervised contrastive objective function.
We propose a simple but effective technique, zeroing trick, to alleviate such interference.
arXiv Detail & Related papers (2021-06-29T12:52:26Z)
- How Fine-Tuning Allows for Effective Meta-Learning [50.17896588738377]
We present a theoretical framework for analyzing representations derived from a MAML-like algorithm.
We provide risk bounds on the best predictor found by fine-tuning via gradient descent, demonstrating that the algorithm can provably leverage the shared structure.
This separation result underscores the benefit of fine-tuning-based methods, such as MAML, over methods with "frozen representation" objectives in few-shot learning.
arXiv Detail & Related papers (2021-05-05T17:56:00Z)
- Music Boundary Detection using Convolutional Neural Networks: A comparative analysis of combined input features [2.123556187010023]
The analysis of the structure of musical pieces is a task that remains a challenge for Artificial Intelligence.
We establish a general method of pre-processing these inputs by comparing the inputs calculated from different pooling strategies.
We also establish the most effective combination of inputs to be delivered to the CNN in order to establish the most efficient way to extract the limits of the structure of the music pieces.
arXiv Detail & Related papers (2020-08-17T14:20:51Z)
- Score-informed Networks for Music Performance Assessment [64.12728872707446]
Deep neural network-based methods incorporating score information into MPA models have not yet been investigated.
We introduce three different models capable of score-informed performance assessment.
arXiv Detail & Related papers (2020-08-01T07:46:24Z)
- Modeling Musical Structure with Artificial Neural Networks [0.0]
I explore the application of artificial neural networks to different aspects of musical structure modeling.
I show how a connectionist model, the Gated Autoencoder (GAE), can be employed to learn transformations between musical fragments.
I propose a special predictive training of the GAE, which yields a representation of polyphonic music as a sequence of intervals.
arXiv Detail & Related papers (2020-01-06T18:35:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.