Metric Learning vs Classification for Disentangled Music Representation Learning
- URL: http://arxiv.org/abs/2008.03729v2
- Date: Wed, 12 Aug 2020 21:46:52 GMT
- Title: Metric Learning vs Classification for Disentangled Music Representation Learning
- Authors: Jongpil Lee, Nicholas J. Bryan, Justin Salamon, Zeyu Jin, Juhan Nam
- Abstract summary: We present a single representation learning framework that elucidates the relationship between metric learning, classification, and disentanglement in a holistic manner.
We find that classification-based models are generally advantageous for training time, similarity retrieval, and auto-tagging, while deep metric learning exhibits better performance for triplet prediction.
- Score: 36.74680586571013
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep representation learning offers a powerful paradigm for mapping input
data onto an organized embedding space and is useful for many music information
retrieval tasks. Two central methods for representation learning include deep
metric learning and classification, both having the same goal of learning a
representation that can generalize well across tasks. Along with
generalization, the emerging concept of disentangled representations is also of
great interest, where multiple semantic concepts (e.g., genre, mood,
instrumentation) are learned jointly but remain separable in the learned
representation space. In this paper we present a single representation learning
framework that elucidates the relationship between metric learning,
classification, and disentanglement in a holistic manner. For this, we (1)
outline past work on the relationship between metric learning and
classification, (2) extend this relationship to multi-label data by exploring
three different learning approaches and their disentangled versions, and (3)
evaluate all models on four tasks (training time, similarity retrieval,
auto-tagging, and triplet prediction). We find that classification-based models
are generally advantageous for training time, similarity retrieval, and
auto-tagging, while deep metric learning exhibits better performance for
triplet prediction. Finally, we show that our proposed approach yields
state-of-the-art results for music auto-tagging.
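To make the contrast concrete, here is a minimal PyTorch sketch of the two training objectives the abstract compares; the backbone, dimensions, margin, and tag count are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical embedding backbone; the paper's actual architecture differs.
class Embedder(nn.Module):
    def __init__(self, in_dim=128, emb_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, emb_dim))

    def forward(self, x):
        # Unit-norm embeddings so distances are comparable across batches.
        return F.normalize(self.net(x), dim=-1)

model = Embedder()

# (1) Deep metric learning: pull anchor/positive pairs together and push
# negatives away by at least a margin (triplet loss).
anchor, positive, negative = (torch.randn(8, 128) for _ in range(3))
metric_loss = F.triplet_margin_loss(
    model(anchor), model(positive), model(negative), margin=0.2)

# (2) Classification: a linear head over the same embedding, trained with
# multi-label binary cross-entropy as in music auto-tagging.
n_tags = 50
head = nn.Linear(64, n_tags)
x = torch.randn(8, 128)
tags = torch.randint(0, 2, (8, n_tags)).float()  # multi-hot tag targets
clf_loss = F.binary_cross_entropy_with_logits(head(model(x)), tags)

# Disentanglement (illustrative assumption): reserve a sub-block of
# dimensions per concept, e.g. emb[:, :16] for genre and emb[:, 16:32]
# for mood, and apply the chosen loss per sub-block.
```

Because the classification branch needs no triplet sampling, it is typically cheaper to train, in line with the abstract's observation about training time.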
Related papers
- Separating common from salient patterns with Contrastive Representation Learning [2.250968907999846]
Contrastive Analysis aims at separating the common factors of variation between two datasets from the salient ones.
Current models based on Variational Auto-Encoders have shown poor performance in learning semantically expressive representations.
We propose to leverage the ability of Contrastive Learning to learn semantically expressive representations well adapted to Contrastive Analysis.
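As background for the contrastive-learning component, here is a generic InfoNCE-style loss over two views of a batch; this is a common formulation and only a stand-in for the paper's actual Contrastive Analysis objective.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    # Two views of the same batch: matched pairs sit on the diagonal.
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.size(0))
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(16, 64), torch.randn(16, 64))
```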
arXiv Detail & Related papers (2024-02-19T08:17:13Z)
- The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning [32.15608637930748]
We show that there exists a trade-off between the two desiderata (universality and label efficiency), so one may not be able to achieve both simultaneously.
We provide an analysis using a theoretical data model and show that, while more diverse pre-training data yield more diverse features for different tasks, they place less emphasis on task-specific features.
arXiv Detail & Related papers (2023-02-28T22:14:33Z)
- Self-Supervised Visual Representation Learning with Semantic Grouping [50.14703605659837]
We tackle the problem of learning visual representations from unlabeled scene-centric data.
We propose contrastive learning from data-driven semantic slots, namely SlotCon, for joint semantic grouping and representation learning.
arXiv Detail & Related papers (2022-05-30T17:50:59Z)
- Deep Relational Metric Learning [84.95793654872399]
This paper presents a deep relational metric learning framework for image clustering and retrieval.
We learn an ensemble of features that characterizes an image from different aspects to model both interclass and intraclass distributions.
Experiments on the widely-used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves existing deep metric learning methods and achieves very competitive results.
arXiv Detail & Related papers (2021-08-23T09:31:18Z)
- It Takes Two to Tango: Mixup for Deep Metric Learning [16.60855728302127]
State-of-the-art methods focus mostly on sophisticated loss functions or mining strategies.
Mixup is a powerful data augmentation approach that interpolates two or more examples and their corresponding target labels.
We show that mixing inputs, intermediate representations or embeddings along with target labels significantly improves representations and outperforms state-of-the-art metric learning methods on four benchmark datasets.
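A minimal sketch of what mixing embeddings with their targets can look like; the shapes, the Beta prior, and mixing only at the embedding level are assumptions for illustration, since the paper also mixes inputs and intermediate representations.

```python
import torch

def mixup_embeddings(emb, labels, alpha=0.4):
    # Sample a mixing coefficient and a random pairing of the batch.
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(emb.size(0))
    # Interpolate both the embeddings and their target labels.
    mixed_emb = lam * emb + (1 - lam) * emb[perm]
    mixed_labels = lam * labels + (1 - lam) * labels[perm]
    return mixed_emb, mixed_labels

emb, labels = torch.randn(8, 64), torch.eye(8)  # toy one-hot batch
mixed_emb, mixed_labels = mixup_embeddings(emb, labels)
```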
arXiv Detail & Related papers (2021-06-09T11:20:03Z)
- Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
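For intuition, here is a prototypical-network-style sketch where each class prototype is the mean of its member embeddings and prediction picks the nearest prototype; the paper learns relation prototypes from context, so this is an assumed simplification rather than its method.

```python
import torch
import torch.nn.functional as F

def class_prototypes(emb, labels, n_classes):
    # Mean embedding per class; assumes every class appears in the batch.
    return torch.stack(
        [emb[labels == c].mean(dim=0) for c in range(n_classes)])

emb = F.normalize(torch.randn(32, 64), dim=-1)
labels = torch.randint(0, 4, (32,))
protos = F.normalize(class_prototypes(emb, labels, 4), dim=-1)
pred = (emb @ protos.t()).argmax(dim=-1)  # nearest prototype by cosine
```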
arXiv Detail & Related papers (2021-03-22T08:11:43Z)
- Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization, and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
- Batch Decorrelation for Active Metric Learning [21.99577268213412]
We present an active learning strategy for training parametric models of distance metrics, given triplet-based similarity assessments.
In contrast to prior work on class-based learning, we focus on metrics that express the degree of (dis)similarity between objects.
arXiv Detail & Related papers (2020-05-20T12:47:48Z)
- Memory-Augmented Relation Network for Few-Shot Learning [114.47866281436829]
In this work, we investigate a new metric-learning method, the Memory-Augmented Relation Network (MRN).
In MRN, we choose samples that are visually similar from the working context and perform weighted information propagation to attentively aggregate helpful information from the chosen samples to enhance the target's representation.
We empirically demonstrate that MRN yields significant improvement over its ancestor and achieves competitive or even better performance when compared with other few-shot learning approaches.
arXiv Detail & Related papers (2020-05-09T10:09:13Z)