Content-based Music Similarity with Triplet Networks
- URL: http://arxiv.org/abs/2008.04938v1
- Date: Tue, 11 Aug 2020 18:10:02 GMT
- Title: Content-based Music Similarity with Triplet Networks
- Authors: Joseph Cleveland, Derek Cheng, Michael Zhou, Thorsten Joachims,
Douglass Turnbull
- Abstract summary: We explore the feasibility of using triplet neural networks to embed songs based on content-based music similarity.
Our network is trained using triplets of songs such that two songs by the same artist are embedded closer to one another than to a third song by a different artist.
- Score: 21.220806977978853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We explore the feasibility of using triplet neural networks to embed songs
based on content-based music similarity. Our network is trained using triplets
of songs such that two songs by the same artist are embedded closer to one
another than to a third song by a different artist. We compare two models that
are trained using different ways of picking this third song: at random vs.
based on shared genre labels. Our experiments are conducted using songs from
the Free Music Archive and use standard audio features. The initial results
show that shallow Siamese networks can be used to embed music for a simple
artist retrieval task.
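The training scheme described above — anchor and positive drawn from the same artist, negative from a different artist, with a hinge-style triplet loss over a shallow embedding — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the toy feature vectors, artist names, embedding width, and margin value are all assumptions, standing in for the FMA audio features and the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for per-song audio features: 8-dim vectors,
# three songs each for two hypothetical artists (names illustrative).
catalog = {
    "artist_a": rng.normal(0.0, 1.0, size=(3, 8)) + 2.0,
    "artist_b": rng.normal(0.0, 1.0, size=(3, 8)) - 2.0,
}

def sample_triplet(catalog, rng):
    """Anchor and positive share an artist; negative comes from another.

    This mirrors the 'random negative' strategy; the genre-based variant
    would instead restrict the negative to songs sharing a genre label.
    """
    artists = list(catalog)
    a_name = artists[rng.integers(len(artists))]
    n_name = next(name for name in artists if name != a_name)
    i, j = rng.choice(len(catalog[a_name]), size=2, replace=False)
    k = rng.integers(len(catalog[n_name]))
    return catalog[a_name][i], catalog[a_name][j], catalog[n_name][k]

def embed(x, W):
    """A one-layer 'shallow' embedding, L2-normalised."""
    z = np.tanh(x @ W)
    return z / np.linalg.norm(z)

def triplet_loss(anchor, positive, negative, W, margin=0.5):
    """Hinge loss: pull same-artist pairs together, push others apart."""
    za, zp, zn = embed(anchor, W), embed(positive, W), embed(negative, W)
    d_pos = np.sum((za - zp) ** 2)   # anchor-positive distance
    d_neg = np.sum((za - zn) ** 2)   # anchor-negative distance
    return max(0.0, d_pos - d_neg + margin)

W = rng.normal(0.0, 0.1, size=(8, 4))  # untrained embedding weights
a, p, n = sample_triplet(catalog, rng)
print(f"triplet loss: {triplet_loss(a, p, n, W):.3f}")
```

In practice the embedding network would be trained by minimising this loss over many sampled triplets (e.g. with gradient descent), after which nearest-neighbour search in the embedded space supports the artist retrieval task.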
Related papers
- MusicFlow: Cascaded Flow Matching for Text Guided Music Generation [53.63948108922333]
MusicFlow is a cascaded text-to-music generation model based on flow matching.
We leverage masked prediction as the training objective, enabling the model to generalize to other tasks such as music infilling and continuation.
arXiv Detail & Related papers (2024-10-27T15:35:41Z)
- Link Me Baby One More Time: Social Music Discovery on Spotify [0.3495246564946556]
We use data from Spotify to investigate how a link sent from one user to another results in the receiver engaging with the music of the shared artist.
We consider several factors that may influence this process, such as the strength of the sender-receiver relationship, the user's role in the Spotify social network, their music social cohesion, and how similar the new artist is to the receiver's taste.
arXiv Detail & Related papers (2024-01-16T20:41:11Z)
- Fairness Through Domain Awareness: Mitigating Popularity Bias For Music Discovery [56.77435520571752]
We explore the intrinsic relationship between music discovery and popularity bias.
We propose a domain-aware, individual fairness-based approach which addresses popularity bias in graph neural network (GNN)-based recommender systems.
Our approach uses individual fairness to reflect a ground truth listening experience, i.e., if two songs sound similar, this similarity should be reflected in their representations.
arXiv Detail & Related papers (2023-08-28T14:12:25Z)
- From West to East: Who can understand the music of the others better? [91.78564268397139]
We leverage transfer learning methods to derive insights about similarities between different music cultures.
We use two Western music datasets, two traditional/folk datasets coming from eastern Mediterranean cultures, and two datasets belonging to Indian art music.
Three deep audio embedding models are trained and transferred across domains, including two CNN-based and a Transformer-based architecture, to perform auto-tagging for each target domain dataset.
arXiv Detail & Related papers (2023-07-19T07:29:14Z)
- Unsupervised Melody-to-Lyric Generation [91.29447272400826]
We propose a method for generating high-quality lyrics without training on any aligned melody-lyric data.
We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints.
Our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines.
arXiv Detail & Related papers (2023-05-30T17:20:25Z)
- GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework [58.64512825534638]
Symbolic music generation aims to create musical notes, which can help users compose music.
We introduce a framework known as GETMusic, with "GET" standing for "GEnerate music Tracks".
GETScore represents musical notes as tokens and organizes tokens in a 2D structure, with tracks stacked vertically and progressing horizontally over time.
Our proposed representation, coupled with the non-autoregressive generative model, empowers GETMusic to generate music with any arbitrary source-target track combinations.
arXiv Detail & Related papers (2022-02-04T12:51:16Z)
- Musical Audio Similarity with Self-supervised Convolutional Neural Networks [0.0]
We have built a music similarity search engine that lets video producers search by listenable music excerpts.
Our system suggests similar sounding track segments in a large music catalog by training a self-supervised convolutional neural network.
arXiv Detail & Related papers (2021-07-30T10:44:31Z)
- Artist Similarity with Graph Neural Networks [1.160208922584163]
We present a hybrid approach to computing similarity between artists using graph neural networks trained with triplet loss.
The novelty of using a graph neural network architecture is to combine the topology of a graph of artist connections with content features to embed artists into a vector space that encodes similarity.
With 17,673 artists, this is the largest academic artist similarity dataset that includes content-based features to date.
arXiv Detail & Related papers (2020-08-09T13:04:25Z)
- Disentangled Multidimensional Metric Learning for Music Similarity [36.74680586571013]
Music similarity search is useful for replacing one music recording with another recording with a similar "feel".
Music similarity is hard to define and depends on multiple simultaneous notions of similarity.
We introduce the concept of multidimensional similarity and unify both global and specialized similarity metrics into a single metric.
arXiv Detail & Related papers (2020-05-18T08:20:54Z)
- Learning to rank music tracks using triplet loss [6.43271391521664]
We propose a method for direct recommendation based on the audio content without explicitly tagging the music tracks.
We train a Convolutional Neural Network to learn the similarity via triplet loss.
Results highlight the efficiency of our system, especially when associated with an Auto-pooling layer.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)
- Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.