GATSY: Graph Attention Network for Music Artist Similarity
- URL: http://arxiv.org/abs/2311.00635v2
- Date: Sat, 05 Apr 2025 18:14:41 GMT
- Title: GATSY: Graph Attention Network for Music Artist Similarity
- Authors: Andrea Giuseppe Di Francesco, Giuliano Giampietro, Indro Spinelli, Danilo Comminiello
- Abstract summary: GATSY is a novel recommendation system built upon graph attention networks and driven by a clusterized embedding of artists.
- Score: 4.84315398254578
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The artist similarity quest has become a crucial subject in social and scientific contexts, driven by the desire to enhance music discovery according to user preferences. However, defining similarity among artists remains challenging due to its inherently subjective nature, which can impact recommendation accuracy. This paper introduces GATSY, a novel recommendation system built upon graph attention networks and driven by a clusterized embedding of artists. The proposed framework leverages the graph topology of the input data to achieve strong performance without relying heavily on hand-crafted features. This flexibility allows us to include fictitious artists in a music dataset, connect previously unlinked artists, and produce recommendations from heterogeneous sources. Experimental results demonstrate the effectiveness of the proposed method with respect to state-of-the-art solutions while maintaining flexibility. The code to reproduce these experiments is available at https://anonymous.4open.science/r/GATSY-Music_Artist_Similarity-4807/README.md.
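To make the idea concrete, here is a minimal sketch of a graph-attention artist encoder, assuming a PyTorch Geometric setup; the class name, layer sizes, and retrieval step are illustrative assumptions, not taken from the GATSY repository.

```python
# Minimal sketch of a GAT-based artist encoder in the spirit of GATSY.
# All names and hyperparameters are illustrative, not from the official code.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class ArtistGAT(torch.nn.Module):
    def __init__(self, in_dim, hid_dim=64, out_dim=32, heads=4):
        super().__init__()
        # Two attention layers: multi-head first, single-head projection second.
        self.gat1 = GATConv(in_dim, hid_dim, heads=heads)
        self.gat2 = GATConv(hid_dim * heads, out_dim, heads=1)

    def forward(self, x, edge_index):
        # x: node features (one row per artist); edge_index: artist-artist links.
        h = F.elu(self.gat1(x, edge_index))
        return self.gat2(h, edge_index)  # embeddings used for similarity ranking
```

Similar artists would then be retrieved by nearest-neighbour search (e.g., cosine similarity) in the resulting embedding space.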
Related papers
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models [57.47799823804519]
We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
arXiv Detail & Related papers (2024-06-07T06:38:59Z) - Link Me Baby One More Time: Social Music Discovery on Spotify [0.3495246564946556]
We use data from Spotify to investigate how a link sent from one user to another results in the receiver engaging with the music of the shared artist.
We consider several factors that may influence this process, such as the strength of the sender-receiver relationship, the user's role in the Spotify social network, their music social cohesion, and how similar the new artist is to the receiver's taste.
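The study is observational, but the factor analysis can be made concrete with a simple engagement model; the feature names below are hypothetical stand-ins for the factors listed above, and the data is synthetic.

```python
# Hypothetical sketch: predicting whether a shared link leads to engagement
# from the kinds of factors named in the paper. Features are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.random(n),  # tie strength between sender and receiver
    rng.random(n),  # receiver's centrality in the social network
    rng.random(n),  # music social cohesion of the pair
    rng.random(n),  # similarity of the shared artist to the receiver's taste
])
y = rng.integers(0, 2, n)  # 1 = receiver engaged with the shared artist

model = LogisticRegression().fit(X, y)
print(model.coef_)  # sign/magnitude indicate each factor's association
```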
arXiv Detail & Related papers (2024-01-16T20:41:11Z) - Fairness Through Domain Awareness: Mitigating Popularity Bias For Music Discovery [56.77435520571752]
We explore the intrinsic relationship between music discovery and popularity bias.
We propose a domain-aware, individual fairness-based approach which addresses popularity bias in graph neural network (GNN)-based recommender systems.
Our approach uses individual fairness to reflect a ground truth listening experience, i.e., if two songs sound similar, this similarity should be reflected in their representations.
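A minimal sketch of an individual-fairness penalty in this spirit: if two songs sound similar, their embeddings should be close. The loss below is an illustrative Lipschitz-style regularizer, not the paper's exact objective.

```python
# Illustrative individual-fairness regularizer: embedding distances should
# not exceed (a scaled version of) audio-based distances.
import torch

def fairness_penalty(emb, audio_sim, scale=1.0):
    # emb: (n, d) item embeddings; audio_sim: (n, n) ground-truth similarity in [0, 1].
    emb_dist = torch.cdist(emb, emb)          # pairwise embedding distances
    audio_dist = scale * (1.0 - audio_sim)    # convert similarity to a distance
    # Penalize pairs whose embeddings sit farther apart than their audio distance.
    return torch.relu(emb_dist - audio_dist).mean()
```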
arXiv Detail & Related papers (2023-08-28T14:12:25Z) - Quantized GAN for Complex Music Generation from Dance Videos [48.196705493763986]
We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates musical samples conditioned on dance videos.
Our proposed framework takes dance video frames and human body motion as input, and learns to generate music samples that plausibly accompany the corresponding input.
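A skeletal conditional generator in this spirit, assuming video-frame and body-motion features have already been extracted; all dimensions and names are placeholders rather than the actual D2M-GAN architecture.

```python
# Skeleton of a conditional generator: music tokens conditioned on video and
# body-motion features. Dimensions are placeholders, purely illustrative.
import torch
import torch.nn as nn

class MusicGenerator(nn.Module):
    def __init__(self, vis_dim=512, motion_dim=128, noise_dim=64, token_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vis_dim + motion_dim + noise_dim, 512),
            nn.ReLU(),
            nn.Linear(512, token_dim),  # logits over quantized music tokens
        )

    def forward(self, vis_feat, motion_feat, noise):
        # Concatenate visual context, motion dynamics, and noise, then decode.
        return self.net(torch.cat([vis_feat, motion_feat, noise], dim=-1))
```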
arXiv Detail & Related papers (2022-04-01T17:53:39Z) - Contrastive Learning with Positive-Negative Frame Mask for Music Representation [91.44187939465948]
This paper proposes a novel Positive-nEgative frame mask for Music Representation based on the contrastive learning framework, abbreviated as PEMR.
We devise a novel contrastive learning objective that accommodates both self-augmented positives and negatives sampled from the same piece of music.
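A generic InfoNCE-style loss illustrates the contrastive framework; the frame-masking augmentation itself is elided here, and this is not PEMR's exact formulation.

```python
# InfoNCE-style contrastive loss between two masked views of the same music
# clip (positives) against the other clips in the batch (negatives).
import torch
import torch.nn.functional as F

def info_nce(view_a, view_b, temperature=0.1):
    # view_a, view_b: (batch, dim) embeddings of two augmentations per clip.
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature                        # (batch, batch)
    targets = torch.arange(a.size(0), device=view_a.device)  # diagonal = positives
    return F.cross_entropy(logits, targets)
```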
arXiv Detail & Related papers (2022-03-17T07:11:42Z) - Cold Start Similar Artists Ranking with Gravity-Inspired Graph Autoencoders [18.395568778680207]
We model a cold start similar artists ranking problem as a link prediction task in a directed and attributed graph.
Then, we leverage a graph autoencoder architecture to learn node embedding representations from this graph, and to automatically rank the top-k most similar neighbors of new artists.
We empirically show the flexibility and the effectiveness of our framework, by addressing a real-world cold start similar artists ranking problem on a global music streaming service.
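The gravity-inspired decoder scores a directed edge i → j from node embeddings and a learned per-node "mass", roughly sigmoid(m_j − log ‖z_i − z_j‖²); a sketch of that scoring rule follows, as an illustrative rendering rather than the authors' code.

```python
# Sketch of a gravity-inspired decoder for directed link prediction:
# score(i -> j) = sigmoid(m_j - log ||z_i - z_j||^2).
import torch

def gravity_scores(z, mass, eps=1e-8):
    # z: (n, d) node embeddings; mass: (n,) learned mass terms.
    sq_dist = torch.cdist(z, z).pow(2) + eps         # (n, n) squared distances
    logits = mass.unsqueeze(0) - torch.log(sq_dist)  # row i, col j: score i -> j
    return torch.sigmoid(logits)  # self-pairs should be masked out in practice

# Ranking the top-k most similar artists for a new node i then amounts to
# sorting row i of the score matrix in descending order.
```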
arXiv Detail & Related papers (2021-08-02T17:19:47Z) - Artist Similarity with Graph Neural Networks [1.160208922584163]
We present a hybrid approach to computing similarity between artists using graph neural networks trained with triplet loss.
The novelty of using a graph neural network architecture is to combine the topology of a graph of artist connections with content features to embed artists into a vector space that encodes similarity.
The evaluation uses a dataset of 17,673 artists, the largest academic artist similarity dataset with content-based features to date.
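The triplet objective pulls an artist towards a connected (similar) artist and away from an unrelated one; a minimal sketch using PyTorch's built-in triplet margin loss, with illustrative names:

```python
# Minimal triplet-loss step over GNN artist embeddings: anchor and positive
# are connected artists, negative is an unrelated one.
import torch
import torch.nn.functional as F

def triplet_step(embeddings, anchor_idx, pos_idx, neg_idx, margin=0.2):
    # embeddings: (n, d) output of the graph neural network encoder.
    return F.triplet_margin_loss(
        embeddings[anchor_idx],   # anchor artist
        embeddings[pos_idx],      # similar (connected) artist
        embeddings[neg_idx],      # dissimilar artist
        margin=margin,
    )
```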
arXiv Detail & Related papers (2021-07-30T10:44:31Z) - Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
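A purely illustrative sketch of the fusion step: gestural features gate audio features so the model can attribute sound to the right player. This is an assumption about the general mechanism, not the paper's architecture.

```python
# Skeletal audio-visual fusion: body-keypoint features modulate audio
# features for source separation. Dimensions are placeholders.
import torch
import torch.nn as nn

class AVFusion(nn.Module):
    def __init__(self, audio_dim=256, motion_dim=128):
        super().__init__()
        self.gate = nn.Linear(motion_dim, audio_dim)  # motion -> channel gates

    def forward(self, audio_feat, motion_feat):
        # Scale audio features by gestural cues to isolate one player's source.
        return audio_feat * torch.sigmoid(self.gate(motion_feat))
```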
arXiv Detail & Related papers (2020-04-20T17:53:46Z) - Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis [91.3755431537592]
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.
The main hypothesis of this work is based on the observation that certain expressive categories such as genre or theme can be recognized on the basis of the visual content alone.
The experiments are conducted for three MIR tasks: Artist Identification, Music Genre Classification, and Cross-Genre Classification.
arXiv Detail & Related papers (2020-02-01T17:57:14Z)