From West to East: Who can understand the music of the others better?
- URL: http://arxiv.org/abs/2307.09795v1
- Date: Wed, 19 Jul 2023 07:29:14 GMT
- Title: From West to East: Who can understand the music of the others better?
- Authors: Charilaos Papaioannou, Emmanouil Benetos, Alexandros Potamianos
- Abstract summary: We leverage transfer learning methods to derive insights about similarities between different music cultures.
We use two Western music datasets, two traditional/folk datasets coming from eastern Mediterranean cultures, and two datasets belonging to Indian art music.
Three deep audio embedding models are trained and transferred across domains, including two CNN-based and a Transformer-based architecture, to perform auto-tagging for each target domain dataset.
- Score: 91.78564268397139
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent developments in MIR have led to several benchmark deep learning models
whose embeddings can be used for a variety of downstream tasks. At the same
time, the vast majority of these models have been trained on Western pop/rock
music and related styles. This leads to research questions on whether these
models can be used to learn representations for different music cultures and
styles, or whether we can build similar music audio embedding models trained on
data from different cultures or styles. To that end, we leverage transfer
learning methods to derive insights about the similarities between the
different music cultures to which the data belong. We use two Western music
datasets, two traditional/folk datasets coming from eastern Mediterranean
cultures, and two datasets belonging to Indian art music. Three deep audio
embedding models are trained and transferred across domains, including two
CNN-based and a Transformer-based architecture, to perform auto-tagging for
each target domain dataset. Experimental results show that competitive
performance is achieved in all domains via transfer learning, while the best
source dataset varies for each music culture. The implementation and the
trained models are both provided in a public repository.
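To make the transfer setup concrete, below is a minimal sketch of how a pre-trained CNN embedding model can be repurposed for multi-label auto-tagging in a new domain. It is an illustration only: the class names, layer sizes, and the freeze-the-backbone choice are assumptions, not the architectures from the authors' repository.

```python
# Minimal transfer-learning sketch for cross-cultural auto-tagging
# (illustrative; the real architectures live in the authors' repository).
import torch
import torch.nn as nn

class SourceCNN(nn.Module):
    """Stand-in for a CNN audio embedding model pre-trained on a source dataset."""
    def __init__(self, n_mels=96, emb_dim=256, n_source_tags=50):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(8),
            nn.Flatten(), nn.Linear(32 * 8 * 8, emb_dim), nn.ReLU(),
        )
        self.head = nn.Linear(emb_dim, n_source_tags)  # source-domain tag head

    def forward(self, x):
        return self.head(self.backbone(x))

def transfer(model: SourceCNN, n_target_tags: int, freeze_backbone: bool = True):
    """Swap the tag head for the target domain; optionally freeze the embeddings."""
    if freeze_backbone:
        for p in model.backbone.parameters():
            p.requires_grad = False
    model.head = nn.Linear(model.head.in_features, n_target_tags)
    return model

model = transfer(SourceCNN(), n_target_tags=30)
mel = torch.randn(4, 1, 96, 256)             # batch of log-mel spectrograms
tags = torch.randint(0, 2, (4, 30)).float()  # multi-hot target-domain tags
loss = nn.BCEWithLogitsLoss()(model(mel), tags)  # auto-tagging is multi-label
```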
Related papers
- Foundation Models for Music: A Survey [77.77088584651268]
Foundation models (FMs) have profoundly impacted diverse sectors, including music.
This comprehensive review examines state-of-the-art (SOTA) pre-trained models and foundation models in music.
arXiv Detail & Related papers (2024-08-26T15:13:14Z)
- MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models [11.834712543531756]
MuChoMusic is a benchmark for evaluating music understanding in multimodal language models focused on audio.
It comprises 1,187 multiple-choice questions, all validated by human annotators, on 644 music tracks sourced from two publicly available music datasets.
We evaluate five open-source models and identify several pitfalls, including an over-reliance on the language modality.
arXiv Detail & Related papers (2024-08-02T15:34:05Z)
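For context, a multiple-choice benchmark of this kind is typically scored as plain accuracy over the validated questions. The sketch below shows such a loop; the field names (audio, question, options, answer_idx) are hypothetical stand-ins, not MuChoMusic's actual schema.

```python
# Generic multiple-choice evaluation loop (illustrative; field names are
# hypothetical, not MuChoMusic's actual schema).
from typing import Callable

def evaluate(questions: list[dict],
             answer_fn: Callable[[str, str, list[str]], int]) -> float:
    """answer_fn maps (audio_path, question, options) to a chosen option index."""
    correct = 0
    for q in questions:
        pred = answer_fn(q["audio"], q["question"], q["options"])
        correct += int(pred == q["answer_idx"])
    return correct / len(questions)

# Trivial baseline: always pick the first option.
demo = [{"audio": "track.wav", "question": "Which instrument leads?",
         "options": ["sitar", "guitar", "violin", "ney"], "answer_idx": 0}]
print(evaluate(demo, lambda audio, question, options: 0))
```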
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation).
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
arXiv Detail & Related papers (2024-04-09T15:35:52Z)
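One plausible reading of bar synchronization is to interleave the same measure from every track so that aligned material stays adjacent in the token stream. The toy function below illustrates that idea on plain ABC strings; the tag format and the function itself are assumptions for illustration, not MuPT's actual SMT-ABC specification.

```python
# Hypothetical illustration of synchronizing multi-track ABC by interleaving
# bars; the real SMT-ABC specification is defined in the MuPT paper.
def interleave_tracks(tracks: dict[str, str]) -> str:
    """Group bar i of every track together so measures stay aligned."""
    split = {name: body.strip("|").split("|") for name, body in tracks.items()}
    n_bars = max(len(bars) for bars in split.values())
    out = []
    for i in range(n_bars):
        for name, bars in split.items():
            if i < len(bars):
                out.append(f"[{name}] {bars[i].strip()} |")
        out.append("")  # blank line between synchronized bar groups
    return "\n".join(out)

melody = "C D E F | G A B c | c B A G"
bass   = "C,2 G,2 | C,2 G,2 | C,4"
print(interleave_tracks({"V1": melody, "V2": bass}))
```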
- WikiMuTe: A web-sourced dataset of semantic descriptions for music audio [7.4327407361824935]
We present WikiMuTe, a new and open dataset containing rich semantic descriptions of music.
The data is sourced from Wikipedia's rich catalogue of articles covering musical works.
We train a model that jointly learns text and audio representations and performs cross-modal retrieval.
arXiv Detail & Related papers (2023-12-14T18:38:02Z)
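Once text and audio share an embedding space, cross-modal retrieval reduces to nearest-neighbor search. A minimal sketch, assuming precomputed embeddings from stand-in encoders rather than WikiMuTe's trained model:

```python
# Cross-modal retrieval sketch: rank audio clips by cosine similarity to a
# text query in a shared embedding space (embeddings here are random stand-ins).
import numpy as np

def rank_audio(text_emb: np.ndarray, audio_embs: np.ndarray) -> np.ndarray:
    """Return audio indices sorted by cosine similarity to the text embedding."""
    t = text_emb / np.linalg.norm(text_emb)
    a = audio_embs / np.linalg.norm(audio_embs, axis=1, keepdims=True)
    return np.argsort(a @ t)[::-1]

rng = np.random.default_rng(0)
query = rng.normal(size=128)              # embedding of e.g. "melancholic piano piece"
catalogue = rng.normal(size=(1000, 128))  # embeddings of 1000 audio clips
print(rank_audio(query, catalogue)[:5])   # top-5 retrieved clip indices
```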
- A Dataset for Greek Traditional and Folk Music: Lyra [69.07390994897443]
This paper presents a dataset for Greek Traditional and Folk music that includes 1570 pieces, summing to around 80 hours of data.
The dataset incorporates YouTube timestamped links for retrieving audio and video, along with rich metadata regarding instrumentation, geography and genre.
arXiv Detail & Related papers (2022-11-21T14:15:43Z)
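A timestamped-link dataset like this is typically materialized by downloading the audio and trimming it to the annotated span. The sketch below does this with yt-dlp and ffmpeg; the timestamp arguments and file names are assumptions for illustration, not Lyra's actual loader.

```python
# Sketch of retrieving one timestamped segment: download audio with yt-dlp,
# then trim to the annotated span with ffmpeg (names are hypothetical).
import subprocess
from yt_dlp import YoutubeDL

def fetch_segment(url: str, start: str, end: str, out: str = "piece.wav"):
    with YoutubeDL({"format": "bestaudio", "outtmpl": "full.%(ext)s"}) as ydl:
        info = ydl.extract_info(url)       # downloads the full audio track
        src = ydl.prepare_filename(info)
    subprocess.run(["ffmpeg", "-y", "-ss", start, "-to", end,
                    "-i", src, out], check=True)

# fetch_segment("https://www.youtube.com/watch?v=...", "00:01:10", "00:04:32")
```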
- Contrastive Audio-Language Learning for Music [13.699088044513562]
MusCALL is a framework for Music Contrastive Audio-Language Learning.
Our approach consists of a dual-encoder architecture that learns the alignment between pairs of music audio and descriptive sentences.
arXiv Detail & Related papers (2022-08-25T16:55:15Z)
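Dual-encoder alignment of this kind is usually trained with a symmetric contrastive (InfoNCE) objective over matched audio-sentence pairs. A minimal sketch with stand-in encoder outputs; this is the generic CLIP-style loss, not necessarily MusCALL's exact formulation:

```python
# Minimal dual-encoder contrastive sketch (stand-in encoder outputs; the real
# model uses dedicated audio and text backbones).
import torch
import torch.nn.functional as F

def clip_style_loss(audio_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE: matched (audio, sentence) pairs sit on the diagonal."""
    a = F.normalize(audio_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = a @ t.T / temperature
    labels = torch.arange(len(a))
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.T, labels)) / 2

audio_emb = torch.randn(8, 256)  # batch of audio-tower outputs
text_emb = torch.randn(8, 256)   # matching sentence-tower outputs
print(clip_style_loss(audio_emb, text_emb))
```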
- Listener Modeling and Context-aware Music Recommendation Based on Country Archetypes [10.19712238203935]
Music preferences are strongly shaped by the cultural and socio-economic background of the listener.
We use state-of-the-art unsupervised learning techniques to investigate country profiles of music preferences at the fine-grained level of music tracks.
We propose a context-aware music recommendation system that leverages implicit user feedback.
arXiv Detail & Related papers (2020-09-11T17:59:04Z)
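Country profiles of this sort can be sketched as clustering of per-country preference vectors. The example below uses k-means on synthetic data purely for illustration; the paper's actual unsupervised techniques and listening data differ.

```python
# Sketch of grouping countries into preference archetypes with unsupervised
# learning (synthetic data; the paper works with real listening events).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Rows: countries; columns: normalized play counts over, say, 500 tracks.
prefs = rng.random((60, 500))
prefs /= prefs.sum(axis=1, keepdims=True)

archetypes = KMeans(n_clusters=4, n_init=10, random_state=0).fit(prefs)
print(archetypes.labels_[:10])  # archetype assignment of the first 10 countries
```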
- dMelodies: A Music Dataset for Disentanglement Learning [70.90415511736089]
We present a new symbolic music dataset that will help researchers demonstrate the efficacy of their algorithms on diverse domains.
This will also provide a means for evaluating algorithms specifically designed for music.
The dataset is large enough (approx. 1.3 million data points) to train and test deep networks for disentanglement learning.
arXiv Detail & Related papers (2020-07-29T19:20:07Z)
- Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)
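A heavily simplified way to connect body motion to sound separation is to encode the keypoint sequence and let it condition a mask over the mixture spectrogram. The toy modules below illustrate that shape of pipeline; they are assumptions, not the paper's context-aware graph network or fusion model.

```python
# Toy audio-visual fusion sketch: keypoints -> motion summary -> spectrogram
# mask over the mixture (illustrative stand-in for the paper's pipeline).
import torch
import torch.nn as nn

class KeypointEncoder(nn.Module):
    def __init__(self, n_joints=25, dim=128):
        super().__init__()
        self.rnn = nn.GRU(n_joints * 2, dim, batch_first=True)

    def forward(self, kp):                 # kp: (batch, frames, joints, 2)
        b, f, j, c = kp.shape
        _, h = self.rnn(kp.reshape(b, f, j * c))
        return h[-1]                       # (batch, dim) motion summary

class MaskHead(nn.Module):
    def __init__(self, dim=128, n_freq=512):
        super().__init__()
        self.proj = nn.Linear(dim, n_freq)

    def forward(self, motion, mixture):    # mixture: (batch, n_freq, frames)
        mask = torch.sigmoid(self.proj(motion)).unsqueeze(-1)
        return mask * mixture              # spectrogram of the cued player

kp = torch.randn(2, 100, 25, 2)            # 100 frames of 2-D body keypoints
mix = torch.rand(2, 512, 400)              # mixture magnitude spectrogram
print(MaskHead()(KeypointEncoder()(kp), mix).shape)
```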
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.