Related papers: Modeling the Music Genre Perception across Language-Bound Cultures

URL: http://arxiv.org/abs/2010.06325v2
Date: Mon, 16 Nov 2020 11:43:50 GMT
Title: Modeling the Music Genre Perception across Language-Bound Cultures
Authors: Elena V. Epure and Guillaume Salha and Manuel Moussallam and Romain Hennequin
Abstract summary: We study the feasibility of obtaining relevant cross-lingual, culture-specific music genre annotations. We show that unsupervised cross-lingual music genre annotation is feasible with high accuracy. We introduce a new, domain-dependent cross-lingual corpus to benchmark state-of-the-art multilingual pre-trained embedding models.
Score: 10.223656553455003
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The music genre perception expressed through human annotations of artists or albums varies significantly across language-bound cultures. These variations cannot be modeled as mere translations since we also need to account for cultural differences in the music genre perception. In this work, we study the feasibility of obtaining relevant cross-lingual, culture-specific music genre annotations based only on language-specific semantic representations, namely distributed concept embeddings and ontologies. Our study, focused on six languages, shows that unsupervised cross-lingual music genre annotation is feasible with high accuracy, especially when combining both types of representations. This approach of studying music genres is the most extensive to date and has many implications in musicology and music information retrieval. Besides, we introduce a new, domain-dependent cross-lingual corpus to benchmark state-of-the-art multilingual pre-trained embedding models.

Related papers

GlobalMood: A cross-cultural benchmark for music emotion recognition [10.490374578193773]
'GlobalMood' is a novel cross-cultural benchmark dataset comprising 1,180 songs sampled from 59 countries. We implement a bottom-up, participant-driven approach to elicit culturally specific music-related emotion terms.
arXiv Detail & Related papers (2025-05-14T16:32:45Z)
Music for All: Exploring Multicultural Representations in Music Generation Models [13.568559786822457]
We present a study of the datasets and research papers for music generation. We find that only 5.7% of the total hours of existing music datasets come from non-Western genres.
arXiv Detail & Related papers (2025-02-11T07:46:29Z)
Multi-label Cross-lingual automatic music genre classification from lyrics with Sentence BERT [0.13654846342364302]
We present a multi-label, cross-lingual genre classification system based on multilingual sentence embeddings generated by sBERT. Using a bilingual Portuguese-English dataset with eight overlapping genres, we demonstrate the system's ability to train on lyrics in one language and predict genres in another.
arXiv Detail & Related papers (2025-01-07T13:22:35Z)
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models [11.834712543531756]
MuChoMusic is a benchmark for evaluating music understanding in multimodal language models focused on audio. It comprises 1,187 multiple-choice questions, all validated by human annotators, on 644 music tracks sourced from two publicly available music datasets. We evaluate five open-source models and identify several pitfalls, including an over-reliance on the language modality.
arXiv Detail & Related papers (2024-08-02T15:34:05Z)
Understanding Cross-Lingual Alignment -- A Survey [52.572071017877704]
Cross-lingual alignment is the meaningful similarity of representations across languages in multilingual language models. We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field.
arXiv Detail & Related papers (2024-04-09T11:39:53Z)
Multi-lingual and Multi-cultural Figurative Language Understanding [69.47641938200817]
Figurative language permeates human communication, but is relatively understudied in NLP. We create a dataset for seven diverse languages associated with a variety of cultures: Hindi, Indonesian, Javanese, Kannada, Sundanese, Swahili and Yoruba. Our dataset reveals that each language relies on cultural and regional concepts for figurative expressions, with the highest overlap between languages originating from the same region. All languages exhibit a significant deficiency compared to English, with variations in performance reflecting the availability of pre-training and fine-tuning data.
arXiv Detail & Related papers (2023-05-25T15:30:31Z)
A Dataset for Greek Traditional and Folk Music: Lyra [69.07390994897443]
This paper presents a dataset for Greek Traditional and Folk music that includes 1570 pieces, totaling around 80 hours of data. The dataset incorporates timestamped YouTube links for retrieving audio and video, along with rich metadata on instrumentation, geography and genre.
arXiv Detail & Related papers (2022-11-21T14:15:43Z)
Analyzing Gender Representation in Multilingual Models [59.21915055702203]
We focus on the representation of gender distinctions as a practical case study. We examine the extent to which the gender concept is encoded in shared subspaces across different languages.
arXiv Detail & Related papers (2022-04-20T00:13:01Z)
Genre-conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music [73.73045854068384]
We propose to transcribe the lyrics of polyphonic music using a novel genre-conditioned network. The proposed network adopts pre-trained model parameters, and incorporates the genre adapters between layers to capture different genre peculiarities for lyrics-genre pairs. Our experiments show that the proposed genre-conditioned network outperforms the existing lyrics transcription systems.
arXiv Detail & Related papers (2022-04-07T09:15:46Z)
Deception detection in text and its relation to the cultural dimension of individualism/collectivism [6.17866386107486]
We investigate whether differences in the usage of specific linguistic features of deception across cultures can be confirmed and attributed to norms with respect to the individualism/collectivism divide. We create culture/language-aware classifiers by experimenting with a wide range of n-gram features based on phonology, morphology and syntax. We conducted our experiments over 11 datasets in 5 languages (English, Dutch, Russian, Spanish and Romanian) from six countries (US, Belgium, India, Russia, Mexico and Romania).
arXiv Detail & Related papers (2021-05-26T13:09:47Z)
Multilingual Music Genre Embeddings for Effective Cross-Lingual Music Item Annotation [9.709229853995987]
By learning multilingual music genre embeddings, we enable cross-lingual music genre translation without relying on a parallel corpus. Our method is effective in translating music genres across tag systems in multiple languages.
arXiv Detail & Related papers (2020-09-16T15:39:04Z)
Listener Modeling and Context-aware Music Recommendation Based on Country Archetypes [10.19712238203935]
Music preferences are strongly shaped by the cultural and socio-economic background of the listener. We use state-of-the-art unsupervised learning techniques to investigate country profiles of music preferences on the fine-grained level of music tracks. We propose a context-aware music recommendation system that leverages implicit user feedback.
arXiv Detail & Related papers (2020-09-11T17:59:04Z)
Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source. We observe that our representations embed typology and strengthen correlations with language relationships. We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.