A Dataset for Greek Traditional and Folk Music: Lyra
- URL: http://arxiv.org/abs/2211.11479v1
- Date: Mon, 21 Nov 2022 14:15:43 GMT
- Title: A Dataset for Greek Traditional and Folk Music: Lyra
- Authors: Charilaos Papaioannou, Ioannis Valiantzas, Theodoros Giannakopoulos,
Maximos Kaliakatsos-Papakostas, Alexandros Potamianos
- Abstract summary: This paper presents a dataset for Greek Traditional and Folk music that includes 1570 pieces, totaling around 80 hours of data.
The dataset incorporates YouTube timestamped links for retrieving audio and video, along with rich metadata regarding instrumentation, geography and genre.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Studying under-represented music traditions within the scope of MIR is
crucial, not only for developing novel analysis tools, but also for unveiling
musical functions that might prove useful in studying world musics. This paper
presents a dataset for Greek Traditional and Folk music that includes 1570
pieces, totaling around 80 hours of data. The dataset incorporates YouTube
timestamped links for retrieving audio and video, along with rich metadata
regarding instrumentation, geography and genre, among others.
The content has been collected from a Greek documentary series available
online, in which academics present the music traditions of Greece through live
music and dance performances, along with discussions of the social, cultural
and musicological aspects of the presented music. This process has therefore
yielded a wealth of descriptions covering a variety of aspects, such as
musical genre, place of origin and musical instruments. In addition, the audio
recordings were made under strict production-level specifications in terms of
recording equipment, resulting in very clean and homogeneous audio content. In
this work, apart from presenting
the dataset in detail, we propose a baseline deep-learning classification
approach to recognize the involved musicological attributes. The dataset, the
baseline classification methods and the models are provided in public
repositories. Future directions for further refining the dataset are also
discussed.
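As an illustration of how the timestamped links can be combined with the metadata, the following minimal sketch reads one row of the metadata table and downloads only the annotated interval with yt-dlp. The file name and column names (lyra_metadata.tsv, youtube_id, start_ts, end_ts) are assumptions made for this sketch, not the dataset's documented schema; consult the public repository for the actual layout.

```python
# Hypothetical sketch: fetch one timestamped audio segment for the dataset.
# Column names below are assumptions, not the dataset's documented schema.
import subprocess

import pandas as pd

meta = pd.read_csv("lyra_metadata.tsv", sep="\t")  # assumed file name/format
row = meta.iloc[0]

url = f"https://www.youtube.com/watch?v={row['youtube_id']}"
section = f"*{row['start_ts']}-{row['end_ts']}"  # yt-dlp time-range syntax

# yt-dlp downloads only the requested section; -x extracts audio via ffmpeg.
subprocess.run(
    [
        "yt-dlp",
        "-x", "--audio-format", "wav",
        "--download-sections", section,
        "-o", f"audio/{row['youtube_id']}.%(ext)s",
        url,
    ],
    check=True,
)
```

Fetching only the annotated sections avoids downloading full documentary episodes, which matters for a corpus assembled from long broadcast videos.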
Related papers
- CHORDONOMICON: A Dataset of 666,000 Songs and their Chord Progressions (arXiv, 2024-10-29)
  Chordonomicon is a dataset of over 666,000 songs and their chord progressions, annotated with structural parts, genre, and release date.
  These characteristics make the Chordonomicon an ideal testbed for exploring advanced machine learning techniques.
- Foundation Models for Music: A Survey (arXiv, 2024-08-26)
  Foundation models (FMs) have profoundly impacted diverse sectors, including music.
  This comprehensive review examines state-of-the-art (SOTA) pre-trained models and foundation models in music.
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models (arXiv, 2024-06-07)
  We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
  We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
  Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
- MidiCaps: A large-scale MIDI dataset with text captions (arXiv, 2024-06-04)
  This work aims to enable research that combines LLMs with symbolic music by presenting the first openly available large-scale MIDI dataset with text captions.
  Inspired by recent advancements in captioning techniques, we present a curated dataset of over 168k MIDI files with textual descriptions.
- WikiMuTe: A web-sourced dataset of semantic descriptions for music audio (arXiv, 2023-12-14)
  We present WikiMuTe, a new and open dataset containing rich semantic descriptions of music.
  The data is sourced from Wikipedia's rich catalogue of articles covering musical works.
  We train a model that jointly learns text and audio representations and performs cross-modal retrieval.
- The Music Meta Ontology: a flexible semantic model for the interoperability of music metadata (arXiv, 2023-11-07)
  We introduce the Music Meta ontology to describe music metadata related to artists, compositions, performances, recordings, and links.
  We provide a first evaluation of the model, alignments to other schemas, and support for data transformation.
- From West to East: Who can understand the music of the others better? (arXiv, 2023-07-19)
  We leverage transfer learning methods to derive insights about similarities between different music cultures.
  We use two Western music datasets, two traditional/folk datasets from eastern Mediterranean cultures, and two datasets belonging to Indian art music.
  Three deep audio embedding models, two CNN-based and one Transformer-based, are trained and transferred across domains to perform auto-tagging on each target-domain dataset (a generic sketch of this transfer pattern follows this list).
- MARBLE: Music Audio Representation Benchmark for Universal Evaluation (arXiv, 2023-06-18)
  We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
  It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels: acoustic, performance, score, and high-level description.
  We then establish a unified protocol based on 14 tasks over 8 publicly available datasets, providing a fair and standard assessment of the representations of all open-sourced pre-trained models developed on music recordings as baselines.
- Music-to-Text Synaesthesia: Generating Descriptive Text from Music Recordings (arXiv, 2022-10-02)
  Music-to-text synaesthesia aims to generate descriptive texts that carry the same sentiment as the music recordings they describe, for further understanding.
  We build a computational model to generate sentences that can describe the content of a music recording.
  To handle highly non-discriminative classical music, we design a group topology-preservation loss.
- MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training (arXiv, 2021-06-10)
  Symbolic music understanding refers to understanding music from symbolic data.
  MusicBERT is a large-scale pre-trained model for music understanding.
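The transfer setup described in the "From West to East" entry can be pictured with a short, generic PyTorch sketch: train a mel-spectrogram CNN on a source-domain tag vocabulary, keep its convolutional trunk, and fine-tune a new head for the target domain. All layer sizes, tag counts and the commented-out checkpoint name are illustrative assumptions, not the architectures used in that paper or in the Lyra baseline.

```python
# Generic illustration of cross-domain transfer for audio auto-tagging.
# Nothing here reproduces the cited papers' actual models or hyperparameters.
import torch
import torch.nn as nn

class MelCNN(nn.Module):
    """Small CNN over (batch, 1, mel_bins, frames) mel-spectrogram inputs."""
    def __init__(self, n_tags: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, n_tags)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

# 1) A model trained on a large source-domain tag set (weights hypothetical).
source_model = MelCNN(n_tags=50)
# source_model.load_state_dict(torch.load("source_domain.pt"))  # hypothetical

# 2) Transfer: copy the convolutional trunk, attach a fresh head sized for the
#    target domain's tags, and fine-tune with a multi-label BCE objective.
target_model = MelCNN(n_tags=30)
target_model.features.load_state_dict(source_model.features.state_dict())

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(target_model.parameters(), lr=1e-4)

x = torch.randn(8, 1, 128, 256)            # dummy mel-spectrogram batch
y = torch.randint(0, 2, (8, 30)).float()   # dummy multi-hot tag targets
optimizer.zero_grad()
loss = criterion(target_model(x), y)
loss.backward()
optimizer.step()
```

Freezing the trunk and training only the head is a common low-data variant of the same pattern; which works better depends on how far the target tradition is from the source domain.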
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and accepts no responsibility for any consequences of its use.