Can MusicGen Create Training Data for MIR Tasks?
- URL: http://arxiv.org/abs/2311.09094v1
- Date: Wed, 15 Nov 2023 16:41:56 GMT
- Title: Can MusicGen Create Training Data for MIR Tasks?
- Authors: Nadine Kroher, Helena Cuesta, Aggelos Pikrakis
- Abstract summary: We are investigating the broader concept of using AI-based generative music systems to generate training data for Music Information Retrieval tasks.
We constructed over 50 000 genre-conditioned textual descriptions and generated a collection of music excerpts that covers five musical genres.
Preliminary results show that the proposed model can learn genre-specific characteristics from artificial music tracks that generalise well to real-world music recordings.
- Score: 3.8980564330208662
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We are investigating the broader concept of using AI-based generative music
systems to generate training data for Music Information Retrieval (MIR) tasks.
To kick off this line of work, we ran an initial experiment in which we trained
a genre classifier on a fully artificial music dataset created with MusicGen.
We constructed over 50 000 genre-conditioned textual descriptions and
generated a collection of music excerpts that covers five musical genres. Our
preliminary results show that the proposed model can learn genre-specific
characteristics from artificial music tracks that generalise well to real-world
music recordings.
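The abstract does not say how the genre-conditioned descriptions were assembled, so the following is only a minimal sketch of one plausible approach: combining a genre label with attribute words through a text template. The function name `build_prompts`, the template wording, and all vocabularies (including the five genre names) are illustrative assumptions, not taken from the paper.

```python
import itertools
import random


def build_prompts(genres, moods, instruments, tempos, per_genre, seed=0):
    """Return {genre: [prompt, ...]} with `per_genre` distinct prompts per genre.

    Hypothetical helper: the template and attribute types are assumptions.
    """
    rng = random.Random(seed)
    # Every (mood, instrument, tempo) combination is a candidate description.
    combos = list(itertools.product(moods, instruments, tempos))
    if per_genre > len(combos):
        raise ValueError("not enough attribute combinations for per_genre")
    prompts = {}
    for genre in genres:
        # Sampling without replacement keeps prompts distinct within a genre.
        chosen = rng.sample(combos, per_genre)
        prompts[genre] = [
            f"{mood} {genre} track featuring {instrument} at a {tempo} tempo"
            for mood, instrument, tempo in chosen
        ]
    return prompts


# Illustrative vocabularies (assumptions, not the paper's actual lists).
GENRES = ["rock", "jazz", "classical", "hip hop", "electronic"]
MOODS = ["energetic", "mellow", "dark", "uplifting"]
INSTRUMENTS = ["electric guitar", "piano", "strings", "synthesizer"]
TEMPOS = ["slow", "medium", "fast"]

prompts = build_prompts(GENRES, MOODS, INSTRUMENTS, TEMPOS, per_genre=10)
```

Each prompt carries its genre as a label, so text passed to a text-to-music model such as MusicGen would yield excerpts that are already annotated for classifier training.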
Related papers
- Audio Processing using Pattern Recognition for Music Genre Classification [0.0]
This project explores the application of machine learning techniques for music genre classification using the GTZAN dataset.
Motivated by the growing demand for personalized music recommendations, we focused on classifying five genres: Blues, Classical, Jazz, Hip Hop, and Country.
The ANN model demonstrated the best performance, achieving a validation accuracy of 92.44%.
arXiv Detail & Related papers (2024-10-19T05:44:05Z)
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation).
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
arXiv Detail & Related papers (2024-04-09T15:35:52Z)
- Music Genre Classification with ResNet and Bi-GRU Using Visual Spectrograms [4.354842354272412]
The limitations of manual genre classification have highlighted the need for a more advanced system.
Traditional machine learning techniques have shown potential in genre classification, but fail to capture the full complexity of music data.
This study proposes a novel approach that takes visual spectrograms as input and uses a hybrid model combining the strengths of the Residual Neural Network (ResNet) and the Gated Recurrent Unit (GRU).
arXiv Detail & Related papers (2023-07-20T11:10:06Z)
- MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks on 8 publicly available datasets, providing a fair and standard assessment of representations of all open-sourced pre-trained models developed on music recordings as baselines.
arXiv Detail & Related papers (2023-06-18T12:56:46Z)
- Simple and Controllable Music Generation [94.61958781346176]
MusicGen is a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens.
Unlike prior work, MusicGen is comprised of a single-stage transformer LM together with efficient token interleaving patterns.
arXiv Detail & Related papers (2023-06-08T15:31:05Z)
- A Dataset for Greek Traditional and Folk Music: Lyra [69.07390994897443]
This paper presents a dataset for Greek Traditional and Folk music that includes 1570 pieces, totalling around 80 hours of data.
The dataset incorporates timestamped YouTube links for retrieving audio and video, along with rich metadata on instrumentation, geography and genre.
arXiv Detail & Related papers (2022-11-21T14:15:43Z)
- Evaluating Deep Music Generation Methods Using Data Augmentation [13.72212417973239]
We focus on a homogeneous, objective framework for evaluating samples of algorithmically generated music.
We do not seek to assess the musical merit of generated music, but instead explore whether generated samples contain meaningful information pertaining to emotion or mood/theme.
arXiv Detail & Related papers (2021-12-31T20:35:46Z)
- Personalized Popular Music Generation Using Imitation and Structure [1.971709238332434]
We propose a statistical machine learning model that is able to capture and imitate the structure, melody, chord, and bass style from a given example seed song.
An evaluation using 10 pop songs shows that our new representations and methods are able to create high-quality stylistic music.
arXiv Detail & Related papers (2021-05-10T23:43:00Z)
- Artificial Musical Intelligence: A Survey [51.477064918121336]
Music has become an increasingly prevalent domain of machine learning and artificial intelligence research.
This article provides a definition of musical intelligence, introduces a taxonomy of its constituent components, and surveys the wide range of AI methods that can be, and have been, brought to bear in its pursuit.
arXiv Detail & Related papers (2020-06-17T04:46:32Z)
- Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis [91.3755431537592]
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.
The main hypothesis of this work is based on the observation that certain expressive categories such as genre or theme can be recognized on the basis of the visual content alone.
The experiments are conducted for three MIR tasks: Artist Identification, Music Genre Classification and Cross-Genre Classification.
arXiv Detail & Related papers (2020-02-01T17:57:14Z)