A Comprehensive Survey for Evaluation Methodologies of AI-Generated
Music
- URL: http://arxiv.org/abs/2308.13736v1
- Date: Sat, 26 Aug 2023 02:44:33 GMT
- Title: A Comprehensive Survey for Evaluation Methodologies of AI-Generated
Music
- Authors: Zeyu Xiong, Weitao Wang, Jing Yu, Yue Lin, Ziyan Wang
- Abstract summary: This study aims to comprehensively evaluate the subjective, objective, and combined methodologies for assessing AI-generated music.
Ultimately, this study provides a valuable reference for unifying generative AI in the field of music evaluation.
- Score: 14.453416870193072
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, AI-generated music has made significant progress, with
several models performing well in multimodal and complex musical genres and
scenes. While objective metrics can be used to evaluate generative music, they
often lack interpretability for musical evaluation. Therefore, researchers
often resort to subjective user studies to assess the quality of the generated
works, which can be resource-intensive and less reproducible than objective
metrics. This study aims to comprehensively evaluate the subjective, objective,
and combined methodologies for assessing AI-generated music, highlighting the
advantages and disadvantages of each approach. Ultimately, this study provides
a valuable reference for unifying generative AI in the field of music
evaluation.
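The abstract contrasts objective metrics with subjective user studies. As an illustration of what a simple objective metric looks like in practice, the sketch below computes pitch-class histogram entropy for a symbolic melody, one commonly used descriptor in music-generation evaluation; the exact formulation varies between the papers the survey covers, so this is an assumption-laden example, not any specific paper's metric.

```python
# Minimal sketch of one objective metric for symbolic music:
# pitch-class histogram entropy (illustrative only).
import math
from collections import Counter

def pitch_class_entropy(midi_pitches):
    """Shannon entropy (bits) of the 12-bin pitch-class histogram.

    Low values suggest a strongly tonal excerpt; values approaching
    log2(12) ~ 3.58 mean pitches are spread evenly over all twelve classes.
    """
    if not midi_pitches:
        raise ValueError("empty pitch sequence")
    counts = Counter(p % 12 for p in midi_pitches)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# C major scale fragment: 7 distinct pitch classes, each used once.
print(round(pitch_class_entropy([60, 62, 64, 65, 67, 69, 71]), 3))  # 2.807
```

Metrics like this are cheap and reproducible but, as the abstract notes, hard to interpret musically on their own, which is why subjective studies remain common.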
Related papers
- Applications and Advances of Artificial Intelligence in Music Generation: A Review [0.04551615447454769]
This paper provides a systematic review of the latest research advancements in AI music generation.
It covers key technologies, models, datasets, evaluation methods, and their practical applications across various fields.
arXiv Detail & Related papers (2024-09-03T13:50:55Z) - Foundation Models for Music: A Survey [77.77088584651268]
Foundation models (FMs) have profoundly impacted diverse sectors, including music.
This comprehensive review examines state-of-the-art (SOTA) pre-trained models and foundation models in music.
arXiv Detail & Related papers (2024-08-26T15:13:14Z) - Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-efficient Approach [49.2787113554916]
Estimating music piece difficulty is important for organizing educational music collections.
Our work employs explainable descriptors for difficulty estimation in symbolic music representations.
Our approach, evaluated on piano repertoire categorized into 9 classes, achieved 41.4% accuracy independently, with a mean squared error (MSE) of 1.7.
arXiv Detail & Related papers (2024-08-01T11:23:42Z) - Between the AI and Me: Analysing Listeners' Perspectives on AI- and Human-Composed Progressive Metal Music [1.2874569408514918]
We explore participants' perspectives on AI- vs human-generated progressive metal, using rock music as a control group.
We propose a mixed methods approach to assess the effects of generation type (human vs. AI), genre (progressive metal vs. rock), and curation process (random vs. cherry-picked).
Our findings validate the use of fine-tuning to achieve genre-specific specialization in AI music generation.
Despite some AI-generated excerpts receiving similar ratings to human music, listeners exhibited a preference for human compositions.
arXiv Detail & Related papers (2024-07-31T14:03:45Z) - Towards Assessing Data Replication in Music Generation with Music Similarity Metrics on Raw Audio [25.254669525489923]
We present a model-independent open evaluation method based on diverse audio music similarity metrics to assess data replication.
Our results show that the proposed methodology can detect exact data replication at proportions higher than 10%.
arXiv Detail & Related papers (2024-07-19T14:52:11Z) - Deep Learning-Based Object Pose Estimation: A Comprehensive Survey [73.74933379151419]
We discuss the recent advances in deep learning-based object pose estimation.
Our survey also covers multiple input data modalities, degrees-of-freedom of output poses, object properties, and downstream tasks.
arXiv Detail & Related papers (2024-05-13T14:44:22Z) - A Survey of Music Generation in the Context of Interaction [3.6522809408725223]
Machine learning has been successfully used to compose and generate music, both melodies and polyphonic pieces.
Most of these models are not suitable for human-machine co-creation through live interaction.
arXiv Detail & Related papers (2024-02-23T12:41:44Z) - MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks on 8 publicly available datasets, providing a fair and standard assessment of representations of all open-sourced pre-trained models developed on music recordings as baselines.
arXiv Detail & Related papers (2023-06-18T12:56:46Z) - An Order-Complexity Model for Aesthetic Quality Assessment of Symbolic
Homophony Music Scores [8.751312368054016]
The quality of music scores generated by AI is relatively poor compared with those created by human composers.
This paper proposes an objective quantitative evaluation method for homophony music score aesthetic quality assessment.
arXiv Detail & Related papers (2023-01-14T12:30:16Z) - Exploring the Efficacy of Pre-trained Checkpoints in Text-to-Music
Generation Task [86.72661027591394]
We generate complete and semantically consistent symbolic music scores from text descriptions.
We explore the efficacy of using publicly available checkpoints for natural language processing in the task of text-to-music generation.
Our experimental results show that the improvement from using pre-trained checkpoints is statistically significant in terms of BLEU score and edit distance similarity.
arXiv Detail & Related papers (2022-11-21T07:19:17Z) - Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with
Visual Computing for Improved Music Video Analysis [91.3755431537592]
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.
The main hypothesis of this work is based on the observation that certain expressive categories such as genre or theme can be recognized on the basis of the visual content alone.
The experiments are conducted for three MIR tasks Artist Identification, Music Genre Classification and Cross-Genre Classification.
arXiv Detail & Related papers (2020-02-01T17:57:14Z)
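Several entries above report sequence-level metrics such as edit distance similarity (e.g., the text-to-music checkpoint study). As a minimal sketch, the normalization below (1 minus Levenshtein distance over the longer sequence length) is an assumption; the cited papers do not specify their exact formulations here.

```python
# Edit-distance similarity between two token sequences (minimal sketch).

def edit_distance(a, b):
    """Classic Levenshtein distance via a single-row dynamic program."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                       # deletion
                        dp[j - 1] + 1,                   # insertion
                        prev + (a[i - 1] != b[j - 1]))   # substitution
            prev = cur
    return dp[n]

def edit_similarity(a, b):
    """Normalized similarity in [0, 1]; 1.0 means identical sequences."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))

# Two of three note tokens match, so similarity is 2/3.
print(edit_similarity("C4 E4 G4".split(), "C4 E4 A4".split()))
```

Working on token sequences rather than characters makes the same routine applicable to symbolic music (note events) and to text, which is why it appears across both evaluation settings.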
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences.