Survey on the Evaluation of Generative Models in Music
- URL: http://arxiv.org/abs/2506.05104v1
- Date: Thu, 05 Jun 2025 14:46:04 GMT
- Title: Survey on the Evaluation of Generative Models in Music
- Authors: Alexander Lerch, Claire Arthur, Nick Bryan-Kinns, Corey Ford, Qianyi Sun, Ashvala Vinay,
- Abstract summary: We provide an interdisciplinary review of the common evaluation targets, methodologies, and metrics for the evaluation of generative systems in music. We discuss the advantages and challenges of such approaches from a musicological, an engineering, and an HCI perspective.
- Score: 45.676474879179366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Research on generative systems in music has seen considerable attention and growth in recent years. A variety of attempts have been made to systematically evaluate such systems. We provide an interdisciplinary review of the common evaluation targets, methodologies, and metrics for the evaluation of both system output and model usability, covering subjective and objective approaches, qualitative and quantitative approaches, as well as empirical and computational methods. We discuss the advantages and challenges of such approaches from a musicological, an engineering, and an HCI perspective.
Related papers
- A Survey on Interpretability in Visual Recognition [28.577223694381452]
This paper systematically reviews existing research on the interpretability of visual recognition models. We propose a taxonomy of methods from a human-centered perspective. We aim to organize existing research in this domain and inspire future investigations into the interpretability of visual recognition models.
arXiv Detail & Related papers (2025-07-15T08:45:54Z)
- Large Language Model Psychometrics: A Systematic Review of Evaluation, Validation, and Enhancement [16.608577295968942]
The rapid advancement of large language models (LLMs) has outpaced traditional evaluation methodologies. Psychometrics is the science of quantifying the intangible aspects of human psychology, such as personality, values, and intelligence. This survey introduces and synthesizes an emerging interdisciplinary field of LLM Psychometrics.
arXiv Detail & Related papers (2025-05-13T05:47:51Z)
- A Comprehensive Review on Hashtag Recommendation: From Traditional to Deep Learning and Beyond [0.37865171120254354]
Hashtags, as a fundamental categorization mechanism, play a pivotal role in enhancing content visibility and user engagement. The development of accurate and robust hashtag recommendation systems remains a complex and evolving research challenge. This review article conducts a systematic analysis of hashtag recommendation systems, examining recent advancements across several dimensions.
arXiv Detail & Related papers (2025-03-24T13:40:36Z) - Evaluating Human-AI Collaboration: A Review and Methodological Framework [4.41358655687435]
The use of artificial intelligence (AI) in working environments with individuals, known as Human-AI Collaboration (HAIC), has become essential. Evaluating HAIC's effectiveness remains challenging due to the complex interaction of the components involved. This paper provides a detailed analysis of existing HAIC evaluation approaches and develops a fresh paradigm for more effectively evaluating these systems.
arXiv Detail & Related papers (2024-07-09T12:52:22Z) - A survey on the impact of AI-based recommenders on human behaviours: methodologies, outcomes and future directions [3.802956917145726]
This survey analyses the impact of recommenders in four human-AI ecosystems.
Social media, online retail, urban mapping and generative AI ecosystems are studied.
arXiv Detail & Related papers (2024-06-29T14:34:32Z) - Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition [70.60872754129832]
First NeurIPS competition on unlearning sought to stimulate the development of novel algorithms.
Nearly 1,200 teams from across the world participated.
We analyze top solutions and delve into discussions on benchmarking unlearning.
arXiv Detail & Related papers (2024-06-13T12:58:00Z) - Image Quality Assessment in the Modern Age [53.19271326110551]
This tutorial provides the audience with the basic theories, methodologies, and current progress of image quality assessment (IQA).
We will first revisit several subjective quality assessment methodologies, with emphasis on how to properly select visual stimuli.
Both hand-engineered and (deep) learning-based methods will be covered.
arXiv Detail & Related papers (2021-10-19T02:38:46Z) - An Extensible Benchmark Suite for Learning to Simulate Physical Systems [60.249111272844374]
We introduce a set of benchmark problems to take a step towards unified benchmarks and evaluation protocols.
We propose four representative physical systems, as well as a collection of both widely used classical time-based and representative data-driven methods.
arXiv Detail & Related papers (2021-08-09T17:39:09Z) - Weakly Supervised Object Localization and Detection: A Survey [145.5041117184952]
Weakly supervised object localization and detection plays an important role in developing new-generation computer vision systems.
We review (1) classic models, (2) approaches with feature representations from off-the-shelf deep networks, (3) approaches solely based on deep learning, and (4) publicly available datasets and standard evaluation metrics that are widely used in this field.
We discuss the key challenges in this field, its development history, the advantages and disadvantages of the methods in each category, relationships between methods in different categories, applications of weakly supervised object localization and detection methods, and potential future directions to further promote the development of this research field.
arXiv Detail & Related papers (2021-04-16T06:44:50Z) - Through the Data Management Lens: Experimental Analysis and Evaluation
of Fair Classification [75.49600684537117]
Data management research is showing an increasing presence and interest in topics related to data and algorithmic fairness.
We contribute a broad analysis of 13 fair classification approaches and additional variants, over their correctness, fairness, efficiency, scalability, and stability.
Our analysis highlights novel insights on the impact of different metrics and high-level approach characteristics on different aspects of performance.
arXiv Detail & Related papers (2021-01-18T22:55:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.