Exploring Variational Auto-Encoder Architectures, Configurations, and
Datasets for Generative Music Explainable AI
- URL: http://arxiv.org/abs/2311.08336v1
- Date: Tue, 14 Nov 2023 17:27:30 GMT
- Title: Exploring Variational Auto-Encoder Architectures, Configurations, and
Datasets for Generative Music Explainable AI
- Authors: Nick Bryan-Kinns, Bingyuan Zhang, Songyan Zhao and Berker Banar
- Abstract summary: Generative AI models for music and the arts are increasingly complex and hard to understand.
One approach to making generative AI models more understandable is to impose a small number of semantically meaningful attributes on generative AI models.
This paper contributes a systematic examination of the impact that different combinations of Variational Auto-Encoder models (MeasureVAE and AdversarialVAE) have on music generation performance.
- Score: 7.391173255888337
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative AI models for music and the arts in general are increasingly
complex and hard to understand. The field of eXplainable AI (XAI) seeks to make
complex and opaque AI models such as neural networks more understandable to
people. One approach to making generative AI models more understandable is to
impose a small number of semantically meaningful attributes on generative AI
models. This paper contributes a systematic examination of the impact that
different combinations of Variational Auto-Encoder models (MeasureVAE and
AdversarialVAE), configurations of latent space in the AI model (from 4 to 256
latent dimensions), and training datasets (Irish folk, Turkish folk, Classical,
and pop) have on music generation performance when 2 or 4 meaningful musical
attributes are imposed on the generative model. To date there have been no
systematic comparisons of such models at this level of combinatorial detail.
Our findings show that MeasureVAE has better reconstruction performance than
AdversarialVAE, which has better musical attribute independence. Results
demonstrate that MeasureVAE was able to generate music across music genres with
interpretable musical dimensions of control, and performs best with low
complexity music such as pop and rock. We recommend that a 32 or 64 latent
dimensional space is optimal for 4 regularised dimensions when using MeasureVAE
to generate music across genres. Our results are the first detailed comparisons
of configurations of state-of-the-art generative AI models for music and can be
used to help select and configure AI models, musical features, and datasets for
more understandable generation of music.
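The central technique the paper examines is imposing semantically meaningful attributes on specific latent dimensions. A common way to do this (in the spirit of the attribute-regularised VAEs the paper studies; the exact loss used by MeasureVAE's regularisation may differ) is a pairwise-ordering penalty that encourages one latent dimension to increase monotonically with a musical attribute such as note density. A minimal NumPy sketch, where `delta` is an assumed scaling hyperparameter:

```python
import numpy as np

def attribute_regularisation_loss(z_dim, attr, delta=1.0):
    """Penalise disagreement between the ordering of one latent dimension
    and the ordering of a musical attribute across a batch.

    z_dim: (batch,) values of a single latent dimension
    attr:  (batch,) attribute values (e.g. note density) for the same items
    """
    dz = z_dim[:, None] - z_dim[None, :]   # pairwise latent differences
    da = attr[:, None] - attr[None, :]     # pairwise attribute differences
    # small when the latent ordering matches the attribute ordering
    return np.mean(np.abs(np.tanh(delta * dz) - np.sign(da)))
```

During training this term would be added to the usual VAE reconstruction and KL losses, one such term per regularised dimension (2 or 4 in the paper's experiments).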
Related papers
- Reducing Barriers to the Use of Marginalised Music Genres in AI [7.140590440016289]
This project aims to explore the eXplainable AI (XAI) challenges and opportunities associated with reducing barriers to using marginalised genres of music with AI models.
XAI opportunities identified included topics of improving transparency and control of AI models, explaining the ethics and bias of AI models, fine tuning large models with small datasets to reduce bias, and explaining style-transfer opportunities with AI models.
We are now building on this project to bring together a global International Responsible AI Music community and invite people to join our network.
arXiv Detail & Related papers (2024-07-18T12:10:04Z)
- AI-Generated Images as Data Source: The Dawn of Synthetic Era [61.879821573066216]
Generative AI has unlocked the potential to create synthetic images that closely resemble real-world photographs.
This paper explores the innovative concept of harnessing these AI-generated images as new data sources.
In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability.
arXiv Detail & Related papers (2023-10-03T06:55:19Z)
- An Autoethnographic Exploration of XAI in Algorithmic Composition [7.775986202112564]
This paper introduces an autoethnographic study of the use of the MeasureVAE generative music XAI model with interpretable latent dimensions trained on Irish music.
Findings suggest that the exploratory nature of the music-making workflow foregrounds musical features of the training dataset rather than features of the generative model itself.
arXiv Detail & Related papers (2023-08-11T12:03:17Z)
- Exploring XAI for the Arts: Explaining Latent Space in Generative Music [5.91328657300926]
We show how a latent variable model for music generation can be made more explainable.
We use latent space regularisation to force some specific dimensions of the latent space to map to meaningful musical attributes.
We also provide a visualisation of the musical attributes in the latent space to help people understand and predict the effect of changes to latent space dimensions.
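Once a dimension is regularised to track a musical attribute, control amounts to sweeping that dimension while holding the others fixed and decoding each setting. A minimal sketch of that traversal loop, with a toy `decode` standing in for a trained model's decoder (both the decoder and its "note density" readout here are hypothetical placeholders):

```python
import numpy as np

def decode(z):
    # Toy stand-in for a trained decoder: maps the first latent
    # dimension to a bounded "note density" value via a sigmoid.
    return {"note_density": float(1.0 / (1.0 + np.exp(-z[0])))}

def traverse_dimension(z, dim, values):
    """Sweep one regularised latent dimension over `values`, holding the
    remaining dimensions fixed, and collect the decoded outputs."""
    outputs = []
    for v in values:
        z_mod = z.copy()
        z_mod[dim] = v
        outputs.append(decode(z_mod))
    return outputs
```

Plotting the decoded attribute against the swept values is exactly the kind of visualisation the paper describes for helping people predict the effect of latent-space changes.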
arXiv Detail & Related papers (2023-08-10T10:59:24Z)
- MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks on 8 publicly available datasets, providing a fair and standard assessment of representations of all open-sourced pre-trained models developed on music recordings as baselines.
arXiv Detail & Related papers (2023-06-18T12:56:46Z)
- Simple and Controllable Music Generation [94.61958781346176]
MusicGen is a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens.
Unlike prior work, MusicGen comprises a single-stage transformer LM together with efficient token interleaving patterns.
arXiv Detail & Related papers (2023-06-08T15:31:05Z)
- A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT [63.58711128819828]
ChatGPT and other Generative AI (GAI) techniques belong to the category of Artificial Intelligence Generated Content (AIGC).
The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace.
arXiv Detail & Related papers (2023-03-07T20:36:13Z)
- Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction [79.23730812282093]
We introduce Greedy Hierarchical Variational Autoencoders (GHVAEs), a method that learns high-fidelity video predictions by greedily training each level of a hierarchical autoencoder.
GHVAEs provide 17-55% gains in prediction performance on four video datasets, a 35-40% higher success rate on real robot tasks, and can improve performance monotonically by simply adding more modules.
arXiv Detail & Related papers (2021-03-06T18:58:56Z)
- AI Song Contest: Human-AI Co-Creation in Songwriting [8.399688944263843]
We present findings on what 13 musician/developer teams, a total of 61 users, needed when co-creating a song with AI.
We show how they leveraged and repurposed existing characteristics of AI to overcome some of these challenges.
Findings reflect a need to design machine learning-powered music interfaces that are more decomposable, steerable, interpretable, and adaptive.
arXiv Detail & Related papers (2020-10-12T01:27:41Z)
- RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning [69.20460466735852]
This paper presents a deep reinforcement learning algorithm for online accompaniment generation.
The proposed algorithm is able to respond to the human part and generate a melodic, harmonic and diverse machine part.
arXiv Detail & Related papers (2020-02-08T03:53:52Z)
- Learning Style-Aware Symbolic Music Representations by Adversarial Autoencoders [9.923470453197657]
We focus on leveraging adversarial regularization as a flexible and natural means to imbue variational autoencoders with context information.
We introduce the first Music Adversarial Autoencoder (MusAE).
Our model has a higher reconstruction accuracy than state-of-the-art models based on standard variational autoencoders.
arXiv Detail & Related papers (2020-01-15T18:07:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.