Reinforcement Learning Jazz Improvisation: When Music Meets Game Theory
- URL: http://arxiv.org/abs/2403.03224v1
- Date: Sun, 25 Feb 2024 16:46:15 GMT
- Title: Reinforcement Learning Jazz Improvisation: When Music Meets Game Theory
- Authors: Vedant Tapiavala, Joshua Piesner, Sourjyamoy Barman, Feng Fu
- Abstract summary: We introduce a novel mathematical game theory model for jazz improvisation.
We use reinforcement learning to explore diverse improvisational strategies and their pairwise performance.
Our work lays the foundation for promising applications beyond jazz.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Live music performances are captivating in part because of the unpredictability of improvisation, which arises from the dynamics between musicians and their interactions with the audience. Jazz improvisation is a particularly noteworthy example for theoretical investigation. Here, we introduce a novel mathematical game theory model for jazz improvisation, providing a framework for studying music theory and improvisational methodologies. We use computational modeling, mainly reinforcement learning, to explore diverse stochastic improvisational strategies and their pairwise performance. We find that the most effective strategy pair combines a strategy that reacts to the most recent payoff (Stepwise Changes) with a reinforcement learning strategy restricted to the notes of the given chord (Chord-Following Reinforcement Learning). Conversely, the strategy pair built around Harmony Prediction, a strategy that reacts to the partner's last note and attempts to harmonize with it, yields the lowest non-control payoff and the highest standard deviation, indicating that picking notes based on immediate reactions to the partner can yield inconsistent outcomes. On average, the Chord-Following Reinforcement Learning strategy achieves the highest mean payoff, while Harmony Prediction achieves the lowest. Our work lays the foundation for promising applications beyond jazz, including the use of artificial intelligence (AI) models to extract data from audio clips to refine musical reward systems, and the training of machine learning (ML) models on existing jazz solos to further refine strategies within the game.
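The pairing described above can be sketched as a toy simulation. The consonance payoffs, the chord choice, the step rule, and the value-update rule below are illustrative assumptions for exposition, not the authors' actual model; only the strategy names (Stepwise Changes, Chord-Following Reinforcement Learning) come from the abstract.

```python
import random

# Toy sketch: two players pick pitch classes each round; the shared payoff
# rewards consonant intervals between their notes (values are assumptions).
CONSONANCE = {0: 1.0, 1: 0.0, 2: 0.3, 3: 0.7, 4: 0.7, 5: 0.8,
              6: 0.1, 7: 0.9, 8: 0.5, 9: 0.6, 10: 0.3, 11: 0.0}

def payoff(a, b):
    """Payoff from the interval (in semitones) between the two notes."""
    return CONSONANCE[(a - b) % 12]

def stepwise_changes(note, step, last_payoff, new_payoff):
    """Stepwise Changes: move by one step; reverse direction if payoff fell."""
    if new_payoff < last_payoff:
        step = -step
    return (note + step) % 12, step

def chord_following_rl(q, chord, eps=0.1):
    """Chord-Following RL: epsilon-greedy choice restricted to chord tones."""
    if random.random() < eps:
        return random.choice(chord)
    return max(chord, key=lambda n: q[n])

def simulate(rounds=1000, alpha=0.1, seed=0):
    random.seed(seed)
    chord = [0, 4, 7]                    # assumed chord: a C-major triad
    q = {n: 0.0 for n in chord}          # action values for the RL player
    a, step, last_p, total = 0, 1, 0.0, 0.0
    for _ in range(rounds):
        b = chord_following_rl(q, chord)
        p = payoff(a, b)
        q[b] += alpha * (p - q[b])       # incremental value update
        a, step = stepwise_changes(a, step, last_p, p)
        last_p = p
        total += p
    return total / rounds

print(f"mean payoff per round: {simulate():.3f}")
```

Averaging the shared payoff over many rounds, as `simulate` does, is one simple way to compare strategy pairs in the spirit of the paper's pairwise evaluation.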
Related papers
- ImprovNet: Generating Controllable Musical Improvisations with Iterative Corruption Refinement [6.873190001575463]
ImprovNet is a transformer-based architecture that generates expressive and controllable musical improvisations.
It can perform cross-genre and intra-genre improvisations, harmonize melodies with genre-specific styles, and execute short prompt continuation and infilling tasks.
arXiv Detail & Related papers (2025-02-06T21:45:38Z)
- MuseBarControl: Enhancing Fine-Grained Control in Symbolic Music Generation through Pre-Training and Counterfactual Loss [51.85076222868963]
We introduce a pre-training task designed to link control signals directly with corresponding musical tokens.
We then implement a novel counterfactual loss that promotes better alignment between the generated music and the control prompts.
arXiv Detail & Related papers (2024-07-05T08:08:22Z)
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation)
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
arXiv Detail & Related papers (2024-04-09T15:35:52Z)
- Algorithmic Collective Action in Recommender Systems: Promoting Songs by Reordering Playlists [10.681288493631978]
We investigate algorithmic collective action in transformer-based recommender systems.
Our use case is a music streaming platform where a collective of fans aims to promote the visibility of an underrepresented artist.
We introduce two easily implementable strategies for selecting the position at which to insert the song so as to boost recommendations at test time.
arXiv Detail & Related papers (2024-03-19T23:27:15Z)
- MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning [62.065503126104126]
We study how a principal can efficiently and effectively intervene on the rewards of a previously unseen learning agent in order to induce desirable outcomes.
This is relevant to many real-world settings like auctions or taxation, where the principal may not know the learning behavior nor the rewards of real people.
We introduce MERMAIDE, a model-based meta-learning framework to train a principal that can quickly adapt to out-of-distribution agents.
arXiv Detail & Related papers (2023-04-10T15:44:50Z)
- Imitating, Fast and Slow: Robust learning from demonstrations via decision-time planning [96.72185761508668]
Planning at Test-time (IMPLANT) is a new meta-algorithm for imitation learning.
We demonstrate that IMPLANT significantly outperforms benchmark imitation learning approaches on standard control environments.
arXiv Detail & Related papers (2022-04-07T17:16:52Z)
- A Ranking Game for Imitation Learning [22.028680861819215]
We treat imitation as a two-player ranking-based Stackelberg game between a policy and a reward function.
This game encompasses a large subset of both inverse reinforcement learning (IRL) methods and methods which learn from offline preferences.
We theoretically analyze the requirements of the loss function used for ranking policy performances to facilitate near-optimal imitation learning at equilibrium.
arXiv Detail & Related papers (2022-02-07T19:38:22Z)
- The Jazz Transformer on the Front Line: Exploring the Shortcomings of AI-composed Music through Quantitative Measures [36.49582705724548]
This paper presents the Jazz Transformer, a generative model that utilizes a neural sequence model called the Transformer-XL for modeling lead sheets of Jazz music.
We then conduct a series of computational analyses of the generated compositions from different perspectives.
Our work presents, in an analytical manner, why machine-generated music to date still falls short of human artistry, and sets goals for future work on automatic composition.
arXiv Detail & Related papers (2020-08-04T03:32:59Z)
- Learning to Play Sequential Games versus Unknown Opponents [93.8672371143881]
We consider a repeated sequential game between a learner, who plays first, and an opponent who responds to the chosen action.
We propose a novel algorithm for the learner when playing against an adversarial sequence of opponents.
Our results include regret guarantees for the algorithm that depend on the regularity of the opponent's responses.
arXiv Detail & Related papers (2020-07-10T09:33:05Z)
- RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning [69.20460466735852]
This paper presents a deep reinforcement learning algorithm for online accompaniment generation.
The proposed algorithm is able to respond to the human part and generate a melodic, harmonic and diverse machine part.
arXiv Detail & Related papers (2020-02-08T03:53:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.