An Agent-Based Framework for Automated Higher-Voice Harmony Generation
- URL: http://arxiv.org/abs/2509.24463v2
- Date: Wed, 01 Oct 2025 07:00:41 GMT
- Title: An Agent-Based Framework for Automated Higher-Voice Harmony Generation
- Authors: Nia D'Souza Ganapathy, Arul Selvamani Shaja
- Abstract summary: Our framework comprises four specialized agents: a Music-Ingestion Agent for parsing and standardizing input musical scores; a Chord-Knowledge Agent, powered by a Chord-Former (Transformer model), to interpret and provide the constituent notes of complex chord symbols; a Harmony-Generation Agent, which composes a melodically and rhythmically complementary harmony line; and an Audio-Production Agent that renders the symbolic output into audio. By delegating specific tasks to specialized agents, our system effectively mimics the collaborative process of human musicians.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The generation of musically coherent and aesthetically pleasing harmony remains a significant challenge in the field of algorithmic composition. This paper introduces an innovative Agentic AI-enabled Higher Harmony Music Generator, a multi-agent system designed to create harmony in a collaborative and modular fashion. Our framework comprises four specialized agents: a Music-Ingestion Agent for parsing and standardizing input musical scores; a Chord-Knowledge Agent, powered by a Chord-Former (Transformer model), to interpret and provide the constituent notes of complex chord symbols; a Harmony-Generation Agent, which utilizes a Harmony-GPT and a Rhythm-Net (RNN) to compose a melodically and rhythmically complementary harmony line; and an Audio-Production Agent that employs a GAN-based Symbolic-to-Audio Synthesizer to render the final symbolic output into high-fidelity audio. By delegating specific tasks to specialized agents, our system effectively mimics the collaborative process of human musicians. This modular, agent-based approach allows for robust data processing, deep theoretical understanding, creative composition, and realistic audio synthesis, culminating in a system capable of generating sophisticated and contextually appropriate higher-voice harmonies for given melodies.
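The four-agent pipeline described in the abstract can be sketched in code. This is a minimal toy illustration, not the authors' implementation: all class names mirror the agent roles from the abstract, but every method, data structure, and the naive "pick a chord tone above the melody" rule are illustrative assumptions standing in for the paper's Chord-Former, Harmony-GPT/Rhythm-Net, and GAN-based synthesizer components.

```python
# Hypothetical sketch of the four-agent harmony pipeline from the abstract.
# All APIs here are illustrative assumptions, not the authors' actual code.
from dataclasses import dataclass


@dataclass
class Score:
    melody: list  # (pitch, duration) pairs
    chords: list  # chord symbols, one per melody note


class MusicIngestionAgent:
    def parse(self, raw):
        # Standardize raw (pitch, duration, chord) triples into a Score.
        melody = [(p, d) for p, d, _ in raw]
        chords = [c for _, _, c in raw]
        return Score(melody, chords)


class ChordKnowledgeAgent:
    # Stand-in for the Chord-Former: map chord symbols to constituent notes.
    CHORD_TONES = {"C": ["C", "E", "G"], "G7": ["G", "B", "D", "F"]}

    def tones(self, symbol):
        return self.CHORD_TONES.get(symbol, [])


class HarmonyGenerationAgent:
    # Stand-in for Harmony-GPT + Rhythm-Net: a trivial rule-based harmonizer.
    def harmonize(self, score, chord_agent):
        line = []
        for (pitch, dur), sym in zip(score.melody, score.chords):
            tones = chord_agent.tones(sym)
            # Naive rule: take the first chord tone that differs from the melody.
            note = next((t for t in tones if t != pitch), pitch)
            line.append((note, dur))
        return line


class AudioProductionAgent:
    # Stand-in for the GAN-based symbolic-to-audio synthesizer.
    def render(self, melody, harmony):
        return {"melody": melody, "harmony": harmony, "format": "wav"}


def run_pipeline(raw):
    score = MusicIngestionAgent().parse(raw)
    harmony = HarmonyGenerationAgent().harmonize(score, ChordKnowledgeAgent())
    return AudioProductionAgent().render(score.melody, harmony)


raw = [("C", 1.0, "C"), ("E", 1.0, "C"), ("G", 0.5, "G7")]
out = run_pipeline(raw)
print(out["harmony"])  # -> [('E', 1.0), ('C', 1.0), ('B', 0.5)]
```

The point of the sketch is the division of labor: each agent owns one stage (parsing, chord theory, harmony composition, rendering) and communicates through plain data, which is the modularity the paper attributes to its multi-agent design.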
Related papers
- The Ghost in the Keys: A Disklavier Demo for Human-AI Musical Co-Creativity [59.78509280246215]
Aria-Duet is an interactive system facilitating a real-time musical duet between a human pianist and Aria, a state-of-the-art generative model. We analyze the system's output from a musicological perspective, finding the model can maintain stylistic semantics and develop coherent phrasal ideas.
arXiv Detail & Related papers (2025-11-03T15:26:01Z)
- MusicSwarm: Biologically Inspired Intelligence for Music Composition [1.3537117504260623]
We show that coherent, long-form musical composition can emerge from a decentralized swarm of identical, frozen foundation models. We compare a centralized multi-agent system with a global critic to a fully decentralized swarm in which bar-wise agents sense and deposit harmonic, rhythmic, and structural cues, adapt short-term memory, and reach consensus.
arXiv Detail & Related papers (2025-09-15T14:23:09Z)
- AI Harmonizer: Expanding Vocal Expression with a Generative Neurosymbolic Music AI System [3.356609500886644]
The AI Harmonizer autonomously generates musically coherent four-part harmonies without requiring prior harmonic input from the user. We present our methods, explore potential applications in performance and composition, and discuss future directions for real-time implementations.
arXiv Detail & Related papers (2025-06-22T19:13:31Z)
- An End-to-End Approach for Chord-Conditioned Song Generation [14.951089833579063]
The song generation task aims to synthesize music composed of vocals and accompaniment from given lyrics.
To mitigate this issue, we introduce chords, an important concept from music composition, into song generation networks.
Based on this, we propose a novel model termed the Chord-Conditioned Song Generator (CSG).
arXiv Detail & Related papers (2024-09-10T08:07:43Z)
- ComposerX: Multi-Agent Symbolic Music Composition with LLMs [51.68908082829048]
Music composition is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints.
Current LLMs easily fail in this task, generating ill-written music even when equipped with modern techniques like In-Context-Learning and Chain-of-Thoughts.
We propose ComposerX, an agent-based symbolic music generation framework.
arXiv Detail & Related papers (2024-04-28T06:17:42Z)
- ByteComposer: a Human-like Melody Composition Method based on Language Model Agent [11.792129708566598]
Large Language Models (LLM) have shown encouraging progress in multimodal understanding and generation tasks.
We propose ByteComposer, an agent framework emulating a human's creative pipeline in four separate steps.
We conduct extensive experiments on GPT-4 and several open-source large language models, which substantiate our framework's effectiveness.
arXiv Detail & Related papers (2024-02-24T04:35:07Z)
- DiffMoog: a Differentiable Modular Synthesizer for Sound Matching [48.33168531500444]
DiffMoog is a differentiable modular synthesizer with a comprehensive set of modules typically found in commercial instruments.
Being differentiable, it allows integration into neural networks, enabling automated sound matching.
We introduce an open-source platform that comprises DiffMoog and an end-to-end sound matching framework.
arXiv Detail & Related papers (2024-01-23T08:59:21Z)
- MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models [54.55063772090821]
MusicAgent integrates numerous music-related tools and an autonomous workflow to address user requirements.
The primary goal of this system is to free users from the intricacies of AI-music tools, enabling them to concentrate on the creative aspect.
arXiv Detail & Related papers (2023-10-18T13:31:10Z)
- Symphony Generation with Permutation Invariant Language Model [57.75739773758614]
We present a symbolic symphony music generation solution, SymphonyNet, based on a permutation invariant language model.
A novel transformer decoder architecture is introduced as backbone for modeling extra-long sequences of symphony tokens.
Our empirical results show that our proposed approach can generate coherent, novel, complex, and harmonious symphonies, compared with human compositions.
arXiv Detail & Related papers (2022-05-10T13:08:49Z)
- RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning [69.20460466735852]
This paper presents a deep reinforcement learning algorithm for online accompaniment generation.
The proposed algorithm is able to respond to the human part and generate a melodic, harmonic and diverse machine part.
arXiv Detail & Related papers (2020-02-08T03:53:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.