MusicAIR: A Multimodal AI Music Generation Framework Powered by an Algorithm-Driven Core
- URL: http://arxiv.org/abs/2511.17323v1
- Date: Fri, 21 Nov 2025 15:43:27 GMT
- Title: MusicAIR: A Multimodal AI Music Generation Framework Powered by an Algorithm-Driven Core
- Authors: Callie C. Liao, Duoduo Liao, Ellie L. Zhang
- Abstract summary: MusicAIR is an innovative AI music generation framework powered by a novel algorithm-driven symbolic music core. The framework generates a complete melodic score solely from the lyrics. GenAIM is a web tool using MusicAIR for lyric-to-song, text-to-music, and image-to-music generation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in generative AI have made music generation a prominent research focus. However, many neural-based models rely on large datasets, raising concerns about copyright infringement and high computational cost. In contrast, we propose MusicAIR, an innovative multimodal AI music generation framework powered by a novel algorithm-driven symbolic music core, effectively mitigating copyright infringement risks. The music core algorithms connect critical lyrical and rhythmic information to automatically derive musical features, creating a complete, coherent melodic score solely from the lyrics. The MusicAIR framework facilitates music generation from lyrics, text, and images. The generated score adheres to established principles of music theory, lyrical structure, and rhythmic conventions. We developed Generate AI Music (GenAIM), a web tool using MusicAIR for lyric-to-song, text-to-music, and image-to-music generation. In our experiments, we evaluated AI-generated music scores produced by the system using both standard music metrics and an innovative analysis that compares these compositions with original works. The system achieves an average key confidence of 85%, outperforming human composers at 79%, and aligns closely with established music theory standards, demonstrating its ability to generate diverse, human-like compositions. As a co-pilot tool, GenAIM can serve as a reliable music composition assistant and a possible educational composition tutor while lowering the entry barrier for all aspiring musicians, making a significant contribution to AI for music generation.
Related papers
- The Ghost in the Keys: A Disklavier Demo for Human-AI Musical Co-Creativity [59.78509280246215]
Aria-Duet is an interactive system facilitating a real-time musical duet between a human pianist and Aria, a state-of-the-art generative model. We analyze the system's output from a musicological perspective, finding the model can maintain stylistic semantics and develop coherent phrasal ideas.
arXiv Detail & Related papers (2025-11-03T15:26:01Z) - Detecting Musical Deepfakes [0.0]
This study investigates the detection of AI-generated songs using the FakeMusicCaps dataset. To simulate real-world adversarial conditions, tempo stretching and pitch shifting were applied to the dataset. Mel spectrograms were generated from the modified audio, then used to train and evaluate a convolutional neural network.
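The mel-spectrogram front end described in that abstract is a standard transformation; the study itself presumably uses a library such as librosa, but a self-contained numpy sketch of the idea looks like this (all parameter values are assumptions, not the study's settings).

```python
# Minimal mel-spectrogram from scratch: frame the waveform, window and FFT
# each frame, then project the power spectrum onto triangular mel filters.
import numpy as np

def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr=22050, n_fft=1024, n_mels=64):
    # Triangular filters with centers evenly spaced on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        lo, c, hi = bins[i - 1], bins[i], bins[i + 1]
        for k in range(lo, c):
            fb[i - 1, k] = (k - lo) / max(c - lo, 1)
        for k in range(c, hi):
            fb[i - 1, k] = (hi - k) / max(hi - c, 1)
    return fb

def mel_spectrogram(y, sr=22050, n_fft=1024, hop=256, n_mels=64):
    n_frames = 1 + (len(y) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([y[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return mel_filterbank(sr, n_fft, n_mels) @ power.T  # (n_mels, n_frames)

# One second of a 440 Hz sine as stand-in audio
y = np.sin(2 * np.pi * 440 * np.arange(22050) / 22050)
S = mel_spectrogram(y)
```

The resulting (n_mels, n_frames) matrix, usually log-compressed, is what a CNN classifier would consume.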
arXiv Detail & Related papers (2025-05-03T21:45:13Z) - MuDiT & MuSiT: Alignment with Colloquial Expression in Description-to-Song Generation [18.181382408551574]
We propose a novel task of Colloquial Description-to-Song Generation.
It focuses on aligning the generated content with colloquial human expressions.
This task is aimed at bridging the gap between colloquial language understanding and auditory expression within an AI model.
arXiv Detail & Related papers (2024-07-03T15:12:36Z) - ComposerX: Multi-Agent Symbolic Music Composition with LLMs [51.68908082829048]
Music composition is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints.
Current LLMs easily fail in this task, generating ill-written music even when equipped with modern techniques like In-Context Learning and Chain-of-Thought.
We propose ComposerX, an agent-based symbolic music generation framework.
arXiv Detail & Related papers (2024-04-28T06:17:42Z) - Arrange, Inpaint, and Refine: Steerable Long-term Music Audio Generation and Editing via Content-based Controls [6.176747724853209]
Large Language Models (LLMs) have shown promise in generating high-quality music, but their focus on autoregressive generation limits their utility in music editing tasks.
We propose a novel approach leveraging a parameter-efficient heterogeneous adapter combined with a masking training scheme.
Our method integrates frame-level content-based controls, facilitating track-conditioned music refinement and score-conditioned music arrangement.
arXiv Detail & Related papers (2024-02-14T19:00:01Z) - MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks on 8 publicly available datasets, providing a fair and standard assessment of representations of all open-sourced pre-trained models developed on music recordings as baselines.
arXiv Detail & Related papers (2023-06-18T12:56:46Z) - Simple and Controllable Music Generation [94.61958781346176]
MusicGen is a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens.
Unlike prior work, MusicGen comprises a single-stage transformer LM together with efficient token interleaving patterns.
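The "token interleaving patterns" mentioned above refer to arranging the parallel codebook streams so one autoregressive model can predict all of them; a toy sketch of the delay-style pattern follows (the PAD value, shapes, and plain-list representation are illustrative assumptions, not MusicGen's actual implementation).

```python
# Delay-pattern interleaving: codebook k is shifted right by k steps, so at
# each generation step the model emits one token per codebook, each
# referring to a slightly earlier time position in that stream.
PAD = -1  # placeholder token (illustrative)

def delay_interleave(codes):
    """codes: list of K token lists, each of length T -> (K, T+K-1) grid."""
    K = len(codes)
    return [[PAD] * k + codes[k] + [PAD] * (K - 1 - k) for k in range(K)]

def delay_deinterleave(grid):
    """Invert the delay pattern, recovering the original K x T streams."""
    K = len(grid)
    T = len(grid[0]) - (K - 1)
    return [grid[k][k:k + T] for k in range(K)]

codes = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
grid = delay_interleave(codes)
# grid[1] == [PAD, 5, 6, 7, 8, PAD]
assert delay_deinterleave(grid) == codes
```

The round-trip assertion shows the pattern is lossless; only the generation order changes.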
arXiv Detail & Related papers (2023-06-08T15:31:05Z) - A Review of Intelligent Music Generation Systems [4.287960539882345]
ChatGPT has significantly reduced the barrier to entry for non-professionals in creative endeavors.
Modern generative algorithms can extract patterns implicit in a piece of music based on rule constraints or a musical corpus.
arXiv Detail & Related papers (2022-11-16T13:43:16Z) - Quantized GAN for Complex Music Generation from Dance Videos [48.196705493763986]
We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates musical samples conditioned on dance videos.
Our proposed framework takes dance video frames and human body motion as input, and learns to generate music samples that plausibly accompany the corresponding input.
arXiv Detail & Related papers (2022-04-01T17:53:39Z) - Music Harmony Generation, through Deep Learning and Using a Multi-Objective Evolutionary Algorithm [0.0]
This paper introduces a genetic multi-objective evolutionary optimization algorithm for the generation of polyphonic music.
One objective captures the rules and conventions of music theory; together with the other two objectives, the ratings of music experts and of ordinary listeners, it guides the evolutionary cycle toward an optimal solution.
The results show that the proposed method can generate pleasant pieces of the desired style and length, with harmonic parts that follow the grammar of music while remaining engaging to the listener.
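The multi-objective evolutionary loop sketched in that summary can be illustrated with a toy weighted-sum genetic algorithm; all three scoring functions below are stand-ins (the paper's actual objectives involve real expert and listener ratings), and every parameter is an assumption for illustration.

```python
# Toy multi-objective GA over melodies: score candidates on rule compliance
# plus two simulated "human" ratings, then evolve them with tournament
# selection, one-point crossover, and point mutation.
import random

SCALE = {0, 2, 4, 5, 7, 9, 11}  # C-major pitch classes (the "rules" objective)

def rule_score(mel):      # fraction of notes inside the scale
    return sum(p % 12 in SCALE for p in mel) / len(mel)

def expert_score(mel):    # stand-in rating: prefer stepwise motion
    steps = [abs(a - b) for a, b in zip(mel, mel[1:])]
    return sum(s <= 2 for s in steps) / len(steps)

def listener_score(mel):  # stand-in rating: prefer ending on the tonic
    return 1.0 if mel[-1] % 12 == 0 else 0.0

def fitness(mel, w=(0.4, 0.4, 0.2)):
    return w[0]*rule_score(mel) + w[1]*expert_score(mel) + w[2]*listener_score(mel)

def evolve(pop_size=40, length=16, gens=60, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(48, 72) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(gens):
        nxt = []
        for _ in range(pop_size):
            a = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, length)            # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.3:                    # point mutation
                child[rng.randrange(length)] = rng.randrange(48, 72)
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = evolve()
```

A true multi-objective method would keep a Pareto front rather than collapsing the objectives into one weighted sum; the weighted sum is used here only to keep the sketch short.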
arXiv Detail & Related papers (2021-02-16T05:05:54Z) - Research on AI Composition Recognition Based on Music Rules [7.699648754969773]
The article constructs a music-rule-identifying algorithm by extracting modes. It measures the stability of the mode of machine-generated music to judge whether a piece was composed by an artificial intelligence.
arXiv Detail & Related papers (2020-10-15T14:51:24Z) - RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning [69.20460466735852]
This paper presents a deep reinforcement learning algorithm for online accompaniment generation.
The proposed algorithm is able to respond to the human part and generate a melodic, harmonic and diverse machine part.
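The online setting that summary describes, a machine part reacting note-by-note to a human part, can be illustrated with a greedy toy (this is not the paper's RL algorithm; the reward terms, pitch range, and consonance set are all assumptions for illustration).

```python
# Toy online accompaniment: for each incoming human note, pick the machine
# pitch that maximizes a simple harmonic-consonance plus voice-leading
# reward. An RL agent would instead learn to maximize such rewards over time.
CONSONANT = (0, 3, 4, 7, 8, 9)  # consonant interval classes, in semitones

def reward(machine, human, prev):
    harmonic = 1.0 if abs(machine - human) % 12 in CONSONANT else 0.0
    melodic = 1.0 / (1 + abs(machine - prev))  # favor smooth motion
    return harmonic + melodic

def accompany(human_part, start=60):
    out, prev = [], start
    for h in human_part:                       # respond one note at a time
        best = max(range(48, 72), key=lambda m: reward(m, h, prev))
        out.append(best)
        prev = best
    return out

duet = accompany([60, 62, 64, 65, 67])  # MIDI pitches for C D E F G
```

Greedy choice per note is myopic; the point of the RL formulation is to optimize the whole duet rather than each step in isolation.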
arXiv Detail & Related papers (2020-02-08T03:53:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.