MusicSwarm: Biologically Inspired Intelligence for Music Composition
- URL: http://arxiv.org/abs/2509.11973v1
- Date: Mon, 15 Sep 2025 14:23:09 GMT
- Title: MusicSwarm: Biologically Inspired Intelligence for Music Composition
- Authors: Markus J. Buehler
- Abstract summary: We show that coherent, long-form musical composition can emerge from a decentralized swarm of identical, frozen foundation models. We compare a centralized multi-agent system with a global critic to a fully decentralized swarm in which bar-wise agents sense and deposit harmonic, rhythmic, and structural cues, adapt short-term memory, and reach consensus.
- Score: 1.3537117504260623
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We show that coherent, long-form musical composition can emerge from a decentralized swarm of identical, frozen foundation models that coordinate via stigmergic, peer-to-peer signals, without any weight updates. We compare a centralized multi-agent system with a global critic to a fully decentralized swarm in which bar-wise agents sense and deposit harmonic, rhythmic, and structural cues, adapt short-term memory, and reach consensus. Across symbolic, audio, and graph-theoretic analyses, the swarm yields superior quality while delivering greater diversity and structural variety and leads across creativity metrics. The dynamics contract toward a stable configuration of complementary roles, and self-similarity networks reveal a small-world architecture with efficient long-range connectivity and specialized bridging motifs, clarifying how local novelties consolidate into global musical form. By shifting specialization from parameter updates to interaction rules, shared memory, and dynamic consensus, MusicSwarm provides a compute- and data-efficient route to long-horizon creative structure that is immediately transferable beyond music to collaborative writing, design, and scientific discovery.
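The abstract's core mechanism can be illustrated with a toy sketch: bar-wise agents deposit cues onto a shared board (stigmergy) and nudge their local state toward neighboring bars' cues until the swarm settles into a consensus. This is a minimal, hypothetical illustration only; the class names, the vector "musical state", and the averaging update rule are assumptions for demonstration, not the paper's actual implementation.

```python
# Hypothetical sketch of stigmergic coordination among bar-wise agents.
# All names and update rules are illustrative assumptions, not MusicSwarm's code.
import random


class BarAgent:
    def __init__(self, bar_index, dim=4, seed=None):
        rng = random.Random(seed)
        self.bar = bar_index
        # Toy "musical state" vector standing in for harmonic/rhythmic features.
        self.state = [rng.random() for _ in range(dim)]

    def deposit(self, board):
        # Deposit this bar's cue onto the shared board (the stigmergic signal).
        board[self.bar] = list(self.state)

    def sense_and_adapt(self, board, alpha=0.3):
        # Sense cues deposited by neighboring bars and move toward their mean.
        neighbors = [board[b] for b in (self.bar - 1, self.bar + 1) if b in board]
        if not neighbors:
            return
        mean = [sum(v[i] for v in neighbors) / len(neighbors)
                for i in range(len(self.state))]
        self.state = [(1 - alpha) * s + alpha * m
                      for s, m in zip(self.state, mean)]


def spread(agents):
    # Maximum pairwise distance between agent states: a crude consensus measure.
    def dist(x, y):
        return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5
    return max(dist(a.state, b.state) for a in agents for b in agents)


def run_swarm(n_bars=8, rounds=50):
    # No weight updates anywhere: agents only deposit and sense peer signals.
    agents = [BarAgent(i, seed=i) for i in range(n_bars)]
    board = {}
    for _ in range(rounds):
        for a in agents:
            a.deposit(board)
        for a in agents:
            a.sense_and_adapt(board)
    return agents
```

Under this local-averaging assumption, repeated deposit/sense rounds shrink the spread between agent states, which is the toy analogue of "local novelties consolidating into global form" without any parameter updates.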
Related papers
- NarraScore: Bridging Visual Narrative and Musical Dynamics via Hierarchical Affective Control [59.6128550986024]
NarraScore is a hierarchical framework predicated on the core insight that emotion serves as a high-density compression of narrative logic. NarraScore employs a Dual-Branch Injection strategy to reconcile global structure with local dynamism, achieving state-of-the-art consistency and narrative alignment with negligible computational overhead.
arXiv Detail & Related papers (2026-02-09T09:39:42Z) - Selective Imperfection as a Generative Framework for Analysis, Creativity and Discovery [1.3537117504260623]
We show how sound functions as a scientific probe, an inversion where listening becomes a mode of seeing and musical composition becomes a blueprint for matter. We show that science and art are generative acts of world-building under constraint, with vibration as a shared grammar organizing structure across scales.
arXiv Detail & Related papers (2025-12-30T11:14:51Z) - Multi-label Classification with Panoptic Context Aggregation Networks [61.82285737410154]
This paper introduces the Deep Panoptic Context Aggregation Network (PanCAN), a novel approach that hierarchically integrates multi-order geometric contexts. PanCAN learns multi-order neighborhood relationships at each scale by combining random walks with an attention mechanism. Experiments on the NUS-WIDE, PASCAL VOC 2007, and MS-COCO benchmarks demonstrate that PanCAN consistently achieves competitive results.
arXiv Detail & Related papers (2025-12-29T14:16:21Z) - The Ghost in the Keys: A Disklavier Demo for Human-AI Musical Co-Creativity [59.78509280246215]
Aria-Duet is an interactive system facilitating a real-time musical duet between a human pianist and Aria, a state-of-the-art generative model. We analyze the system's output from a musicological perspective, finding the model can maintain stylistic semantics and develop coherent phrasal ideas.
arXiv Detail & Related papers (2025-11-03T15:26:01Z) - An Agent-Based Framework for Automated Higher-Voice Harmony Generation [0.0]
Our framework comprises four specialized agents, including a Music-Ingestion Agent for parsing and standardizing input musical scores; a Chord-Knowledge Agent, powered by a Chord-Former (Transformer model), to interpret and provide the constituent notes of complex chord symbols; and a Harmony-Generation Agent, which composes a melodically and rhythmically complementary harmony line. By delegating specific tasks to specialized agents, our system effectively mimics the collaborative process of human musicians.
arXiv Detail & Related papers (2025-09-29T08:42:42Z) - TOMI: Transforming and Organizing Music Ideas for Multi-Track Compositions with Full-Song Structure [8.721294663967305]
We introduce TOMI (Transforming and Organizing Music Ideas) as a novel approach in deep music generation. We represent a multi-track composition process via a sparse, four-dimensional space characterized by clips (short audio or MIDI segments), sections (temporal positions), tracks (instrument layers), and transformations. Our model is capable of generating multi-track electronic music with full-song structure, and we further integrate the TOMI-based model with the REAPER digital audio workstation.
arXiv Detail & Related papers (2025-06-29T05:15:41Z) - MotionRAG-Diff: A Retrieval-Augmented Diffusion Framework for Long-Term Music-to-Dance Generation [10.203209816178552]
MotionRAG-Diff is a hybrid framework that integrates Retrieval-Augmented Generation and diffusion-based refinement. Our method introduces three core innovations and achieves state-of-the-art performance in motion quality, diversity, and music-motion synchronization accuracy.
arXiv Detail & Related papers (2025-06-03T09:12:48Z) - Synthesizing Composite Hierarchical Structure from Symbolic Music Corpora [32.18458296933001]
We propose a unified, hierarchical meta-representation of musical structure called the structural temporal graph (STG). For a single piece, the STG is a data structure that defines a hierarchy of progressively finer structural musical features and the temporal relationships between them.
arXiv Detail & Related papers (2025-02-21T02:32:29Z) - Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization [19.27890803128116]
We present a unified framework for automatic multitrack music arrangement. At its core is a segment-level reconstruction objective operating on token-level disentangled content and style. To support track-wise modeling, we introduce REMI-z, a structured tokenization scheme for multitrack symbolic music.
arXiv Detail & Related papers (2024-08-27T16:18:51Z) - Robust Collaborative Perception without External Localization and Clock Devices [52.32342059286222]
A consistent spatial-temporal coordination across multiple agents is fundamental for collaborative perception.
Traditional methods depend on external devices to provide localization and clock signals.
We propose a novel approach: aligning by recognizing the inherent geometric patterns within the perceptual data of various agents.
arXiv Detail & Related papers (2024-05-05T15:20:36Z) - ComposerX: Multi-Agent Symbolic Music Composition with LLMs [51.68908082829048]
Music composition is a complex task that requires abilities to understand and generate information with long dependency and harmony constraints.
Current LLMs easily fail in this task, generating ill-written music even when equipped with modern techniques like In-Context-Learning and Chain-of-Thoughts.
We propose ComposerX, an agent-based symbolic music generation framework.
arXiv Detail & Related papers (2024-04-28T06:17:42Z) - Graph-based Polyphonic Multitrack Music Generation [9.701208207491879]
This paper introduces a novel graph representation for music and a deep Variational Autoencoder that generates the structure and the content of musical graphs separately.
By separating the structure and content of musical graphs, it is possible to condition generation by specifying which instruments are played at certain times.
arXiv Detail & Related papers (2023-07-27T15:18:50Z) - Structure-Enhanced Pop Music Generation via Harmony-Aware Learning [20.06867705303102]
We propose to leverage harmony-aware learning for structure-enhanced pop music generation.
Results of subjective and objective evaluations demonstrate that Harmony-Aware Hierarchical Music Transformer (HAT) significantly improves the quality of generated music.
arXiv Detail & Related papers (2021-09-14T05:04:13Z) - Coordination Among Neural Modules Through a Shared Global Workspace [78.08062292790109]
In cognitive science, a global workspace architecture has been proposed in which functionally specialized components share information.
We show that capacity limitations have a rational basis in that they encourage specialization and compositionality.
arXiv Detail & Related papers (2021-03-01T18:43:48Z) - Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music [69.2737664640826]
This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition.
A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically.
arXiv Detail & Related papers (2020-12-02T14:19:19Z) - Modeling Musical Structure with Artificial Neural Networks [0.0]
I explore the application of artificial neural networks to different aspects of musical structure modeling.
I show how a connectionist model, the Gated Autoencoder (GAE), can be employed to learn transformations between musical fragments.
I propose a special predictive training of the GAE, which yields a representation of polyphonic music as a sequence of intervals.
arXiv Detail & Related papers (2020-01-06T18:35:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.