Fugu-MT Paper Translation (Abstract): Towards an AI Musician: Synthesizing Sheet Music Problems for Musical Reasoning

Paper overview: Towards an AI Musician: Synthesizing Sheet Music Problems for Musical Reasoning

arxiv url: http://arxiv.org/abs/2509.04059v2
Date: Fri, 26 Sep 2025 05:37:24 GMT
Status: Translation complete
Last updated in system: 2025-09-29 16:35:18.983292
Title: Towards an AI Musician: Synthesizing Sheet Music Problems for Musical Reasoning
Title (reference translation): Towards an AI Musician: Synthesizing Sheet Music Problems for Musical Reasoning
Authors: Zhilin Wang, Zhe Yang, Yun Luo, Yafu Li, Xiaoye Qu, Ziqian Qiao, Haoran Zhang, Runzhe Zhan, Derek F. Wong, Jizhe Zhou, Yu Cheng
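Abstract summary: This paper proposes a novel approach that treats core music theory rules, such as those governing beats and intervals, as programmatic functions. The approach generates verifiable sheet music questions in both textual and visual modalities. Evaluation results on SSMR-Bench highlight the key role that reasoning plays in interpreting sheet music.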
Reference score (independently computed attention score): 69.78158549955384
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Enhancing the ability of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) to interpret sheet music is a crucial step toward building AI musicians. However, current research lacks both evaluation benchmarks and training data for sheet music reasoning. Inspired by mathematics, where simple operations yield infinite verifiable problems, we introduce a novel approach that treats core music theory rules, such as those governing beats and intervals, as programmatic functions to systematically synthesize a vast and diverse corpus of sheet music reasoning problems. This approach allows us to introduce a data synthesis framework that generates verifiable sheet music questions in both textual and visual modalities, leading to the Synthetic Sheet Music Reasoning Benchmark (SSMR-Bench) and a complementary training set. Evaluation results on SSMR-Bench highlight the key role reasoning plays in interpreting sheet music, while also pointing out the ongoing challenges in understanding sheet music in a visual format. By leveraging synthetic data for RLVR, all models show significant improvements on the SSMR-Bench. Additionally, they also demonstrate considerable advancements on previously established human-crafted benchmarks, such as MusicTheoryBench and the music subset of MMMU. Finally, our results show that the enhanced reasoning ability can also facilitate music composition.
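Abstract (reference translation): Enhancing the ability of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) to interpret sheet music is an important step toward building AI musicians. However, existing research lacks both evaluation benchmarks and training data for sheet music reasoning. Inspired by mathematics, where simple operations yield an unlimited supply of verifiable problems, we introduce a new approach that treats core music theory rules, such as those governing beats and intervals, as programmatic functions for systematically synthesizing a large and diverse corpus of sheet music reasoning problems. Building on this approach, we introduce a data synthesis framework that generates verifiable sheet music questions in both textual and visual modalities, yielding the Synthetic Sheet Music Reasoning Benchmark (SSMR-Bench) and a complementary training set. Evaluation results on SSMR-Bench show that reasoning plays a key role in interpreting sheet music, and they also point to remaining challenges in understanding sheet music presented visually. When synthetic data is used for RLVR, all models improve significantly on SSMR-Bench. They also make considerable gains on previously established human-crafted benchmarks such as MusicTheoryBench and the music subset of MMMU. Finally, our results show that the improved reasoning ability can also facilitate music composition.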
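
To make the idea of "music theory rules as programmatic functions" concrete, here is a minimal Python sketch. It is an illustration only and not the authors' framework: the function names (`interval_name`, `total_beats`, `make_interval_question`, `verify`), the restriction to natural notes within one octave, and the multiple-choice format are all assumptions made for this example.

```python
import random
from fractions import Fraction

# Assumption: natural notes only, ascending intervals within one octave.
LETTERS = "CDEFGAB"
NOTE_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# (letter steps, semitones) -> interval name.
INTERVAL_NAMES = {
    (0, 0): "perfect unison",
    (1, 1): "minor second", (1, 2): "major second",
    (2, 3): "minor third", (2, 4): "major third",
    (3, 5): "perfect fourth", (3, 6): "augmented fourth",
    (4, 6): "diminished fifth", (4, 7): "perfect fifth",
    (5, 8): "minor sixth", (5, 9): "major sixth",
    (6, 10): "minor seventh", (6, 11): "major seventh",
}


def interval_name(lower, upper):
    """Interval rule as a function: name the ascending interval lower -> upper."""
    steps = (LETTERS.index(upper) - LETTERS.index(lower)) % 7
    semitones = (NOTE_TO_SEMITONE[upper] - NOTE_TO_SEMITONE[lower]) % 12
    return INTERVAL_NAMES[(steps, semitones)]


def total_beats(durations, beat_unit=Fraction(1, 4)):
    """Beat rule as a function: beats spanned by the durations (whole note = 1)."""
    return sum(durations, Fraction(0)) / beat_unit


def make_interval_question(rng):
    """Synthesize one multiple-choice question whose answer comes from the rule."""
    lower, upper = rng.choice(LETTERS), rng.choice(LETTERS)
    answer = interval_name(lower, upper)
    others = [n for n in sorted(set(INTERVAL_NAMES.values())) if n != answer]
    options = rng.sample(others, 3) + [answer]
    rng.shuffle(options)
    return {
        "question": f"What is the ascending interval from {lower} up to {upper}?",
        "options": options,
        "answer": answer,
    }


def verify(question, response):
    """The stored answer was computed by the rule, so checking is an exact match."""
    return response == question["answer"]


if __name__ == "__main__":
    q = make_interval_question(random.Random(0))
    print(q["question"], q["options"])
    print(verify(q, q["answer"]))
```

Because every answer is computed by the same function that defines the rule, each generated question can be checked automatically, which is the property a verifiable reward needs. Rendering the same question as a notated score image, as the visual modality in the abstract would require, is outside the scope of this sketch.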


Related papers