Fugu-MT Paper Translation (Abstract): Optical Music Recognition for Real-World Manuscripts with Synthetic Data

Paper summary: Optical Music Recognition for Real-World Manuscripts with Synthetic Data

arxiv url: http://arxiv.org/abs/2606.09479v1
Date: Mon, 08 Jun 2026 13:38:48 GMT
Status: translation complete
Last updated in system: 2026-06-09 14:42:07.095636
Title: Optical Music Recognition for Real-World Manuscripts with Synthetic Data
Title (reference translation): Optical music recognition of real-world manuscripts using synthetic data
Authors: Jiří Mayer, Martina Dvořáková, Vojtěch Dvořák, Markéta Herzánová Vlková, Filip Bím, Pavel Pecina, Samuel Šomorjai, Petr Žabička, Jan Hajič
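Abstract summary: The paper provides a baseline on real-world manuscripts with complex piano notation. It shows that while direct transcriptions of in-domain data remain essential, domain adaptation using synthetic sheet music images brings significant improvement. The work thus brings optical music recognition closer to one of its goals: preserving and promoting musical cultural heritage.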
Reference score (independently computed attention measure): 1.3125176461810544
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Optical Music Recognition (OMR) has seen major progress in model design, with end-to-end methods now capable of recognising notation at all levels of complexity. However, the impact of this progress has been limited by the visual domains of available training datasets, which are largely born-digital. Existing large collections of sheet music in libraries and other heritage institutions contain predominantly manuscripts, whose visual domains are highly diverse and different, so existing OMR systems fail when applied in the real world. These institutions are often resource-constrained, so large in-domain datasets cannot be expected. We provide a first baseline on real-world manuscripts with complex piano notation in the resource-constrained scenario. Using fine-grained music notation graph (MuNG) annotations and the Smashcima synthesis tool, we then show that while some direct transcriptions of in-domain data remain essential, domain adaptation using synthetic musical manuscript images brings significant improvement. Furthermore, the symbols used do not need to be in-domain, so the expensive fine-grained annotation can be avoided. We thus bring OMR closer to one of its stated goals: preserving and promoting musical cultural heritage.
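Abstract (reference translation): Optical music recognition (OMR) has made major progress in model design, with end-to-end methods that can recognise notation at every level of complexity. The impact of this progress, however, is limited by the visual domains of the available training datasets. The sheet music collections held by libraries and other heritage institutions consist predominantly of manuscripts, whose visual domains are highly diverse and different, so existing OMR systems fail when applied in the real world. These institutions are often resource-constrained, so large in-domain datasets cannot be expected. We provide a baseline on real-world manuscripts with complex piano notation in the resource-constrained scenario. Using fine-grained music notation graph (MuNG) annotations and the Smashcima synthesis tool, we show that while direct transcriptions of in-domain data remain essential, domain adaptation using synthetic sheet music images brings significant improvement. Furthermore, the symbols used do not need to be in-domain, so the expensive fine-grained annotation can be avoided. We thus bring OMR closer to one of its goals: preserving and promoting musical cultural heritage.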

Related papers