Fugu-MT Paper Translation (Abstract): Leveraging Morphology for Historical Script Metrological Analysis

Paper overview: Leveraging Morphology for Historical Script Metrological Analysis

arxiv url: http://arxiv.org/abs/2606.09446v1
Date: Mon, 08 Jun 2026 12:55:02 GMT
Status: Translation complete
Last updated in system: 2026-06-09 14:42:07.072407
Title: Leveraging Morphology for Historical Script Metrological Analysis
Title (reference translation): Leveraging Morphology for Historical Script Metrological Analysis
Authors: Malamatenia Vlachou Efstathiou, Raphaël Baena, Dominique Stutzmann, Mathieu Aubry
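Abstract summary: This paper proposes a learning methodology that enables efficient character modeling with only line-level transcription supervision. For the demonstration, we extend the annotations of the codex Paris, BnF, fr. 2813, which was commissioned in the late fourteenth century by Charles V and copied by four hands.
Reference score (independently computed attention measure): 12.924056486436415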
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Advances in handwritten text recognition have enabled large-scale transcription of historical documents, but still provide limited access to interpretable visual measurements for paleography, the study of historical scripts. In this paper, our main insight is that morphological script analysis, in particular the capacity to learn character prototypes from line-level transcriptions, enables the definition of scalable, meaningful, and stable paleographic measurements. More precisely, we leverage a transformer-based detection architecture together with a prototype-based line reconstruction module to learn prototypical characters and their occurrence, deformation, and positioning. Our contributions are twofold. First, we introduce a deep architecture and learning methodology that enables efficient character modeling with only line-level transcription supervision, significantly improving over the Learnable Typewriter baseline and enabling accurate character bounding box prediction, unlocking its potential for paleographic measurements. Second, we introduce and demonstrate the paleographical relevance of automatic measurements enabled by our architecture for characters, bi-grams, and spaces between graphical units. For this demonstration, we extend the annotations of the codex Paris, BnF, fr. 2813, commissioned in the late fourteenth century by Charles V and copied by four hands, to 160 pages. We visualize our measurements over these pages, showing how they enable us not only to differentiate graphical profiles, but also to discover and analyze subtle variations. This case study outlines the scalability of our approach and its frugality in terms of required training data, since a single column of text is sufficient to compute our measurements on each of the 160 pages. Data and code are publicly available at: https://malamatenia.github.io/morphology4metrology-analysis.
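Abstract (reference translation): Advances in handwritten text recognition have made large-scale transcription of historical documents possible, but access to interpretable visual measurements for paleography, the study of historical scripts, remains limited. In this paper, we argue that morphological script analysis, in particular the ability to learn character prototypes from line-level transcriptions, makes it possible to define scalable, meaningful, and stable paleographic measurements. More precisely, we combine a transformer-based detection architecture with a prototype-based line reconstruction module to learn prototypical characters together with their occurrence, deformation, and positioning. Our contributions are twofold. First, we introduce a deep architecture and learning methodology that achieves efficient character modeling with only line-level transcription supervision, significantly improves over the Learnable Typewriter baseline, and enables accurate character bounding box prediction. Second, we introduce and demonstrate the paleographical relevance of the automatic measurements that our architecture enables for characters, bi-grams, and spaces between graphical units. For this demonstration, we extend the annotations of the codex Paris, BnF, fr. 2813, commissioned in the late fourteenth century by Charles V and copied by four hands, to 160 pages. We visualize our measurements over these pages and show how they make it possible not only to differentiate graphical profiles but also to discover and analyze subtle variations. This case study outlines the scalability of our approach and its frugality in terms of required training data, since a single column of text is sufficient to compute our measurements on each of the 160 pages. Data and code are available at https://malamatenia.github.io/morphology4metrology-analysis.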
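
The abstract mentions measurements on characters and on the spaces between graphical units, derived from predicted character bounding boxes. As a rough illustration only, and not the authors' code, the sketch below assumes each predicted box on a line arrives as a hypothetical `(label, x0, y0, x1, y1)` tuple in pixel coordinates, and computes the mean width per character label and the horizontal gaps between neighbouring boxes. The function name `line_measurements` and the input layout are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean


def line_measurements(boxes):
    """Return (mean width per label, gaps between consecutive boxes).

    `boxes` is a list of hypothetical (label, x0, y0, x1, y1) tuples
    for a single text line; this layout is an assumption, not the
    format used by the paper's released code.
    """
    # Sort by left edge to approximate left-to-right reading order.
    ordered = sorted(boxes, key=lambda b: b[1])

    widths = defaultdict(list)
    for label, x0, _y0, x1, _y1 in ordered:
        widths[label].append(x1 - x0)

    # Gap = left edge of the next box minus right edge of the current one.
    gaps = [nxt[1] - cur[3] for cur, nxt in zip(ordered, ordered[1:])]

    mean_widths = {label: mean(ws) for label, ws in widths.items()}
    return mean_widths, gaps


# Toy data, invented for illustration.
example = [("a", 0, 0, 10, 20), ("b", 12, 0, 20, 30), ("a", 25, 0, 37, 20)]
mean_widths, gaps = line_measurements(example)
print(mean_widths, gaps)
```

Aggregating such per-line values over a page would give one simple per-page profile; how the paper actually defines and normalizes its measurements is not stated in the abstract.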

Related papers