Fugu-MT paper translation (abstract): Contrastive-to-Self-Supervised: A Two-Stage Framework for Script Similarity Learning

Paper summary: Contrastive-to-Self-Supervised: A Two-Stage Framework for Script Similarity Learning

arxiv url: http://arxiv.org/abs/2603.06180v1
Date: Fri, 06 Mar 2026 11:39:20 GMT
Status: translation complete
Last updated in system: 2026-03-09 13:17:45.57976
Title: Contrastive-to-Self-Supervised: A Two-Stage Framework for Script Similarity Learning
Title (reference translation): Contrastive-to-Self-Supervised: A Two-Stage Framework for Script Similarity Learning
Authors: Claire Roman, Philippe Meyer
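Abstract summary: We train an encoder with contrastive loss on labeled invented alphabets, establishing a teacher model with robust discriminative features. The student learns unsupervised representations guided by the teacher's knowledge while remaining free to discover latent cross-script similarities. Our approach bridges supervised contrastive learning and unsupervised discovery, enabling both hard boundaries between distinct systems and soft similarities reflecting potential historical influences.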
Reference score (independently computed attention level): 0.45835414225547183
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning similarity metrics for glyphs and writing systems faces a fundamental challenge: while individual graphemes within invented alphabets can be reliably labeled, the historical relationships between different scripts remain uncertain and contested. We propose a two-stage framework that addresses this epistemological constraint. First, we train an encoder with contrastive loss on labeled invented alphabets, establishing a teacher model with robust discriminative features. Second, we extend to historically attested scripts through teacher-student distillation, where the student learns unsupervised representations guided by the teacher's knowledge but free to discover latent cross-script similarities. The asymmetric setup enables the student to learn deformation-invariant embeddings while inheriting discriminative structure from clean examples. Our approach bridges supervised contrastive learning and unsupervised discovery, enabling both hard boundaries between distinct systems and soft similarities reflecting potential historical influences. Experiments on diverse writing systems demonstrate effective few-shot glyph recognition and meaningful script clustering without requiring ground-truth evolutionary relationships.
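Abstract (reference translation): Learning similarity metrics for glyphs and writing systems faces a fundamental challenge: individual graphemes within invented alphabets can be labeled reliably, but the historical relationships between different scripts remain uncertain and contested. We propose a two-stage framework that addresses this epistemological constraint. First, we train an encoder with contrastive loss on labeled invented alphabets, establishing a teacher model with robust discriminative features. Second, we extend to historically attested scripts through teacher-student distillation, in which the student learns unsupervised representations guided by the teacher's knowledge while remaining free to discover latent cross-script similarities. The asymmetric setup lets the student learn deformation-invariant embeddings while inheriting discriminative structure from clean examples. Our approach bridges supervised contrastive learning and unsupervised discovery, enabling both hard boundaries between distinct systems and soft similarities reflecting potential historical influences. Experiments on diverse writing systems demonstrate effective few-shot glyph recognition and meaningful script clustering without requiring ground-truth evolutionary relationships.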
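The two stages described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state the loss formulas, so the SupCon-style contrastive loss, the cosine-based distillation term, the temperature value, the idea that the student sees deformed views while the teacher sees clean ones, and every function name below are assumptions made for illustration only.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (eps guards against zero vectors)."""
    x = np.asarray(x, dtype=float)
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Stage 1 (assumed form): pull same-label glyphs together, push others apart."""
    z = l2_normalize(embeddings)
    labels = np.asarray(labels)
    n = z.shape[0]
    not_self = ~np.eye(n, dtype=bool)

    # Pairwise cosine similarities; self-pairs are masked out of the softmax.
    sim = np.where(not_self, z @ z.T / temperature, -np.inf)
    sim_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - sim_max - np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True))

    # Positives are other samples that share the anchor's label.
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        return 0.0
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float((-pos_log_prob[has_pos] / pos_count[has_pos]).mean())


def distillation_loss(student_embeddings, teacher_embeddings):
    """Stage 2 (assumed form): mean cosine distance between student and teacher.

    Row i of each array is assumed to describe the same glyph: the student
    embeds a deformed view, the frozen teacher embeds the clean one.
    """
    s = l2_normalize(student_embeddings)
    t = l2_normalize(teacher_embeddings)
    return float(np.mean(1.0 - np.sum(s * t, axis=1)))


if __name__ == "__main__":
    # Toy 2-D "glyph embeddings": two tight clusters.
    emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
    print("labels match clusters:", supervised_contrastive_loss(emb, [0, 0, 1, 1]))
    print("labels cut across clusters:", supervised_contrastive_loss(emb, [0, 1, 0, 1]))
    print("student equals teacher:", distillation_loss(emb, emb))
```

In this sketch the contrastive loss is lower when labels agree with the cluster structure, and the distillation term approaches zero as the student's embedding of a deformed glyph aligns with the teacher's embedding of its clean counterpart. A relational variant that matches similarity distributions rather than individual embeddings would leave the student more room to find cross-script structure, but that too would be an assumption about the method.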

Related papers