Fugu-MT Paper Translation (Abstract): How Far Can Chord-Symbol Time-Series Adaptation Carry Genre Identity? Capabilities and Boundaries in Multi-Genre Chord-Symbol Modeling

Paper overview: How Far Can Chord-Symbol Time-Series Adaptation Carry Genre Identity? Capabilities and Boundaries in Multi-Genre Chord-Symbol Modeling

arxiv url: http://arxiv.org/abs/2606.07334v1
Date: Fri, 05 Jun 2026 14:49:24 GMT
Status: Translation complete
Last updated in system: 2026-06-08 14:33:29.791868
Title: How Far Can Chord-Symbol Time-Series Adaptation Carry Genre Identity? Capabilities and Boundaries in Multi-Genre Chord-Symbol Modeling
Authors: Jinju Lee
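Abstract summary: This report treats chord-symbol sequences not as a complete representation of music, but as a controllable time series for genre-local harmonic modeling. The main evaluation compares LoRA, IA3, BitFit, prefix tuning, and full fine-tuning over 11 genres and 3 seeds, a complete 165-cell grid. Chord-symbol adaptation reliably improves genre-local harmonic prediction, but chord symbols alone do not carry complete genre identity.
Reference score (independently computed attention score): 0.0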
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Harmony is a compact symbolic layer where mathematical pitch relations, acoustic consonance, and musical convention meet. This report treats chord-symbol sequences not as a complete representation of music, but as an interpretable, controllable time series for genre-local harmonic modeling. Starting from a frozen pop-jazz Music Transformer checkpoint, I evaluate how far small adaptation interfaces can extend the model to eleven target genres: blues, bossa nova, Bach chorales, country, electronic, folk, funk, gospel, hip-hop, R&B/soul, and rock. The main evaluation compares LoRA, IA3, BitFit, prefix tuning, and full fine-tuning over 11 genres and 3 seeds, a complete 165-cell grid. All five methods improve over the frozen base on held-out chord prediction, with macro gains from +2.89 to +3.61 points; LoRA and IA3 score highest, but Wilcoxon tests with Holm and Benjamini-Hochberg correction do not support a decisive winner. A matched-data-size control sharpens this: when genres are sub-sampled to a common corpus size, IA3 stays on top but LoRA's full-data edge disappears and it falls to last, indicating the small gaps are partly data-driven. A control-token baseline is also strong, and wrong-genre adapters often beat the frozen base, suggesting much of the effect comes from lightweight conditioning over a reusable harmonic base rather than one particular adapter family. Additional diagnostics (rank sweeps, wrong-genre rotation, a base-checkpoint ablation, chord-only genre classification, generated-output statistics, real-song evaluation, and duplicate analysis) support a bounded conclusion: chord-symbol adaptation reliably improves genre-local harmonic prediction, but chord symbols alone do not carry complete genre identity. The report therefore avoids claims about perceived genre authenticity or full musical quality, which require controlled listener or musician evaluation.
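
The evaluation design described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the seed values, the choice of pairwise method comparisons, and the function names `holm_adjust` and `bh_adjust` are assumptions introduced here. It enumerates the 5 methods x 11 genres x 3 seeds grid and shows how Holm and Benjamini-Hochberg adjustments are applied to a list of p-values (for example, from Wilcoxon signed-rank tests, which are not computed here).

```python
from itertools import combinations, product

# Method and genre names are taken from the abstract.
METHODS = ["LoRA", "IA3", "BitFit", "prefix tuning", "full fine-tuning"]
GENRES = [
    "blues", "bossa nova", "Bach chorales", "country", "electronic", "folk",
    "funk", "gospel", "hip-hop", "R&B/soul", "rock",
]
SEEDS = [0, 1, 2]  # hypothetical seed values; the abstract only says "3 seeds"

# One cell per (method, genre, seed) combination.
grid = list(product(METHODS, GENRES, SEEDS))
print(len(grid))  # 165

# Assumption: methods are compared pairwise, giving 10 hypotheses to correct.
method_pairs = list(combinations(METHODS, 2))


def holm_adjust(pvalues):
    """Holm step-down adjusted p-values (controls family-wise error rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (0-based rank) is scaled by (m - rank).
        candidate = min(1.0, (m - rank) * pvalues[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (controls false discovery rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / (1-based rank).
    for rank in range(m - 1, -1, -1):
        idx = order[rank]
        candidate = min(1.0, pvalues[idx] * m / (rank + 1))
        running_min = min(running_min, candidate)  # enforce monotonicity
        adjusted[idx] = running_min
    return adjusted
```

Holm is the more conservative of the two corrections, so a comparison that fails to reach significance under Benjamini-Hochberg will also fail under Holm. This is consistent with the abstract reporting both when stating that no decisive winner emerges.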
