Fugu-MT Paper Translation (Abstract): Training chord recognition models on artificially generated audio

Paper overview: Training chord recognition models on artificially generated audio

arxiv url: http://arxiv.org/abs/2508.05878v1
Date: Thu, 07 Aug 2025 22:01:58 GMT
Status: Translation complete
Last updated in system: 2025-08-11 20:39:06.020981
Title: Training chord recognition models on artificially generated audio
Title (reference translation): Training chord recognition models with artificial audio
Authors: Martyna Majchrzak, Jacek Mańdziuk
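Abstract summary: This study compares two Transformer-based neural network models for chord sequence recognition in audio recordings. The experiments show that, despite differences in complexity and structure between artificially generated and human-composed music, the former can be useful in certain scenarios.
Reference score (independently computed attention measure): 0.0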
License: http://creativecommons.org/licenses/by/4.0/
Abstract: One of the challenging problems in Music Information Retrieval is the acquisition of enough non-copyrighted audio recordings for model training and evaluation. This study compares two Transformer-based neural network models for chord sequence recognition in audio recordings and examines the effectiveness of using an artificially generated dataset for this purpose. The models are trained on various combinations of Artificial Audio Multitracks (AAM), Schubert's Winterreise Dataset, and the McGill Billboard Dataset and evaluated with three metrics: Root, MajMin and Chord Content Metric (CCM). The experiments prove that even though there are certainly differences in complexity and structure between artificially generated and human-composed music, the former can be useful in certain scenarios. Specifically, AAM can enrich a smaller training dataset of music composed by a human or can even be used as a standalone training set for a model that predicts chord sequences in pop music, if no other data is available.
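Abstract (reference translation): One challenge in music information retrieval is obtaining enough non-copyrighted audio recordings for model training and evaluation. This study compares two Transformer-based neural network models for chord sequence recognition in audio recordings and examines the effectiveness of an artificially generated dataset for this purpose. The models are trained on various combinations of Artificial Audio Multitracks (AAM), Schubert's Winterreise Dataset, and the McGill Billboard Dataset, and are evaluated with three metrics: Root, MajMin, and Chord Content Metric (CCM). The experiments show that, although artificially generated and human-composed music clearly differ in complexity and structure, the former can be useful in certain scenarios. Specifically, AAM can enrich a smaller training dataset of human-composed music, or, when no other data is available, serve as a standalone training set for a model that predicts chord sequences in pop music.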

Related papers