Fugu-MT Paper Translation (Abstract): Zero-Shot Imagined Speech Decoding via Imagined-to-Listened MEG Mapping

Paper abstract: Zero-Shot Imagined Speech Decoding via Imagined-to-Listened MEG Mapping

arxiv url: http://arxiv.org/abs/2605.08075v1
Date: Fri, 08 May 2026 17:56:19 GMT
Status: Translation complete
Last updated in system: 2026-05-11 19:43:39.260868
Title: Zero-Shot Imagined Speech Decoding via Imagined-to-Listened MEG Mapping
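Title (reference translation): Zero-Shot Imagined Speech Decoding via Imagined-to-Listened MEG Mapping
Authors: Maryam Maghsoudi, Shihab Shamma
Abstract summary: Decoding imagined speech from non-invasive brain recordings is challenging. This paper proposes a new approach to decoding imagined speech that leverages the richer and more reliably labeled recordings made while listening to speech.
Reference score (independently computed attention score): 0.0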
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Decoding imagined speech from non-invasive brain recordings is challenging because imagined datasets are scarce and difficult to align temporally across subjects and sessions. In this work, we propose a new approach to the decoding of imagined speech that leverages the richer and more reliably labeled recordings during listening to speech. We collected paired listened and imagined MEG recordings to rhythmic melodic and spoken stimuli from trained musicians. Using trained musicians helped improve temporal alignment across conditions. We then developed a three-stage decoding pipeline that revealed consistent and meaningful relationships between neural activity evoked by imagining and listening to the same stimuli. First, we trained six linear and neural models to map imagined MEG responses to listened responses. We evaluated these models against a null baseline from unseen subjects to validate that the predicted-listening responses preserve stimulus-specific information. In the second stage, we trained a contrastive word decoder exclusively on the listened MEG responses, and evaluated it using four embedding strategies including semantic, acoustic, and phonetic representations. In the third stage, we process the imagined MEG responses from held-out subjects through the mapping pipeline to compute the corresponding listening responses that are then decoded by the listened decoder. Using rank-based analysis, we show that the imagined words are decodable significantly above chance. We shall report here the results of a proof-of-concept implementation to decode imagined speech, where all evaluations are performed on held-out subjects. We also demonstrate that performance improves with training data size, suggesting that this approach is scalable and can directly be made applicable to realistic brain-computer interface scenarios.
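Abstract (reference translation): Decoding imagined speech from non-invasive brain recordings is difficult because imagined-speech datasets are scarce and hard to align in time across subjects and sessions. This work proposes a new approach to imagined speech decoding that leverages the richer, more reliably labeled recordings obtained while listening to speech. We collected paired listened and imagined MEG recordings from trained musicians in response to rhythmic melodic and spoken stimuli. Using trained musicians helped improve temporal alignment across conditions. We then developed a three-stage decoding pipeline that reveals consistent and meaningful relationships between the neural activity evoked by imagining and by listening to the same stimuli. First, we trained six linear and neural models to map imagined MEG responses to listened responses. We evaluated these models against a null baseline from unseen subjects to verify that the predicted listening responses preserve stimulus-specific information. In the second stage, we trained a contrastive word decoder exclusively on listened MEG responses and evaluated it with four embedding strategies, including semantic, acoustic, and phonetic representations. In the third stage, imagined MEG responses from held-out subjects are passed through the mapping pipeline to compute the corresponding listening responses, which are then decoded by the listened decoder. Rank-based analysis shows that the imagined words are decodable significantly above chance. We report the results of a proof-of-concept implementation for decoding imagined speech, in which all evaluations are performed on held-out subjects. We also show that performance improves with training data size, suggesting that the approach is scalable and can be applied directly to realistic brain-computer interface scenarios.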
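
The three-stage idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the data are synthetic, the mapping is a plain ridge regression (only one possible linear model, not necessarily one of the six the paper uses), the "listened decoder" is a cosine-similarity lookup against per-word templates standing in for the contrastive word decoder, and the test split holds out trials rather than subjects. All names (`fit_ridge`, `target_ranks`, `simulate`) and all parameter values are assumptions made for illustration.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha I)^-1 X^T Y."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)

def cosine_similarity_matrix(A, B):
    """Pairwise cosine similarity between rows of A and rows of B."""
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    B = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A @ B.T

def target_ranks(pred, candidates, labels):
    """1-based rank of the true candidate for each prediction (1 = best)."""
    sims = cosine_similarity_matrix(pred, candidates)
    true_sim = sims[np.arange(len(labels)), labels]
    return 1 + (sims > true_sim[:, None]).sum(axis=1)

rng = np.random.default_rng(0)
n_words, n_feat, n_train, n_test = 20, 32, 400, 100

# Synthetic stand-in for MEG features: each word has a "listened" template,
# and the "imagined" response is a noisy linear distortion of the listened one.
listened_templates = rng.standard_normal((n_words, n_feat))
mixing = rng.standard_normal((n_feat, n_feat)) / np.sqrt(n_feat)

def simulate(n):
    labels = rng.integers(0, n_words, size=n)
    listened = listened_templates[labels] + 0.5 * rng.standard_normal((n, n_feat))
    imagined = listened @ mixing + 0.5 * rng.standard_normal((n, n_feat))
    return imagined, listened, labels

X_train, Y_train, _ = simulate(n_train)
X_test, _, y_test = simulate(n_test)

# Stage 1: learn the imagined -> listened mapping on paired training data.
W = fit_ridge(X_train, Y_train, alpha=10.0)

# Stage 2 (stand-in): the decoder is built from listened data only; here it is
# simply the set of listened word templates compared by cosine similarity.
# Stage 3: map held-out imagined responses, then decode in listened space.
pred_listened = X_test @ W
ranks = target_ranks(pred_listened, listened_templates, y_test)

mean_rank = ranks.mean()
chance_rank = (n_words + 1) / 2  # expected rank under random guessing
print(f"mean rank {mean_rank:.2f} vs chance {chance_rank:.1f}")
```

A mean rank below `chance_rank` indicates that the mapped responses carry word-specific information. A proper significance test, for example a permutation test over shuffled labels, would be needed to support an "above chance" claim like the one in the abstract.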

Related papers