Fugu-MT Paper Translation (Abstract): Causal Masking on Spatial Data: An Information-Theoretic Case for Learning Spatial Datasets with Unimodal Language Models

Paper overview: Causal Masking on Spatial Data: An Information-Theoretic Case for Learning Spatial Datasets with Unimodal Language Models

arxiv url: http://arxiv.org/abs/2510.27009v1
Date: Thu, 30 Oct 2025 21:22:36 GMT
Status: Translation complete
Last updated in system: 2025-11-03 17:52:15.917176
Title: Causal Masking on Spatial Data: An Information-Theoretic Case for Learning Spatial Datasets with Unimodal Language Models
Title (reference translation): Causal Masking on Spatial Data: An Information-Theoretic Case for Learning Spatial Datasets with Unimodal Language Models
Authors: Jared Junkin, Samuel Nathanson
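Abstract summary: We train language models with bidirectional and causal self-attention mechanisms on both spatial and sequential data. Our results show that models trained on spatial board states, even with causal masking, consistently achieve stronger playing strength than models trained on sequential data.
Reference score (independently computed attention score): 0.6015898117103068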
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Language models are traditionally designed around causal masking. In domains with spatial or relational structure, causal masking is often viewed as inappropriate, and sequential linearizations are instead used. Yet the question of whether it is viable to accept the information loss introduced by causal masking on nonsequential data has received little direct study, in part because few domains offer both spatial and sequential representations of the same dataset. In this work, we investigate this issue in the domain of chess, which naturally supports both representations. We train language models with bidirectional and causal self-attention mechanisms on both spatial (board-based) and sequential (move-based) data. Our results show that models trained on spatial board states, even with causal masking, consistently achieve stronger playing strength than models trained on sequential data. While our experiments are conducted on chess, our results are methodological and may have broader implications: applying causal masking to spatial data is a viable procedure for training unimodal LLMs on spatial data, and in some domains is even preferable to sequentialization.
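Abstract (reference translation): Language models are traditionally designed around causal masking. In domains with spatial or relational structure, causal masking is often regarded as inappropriate, and sequential linearizations are used instead. However, whether the information loss introduced by causal masking on nonsequential data is acceptable has received little direct study, in part because few domains offer both spatial and sequential representations of the same dataset. In this work, we examine this question in the domain of chess, which naturally supports both representations. We train language models with bidirectional and causal self-attention mechanisms on both spatial (board-based) and sequential (move-based) data. Our results show that models trained on spatial board states, even with causal masking, consistently achieve stronger playing strength than models trained on sequential data. Applying causal masking to spatial data is a viable procedure for training unimodal LLMs on spatial data, and in some domains is preferable to sequentialization.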
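
The "information loss" the abstract attributes to causal masking can be made concrete with a small sketch. The snippet below is illustrative only and is not the authors' code: it assumes a board flattened into 64 tokens (one per square, in some fixed order such as row-major), and the helper names `attention_mask` and `visible_fraction` are hypothetical. It compares how many token-to-token attention links a bidirectional mask and a causal mask allow.

```python
# Illustrative sketch, not the paper's implementation.
# Assumption: a chess board is flattened into 64 tokens, one per square.

def attention_mask(n_tokens: int, causal: bool) -> list[list[bool]]:
    """Return mask[i][j], True when token i may attend to token j."""
    return [
        # Causal: token i sees only positions 0..i. Bidirectional: sees all.
        [(j <= i) if causal else True for j in range(n_tokens)]
        for i in range(n_tokens)
    ]

def visible_fraction(mask: list[list[bool]]) -> float:
    """Fraction of all (query, key) pairs that the mask allows."""
    total = len(mask) * len(mask[0])
    allowed = sum(sum(row) for row in mask)
    return allowed / total

N_SQUARES = 64  # assumed token count for a flattened 8x8 board
causal_mask = attention_mask(N_SQUARES, causal=True)
bidir_mask = attention_mask(N_SQUARES, causal=False)

print(visible_fraction(bidir_mask))   # 1.0
print(visible_fraction(causal_mask))  # 0.5078125 (2080 of 4096 pairs)
```

Under these assumptions the causal mask removes roughly half of the pairwise links: the first square in the ordering sees only itself, while the last sees the whole board. The abstract's claim is that models trained on board states under this restriction still outperform models trained on move sequences.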

Related papers