Fugu-MT Paper Translation (Abstract): Reversal Invariance in Autoregressive Language Models

Paper overview: Reversal Invariance in Autoregressive Language Models

arxiv url: http://arxiv.org/abs/2511.00341v1
Date: Sat, 01 Nov 2025 00:51:46 GMT
Status: Translation complete
Last updated in system: 2025-11-05 16:37:26.72608
Title: Reversal Invariance in Autoregressive Language Models
Title (reference translation): Reversal Invariance in Autoregressive Language Models
Authors: Mihir Sahasrabudhe
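Abstract summary: The next-token prediction loss assigns identical likelihood to a corpus and its reversal, implying that standard CLM pretraining is direction-blind. This symmetry explains why models trained on reversed text can achieve performance comparable to models trained on forward text, despite the inherently time-asymmetric nature of human language and reasoning. We propose viewing pretraining through the lens of temporal asymmetry, motivating future work on loss functions and architectures that explicitly model the arrow of language while retaining standard language modeling capacity.
Reference score (independently computed attention score): 0.15229257192293197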
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We formalize a structural property of the causal (autoregressive) language modeling (CLM) objective: reversal invariance. Formally, the next-token prediction loss assigns identical likelihood to a corpus and its reversal, implying that standard CLM pretraining is direction-blind. This symmetry explains why models trained on reversed text can achieve comparable performance to those trained on forward text, despite the inherently time-asymmetric nature of human language and reasoning. We argue that this invariance represents a limitation of current pretraining objectives rather than a benign artifact. If natural language encodes directional dependencies - phonological, morphological, or causal - a symmetric objective may fail to capture them. We therefore propose viewing pretraining through the lens of temporal asymmetry, motivating future work on loss functions and architectures that explicitly model the arrow of language while retaining standard language modeling capacity.
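Abstract (reference translation): This paper formalizes reversal invariance, a structural property of the causal (autoregressive) language modeling (CLM) objective. Formally, the next-token prediction loss assigns identical likelihood to a corpus and its reversal, implying that standard CLM pretraining is direction-blind. This symmetry explains why models trained on reversed text can achieve performance comparable to models trained on forward text, despite the inherently time-asymmetric nature of human language and reasoning. We argue that this invariance is a limitation of current pretraining objectives rather than a benign artifact. If natural language encodes directional dependencies, whether phonological, morphological, or causal, a symmetric objective may fail to capture them. We therefore propose viewing pretraining through the lens of temporal asymmetry, motivating future work on loss functions and architectures that explicitly model the arrow of language while retaining standard language modeling capacity.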
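
The symmetry described in the abstract can be illustrated with the chain rule of probability: any joint distribution over a sequence factorizes into next-token conditionals read left to right, and equally into previous-token conditionals read right to left, with the same product. The sketch below is an assumption-laden toy, not the paper's formalization (which the abstract does not give): the vocabulary, sequence length, random joint distribution, and the helper names `random_joint`, `marginal`, and `chain_rule_prob` are all hypothetical.

```python
import itertools
import random

VOCAB = (0, 1, 2)   # hypothetical toy vocabulary
LENGTH = 3          # fixed sequence length for the toy example


def random_joint(seed=0):
    """Return a random joint distribution over all sequences of LENGTH tokens."""
    rng = random.Random(seed)
    seqs = list(itertools.product(VOCAB, repeat=LENGTH))
    weights = [rng.random() + 1e-3 for _ in seqs]
    total = sum(weights)
    return {s: w / total for s, w in zip(seqs, weights)}


def marginal(joint, fixed):
    """Probability that the positions in `fixed` (index -> token) take those values."""
    return sum(
        p for s, p in joint.items()
        if all(s[i] == t for i, t in fixed.items())
    )


def chain_rule_prob(joint, seq, order):
    """Multiply the conditionals p(x_i | positions visited so far), following `order`."""
    prob = 1.0
    seen = {}
    for i in order:
        denom = marginal(joint, seen)      # probability of the context so far
        seen[i] = seq[i]
        prob *= marginal(joint, seen) / denom
    return prob


joint = random_joint()
seq = (2, 0, 1)

# Left-to-right factorization: p(x0) p(x1 | x0) p(x2 | x0, x1)
forward = chain_rule_prob(joint, seq, order=range(LENGTH))
# Right-to-left factorization: p(x2) p(x1 | x2) p(x0 | x1, x2)
backward = chain_rule_prob(joint, seq, order=reversed(range(LENGTH)))

print(forward, backward, joint[seq])
```

Both factorizations telescope to the same joint probability, so an objective defined purely by that likelihood cannot distinguish the two reading directions. This shows only the exact-distribution identity; whether a trained network reaches the same loss in both directions is an empirical matter that this sketch does not address.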

Related papers