Fugu-MT Paper Translation (Abstract): What Can String Probability Tell Us About Grammaticality?

Paper overview: What Can String Probability Tell Us About Grammaticality?

arxiv url: http://arxiv.org/abs/2510.16227v1
Date: Fri, 17 Oct 2025 21:36:00 GMT
Status: Translation complete
Last updated in system: 2025-10-25 00:56:38.906638
Title: What Can String Probability Tell Us About Grammaticality?
Title (reference translation): What can the probability of a string tell us about grammar?
Authors: Jennifer Hu, Ethan Gotlieb Wilcox, Siyuan Song, Kyle Mahowald, Roger P. Levy
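Abstract summary: We present a theoretical analysis of the relationship between grammar, meaning, and string probability, based on simple assumptions about the generative process of corpus data. Our framework makes three predictions, which we validate empirically using 280K sentence pairs in English and Chinese. Our analyses give theoretical grounding for using probability to learn about LMs' structural knowledge, and suggest directions for future work in LM grammatical evaluation.
Reference score (independently computed attention score): 11.216210921392637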
License: http://creativecommons.org/licenses/by/4.0/
Abstract: What have language models (LMs) learned about grammar? This question remains hotly debated, with major ramifications for linguistic theory. However, since probability and grammaticality are distinct notions in linguistics, it is not obvious what string probabilities can reveal about an LM's underlying grammatical knowledge. We present a theoretical analysis of the relationship between grammar, meaning, and string probability, based on simple assumptions about the generative process of corpus data. Our framework makes three predictions, which we validate empirically using 280K sentence pairs in English and Chinese: (1) correlation between the probability of strings within minimal pairs, i.e., string pairs with minimal semantic differences; (2) correlation between models' and humans' deltas within minimal pairs; and (3) poor separation in probability space between unpaired grammatical and ungrammatical strings. Our analyses give theoretical grounding for using probability to learn about LMs' structural knowledge, and suggest directions for future work in LM grammatical evaluation.
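Abstract (reference translation): What have language models (LMs) learned about grammar? This question is still hotly debated and has major implications for linguistic theory. However, because probability and grammaticality are distinct concepts in linguistics, it is not obvious what string probabilities reveal about an LM's underlying grammatical knowledge. We theoretically analyze the relationship between grammar, meaning, and string probability, based on simple assumptions about the generative process of corpus data. Our framework makes three predictions, which we validate empirically on 280K sentence pairs in English and Chinese: (1) correlation between the probabilities of the strings within a minimal pair; (2) correlation between models' and humans' deltas within minimal pairs; and (3) poor separation in probability space between unpaired grammatical and ungrammatical strings. Our analyses give theoretical grounding for using probability to learn about LMs' structural knowledge, and suggest directions for future work in LM grammatical evaluation.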

Related papers