Fugu-MT Paper Translation (Abstract): Biasless Language Models Learn Unnaturally: How LLMs Fail to Distinguish the Possible from the Impossible

Paper summary: Biasless Language Models Learn Unnaturally: How LLMs Fail to Distinguish the Possible from the Impossible

arxiv url: http://arxiv.org/abs/2510.07178v1
Date: Wed, 08 Oct 2025 16:17:13 GMT
Status: Translation complete
Last updated in system: 2025-10-09 16:41:20.61699
Title: Biasless Language Models Learn Unnaturally: How LLMs Fail to Distinguish the Possible from the Impossible
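Title (reference translation): Biasless Language Models Learn Unnaturally: How LLMs Fail to Distinguish the Possible from the Impossible
Authors: Imry Ziv, Nur Lan, Emmanuel Chemla, Roni Katzir
Abstract summary: GPT-2 learns each language and its impossible counterpart equally easily. By considering cross-linguistic variance in various metrics computed on the perplexity curves, the authors show that GPT-2 provides no systematic separation between the possible and the impossible.
Reference score (independently computed attention score): 4.7831562043724665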
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Are large language models (LLMs) sensitive to the distinction between humanly possible languages and humanly impossible languages? This question is taken by many to bear on whether LLMs and humans share the same innate learning biases. Previous work has attempted to answer it in the positive by comparing LLM learning curves on existing language datasets and on "impossible" datasets derived from them via various perturbation functions. Using the same methodology, we examine this claim on a wider set of languages and impossible perturbations. We find that in most cases, GPT-2 learns each language and its impossible counterpart equally easily, in contrast to previous claims. We also apply a more lenient condition by testing whether GPT-2 provides any kind of separation between the whole set of natural languages and the whole set of impossible languages. By considering cross-linguistic variance in various metrics computed on the perplexity curves, we show that GPT-2 provides no systematic separation between the possible and the impossible. Taken together, these perspectives show that LLMs do not share the human innate biases that shape linguistic typology.
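Abstract (reference translation): Are large language models (LLMs) sensitive to the distinction between languages that are possible for humans and languages that are impossible for humans? Many take this question to bear on whether LLMs and humans share the same innate learning biases. Previous work has tried to answer it affirmatively by comparing LLM learning curves on existing language datasets with curves on "impossible" datasets derived from them through various perturbation functions. Using the same methodology, we test this claim on a wider set of languages and impossible perturbations. In most cases, contrary to previous claims, GPT-2 learns each language and its impossible counterpart equally easily. We also apply a more lenient condition by testing whether GPT-2 provides any kind of separation between the whole set of natural languages and the whole set of impossible languages. By considering cross-linguistic variance in various metrics computed on the perplexity curves, we show that GPT-2 provides no systematic separation between the possible and the impossible. Taken together, these perspectives show that LLMs do not share the innate human biases that shape linguistic typology.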
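
The comparison described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: `reverse_tokens` is a hypothetical stand-in for one perturbation function, `area_under_curve` is one possible metric over a perplexity curve, and `separation_gap` is an assumed criterion for whether two sets of languages are separated. None of these names or choices come from the paper.

```python
from typing import Sequence


def reverse_tokens(sentence: str) -> str:
    """Hypothetical 'impossible' perturbation: reverse whitespace-separated tokens."""
    return " ".join(reversed(sentence.split()))


def area_under_curve(steps: Sequence[float], perplexities: Sequence[float]) -> float:
    """Trapezoidal area under a perplexity-vs-training-step curve.

    Assumed metric: a smaller area means the model reached low perplexity sooner.
    """
    if len(steps) != len(perplexities) or len(steps) < 2:
        raise ValueError("need two or more points, with matching lengths")
    return sum(
        (steps[i + 1] - steps[i]) * (perplexities[i] + perplexities[i + 1]) / 2
        for i in range(len(steps) - 1)
    )


def separation_gap(natural: Sequence[float], impossible: Sequence[float]) -> float:
    """Assumed separation criterion over one metric where higher means harder to learn.

    A positive gap means every impossible language scored higher than every
    natural language; a gap of zero or below means the two sets overlap.
    """
    return min(impossible) - max(natural)
```

Under these assumptions, one would compute `area_under_curve` once per language, collect the values for the natural set and the perturbed set, and pass both to `separation_gap`. A non-positive result would correspond to the "no systematic separation" outcome the abstract reports, though the paper's own metrics and criterion may differ.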


Related papers