Fugu-MT Paper Translation (Abstract): Fine-tuning vs. In-context Learning in Large Language Models: A Formal Language Learning Perspective

Paper abstract: Fine-tuning vs. In-context Learning in Large Language Models: A Formal Language Learning Perspective

arxiv url: http://arxiv.org/abs/2604.23267v1
Date: Sat, 25 Apr 2026 12:19:25 GMT
Status: Translation complete
System update date: 2026-04-28 17:12:07.236259
Title: Fine-tuning vs. In-context Learning in Large Language Models: A Formal Language Learning Perspective
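Title (reference translation): Fine-tuning and In-context Learning in Large Language Models: From the Perspective of Formal Language Learning
Authors: Bishwamittra Ghosh, Soumi Das, Till Speicher, Qinyuan Wu, Mohammad Aflah Khan, Deepak Garg, Krishna P. Gummadi, Evimaria Terzi
Abstract summary: Large language models (LLMs) operate in two fundamental learning modes: fine-tuning (FT) and in-context learning (ICL). Prior studies comparing FT and ICL have yielded mixed and inconclusive results due to inconsistent experimental setups. This paper proposes a formal language learning task that offers precise language boundaries, controlled string sampling, and no data contamination.
Reference score (independently computed attention score): 15.30367035674579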
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) operate in two fundamental learning modes - fine-tuning (FT) and in-context learning (ICL) - raising key questions about which mode yields greater language proficiency and whether they differ in their inductive biases. Prior studies comparing FT and ICL have yielded mixed and inconclusive results due to inconsistent experimental setups. To enable a rigorous comparison, we propose a formal language learning task - offering precise language boundaries, controlled string sampling, and no data contamination - and introduce a discriminative test for language proficiency, where an LLM succeeds if it assigns higher generation probability to in-language strings than to out-of-language strings. Empirically, we find that: (a) FT has greater language proficiency than ICL on in-distribution generalization, but both perform equally well on out-of-distribution generalization. (b) Their inductive biases, measured by the correlation in string generation probabilities, are similar when both modes partially learn the language but diverge at higher proficiency levels. (c) Unlike FT, ICL performance differs substantially across models of varying sizes and families and is sensitive to the token vocabulary of the language. Thus, our work demonstrates the promise of formal languages as a controlled testbed for evaluating LLM behaviors that are difficult to isolate in natural language datasets. Our source code is available at https://github.com/bishwamittra/formallm.
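Abstract (reference translation): Large language models (LLMs) operate in two fundamental learning modes, fine-tuning (FT) and in-context learning (ICL), which raises key questions about which mode yields greater language proficiency and whether the two differ in their inductive biases. Prior studies comparing FT and ICL have produced mixed and inconclusive results because of inconsistent experimental setups. To enable a rigorous comparison, we propose a formal language learning task with precise language boundaries, controlled string sampling, and no data contamination, and we introduce a discriminative test for language proficiency in which an LLM succeeds if it assigns higher generation probability to in-language strings than to out-of-language strings. Empirically, we find the following. (a) FT has greater language proficiency than ICL on in-distribution generalization, but both perform equally well on out-of-distribution generalization. (b) Their inductive biases, measured by the correlation in string generation probabilities, are similar when both modes have only partially learned the language, but diverge at higher proficiency levels. (c) Unlike FT, ICL performance differs substantially across models of varying sizes and families and is sensitive to the token vocabulary of the language. This work thus shows that formal languages are a promising controlled testbed for evaluating LLM behaviors that are difficult to isolate in natural language datasets. The source code is available at https://github.com/bishwamittra/formallm.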
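
The discriminative test and the correlation-based comparison of inductive biases described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state how the comparison is aggregated or which correlation measure is used, so the pairwise win rate, the Pearson correlation, the function names, and the log-probability values below are all assumptions chosen for illustration.

```python
from itertools import product
from math import sqrt


def pairwise_proficiency(in_logps, out_logps):
    """Fraction of (in-language, out-of-language) pairs in which the
    in-language string gets the strictly higher log-probability.
    ASSUMPTION: pairwise aggregation; the abstract only states the
    success criterion, not how it is averaged."""
    pairs = list(product(in_logps, out_logps))
    wins = sum(1 for a, b in pairs if a > b)
    return wins / len(pairs)


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists.
    ASSUMPTION: the abstract says 'correlation' without naming the type."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical log-probabilities a model assigns to a few strings.
in_language = [-2.0, -3.0, -5.0]
out_of_language = [-4.0, -6.0]
print(pairwise_proficiency(in_language, out_of_language))

# Hypothetical scores for the same strings under FT and under ICL.
ft_scores = [-2.0, -3.0, -5.0, -4.0, -6.0]
icl_scores = [-2.5, -3.5, -4.5, -4.0, -7.0]
print(pearson(ft_scores, icl_scores))
```

A win rate near 1.0 would indicate that the model separates in-language from out-of-language strings, and a correlation near 1.0 between the FT and ICL scores would indicate that the two modes rank strings similarly under these assumptions.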
