Fugu-MT Paper Translation (Abstract): Using Probabilistic Programs to Train Inductive Reasoning in Large Language Models

Paper summary: Using Probabilistic Programs to Train Inductive Reasoning in Large Language Models

arxiv url: http://arxiv.org/abs/2606.09856v1
Date: Tue, 26 May 2026 15:40:02 GMT
Status: Translation complete
Last updated in system: 2026-06-15 07:09:36.812438
Title: Using Probabilistic Programs to Train Inductive Reasoning in Large Language Models
Title (reference translation): Training Inductive Reasoning in Large Language Models Using Probabilistic Programs
Authors: Liyi Zhang, Akshay K. Jagadish, Brenden M. Lake, Thomas L. Griffiths
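Abstract summary: Many real-world reasoning problems are inductive: agents must infer uncertain beliefs from sparse, ambiguous observations. Using standard fine-tuning methods for inductive reasoning poses challenges. To address these limitations, the authors introduce Program-based Posterior Training (PPT).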
Reference score (independently computed attention score): 10.489066116287221
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Post-training Large Language Models (LLMs) for reasoning typically focuses on deductive tasks such as mathematics and coding where correctness is verifiable. Yet, many real-world reasoning problems are inductive: agents must infer uncertain beliefs from sparse, ambiguous observations. There are challenges to using standard fine-tuning methods for inductive reasoning, including difficulties in curating large-scale, high-quality labeled datasets and in handling targets that are inherently distributional. In this work, we introduce a novel approach, called Program-based Posterior Training (PPT), to address these limitations: we use an LLM to generate diverse open-world scenarios as probabilistic programs, run probabilistic inference to produce distributional target responses to queries, and then fine-tune on these probabilistic soft labels. Using this approach, we fine-tune LLMs on 10,000 programmatically generated scenarios and evaluate on held-out motifs, human-labeled judgments, and external benchmarks. Overall, PPT substantially improves estimation accuracy on held-out inductive tasks, increases alignment with human judgments, and transfers to external benchmarks for estimation and calibration. Additionally, the gains in raw calibration are not subsumed by post-hoc temperature scaling, showing that the models have more deeply internalized uncertainty compared to output rescaling. Together, these results suggest that probabilistic-program-mediated fine-tuning is a promising approach for post-training LLMs to reliably perform approximate inductive inference.
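Abstract (reference translation): Post-training of large language models (LLMs) for reasoning typically focuses on deductive tasks such as mathematics and coding, where correctness is verifiable. However, many real-world reasoning problems are inductive: agents must infer uncertain beliefs from sparse, ambiguous observations. Using standard fine-tuning methods for inductive reasoning poses challenges, including the difficulty of curating large-scale, high-quality labeled datasets and of handling targets that are inherently distributional. This work introduces a new approach called Program-based Posterior Training (PPT) to address these limitations: an LLM generates diverse open-world scenarios as probabilistic programs, probabilistic inference produces distributional target responses to queries, and the model is fine-tuned on these probabilistic soft labels. Using this approach, LLMs are fine-tuned on 10,000 programmatically generated scenarios and evaluated on held-out motifs, human-labeled judgments, and external benchmarks. Overall, PPT substantially improves estimation accuracy on held-out inductive tasks, increases alignment with human judgments, and transfers to external benchmarks for estimation and calibration. Furthermore, the gains in raw calibration are not subsumed by post-hoc temperature scaling, indicating that the models have internalized uncertainty more deeply than output rescaling alone would achieve. These results suggest that probabilistic-program-mediated fine-tuning is a promising approach for post-training LLMs to reliably perform approximate inductive inference.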
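The pipeline described in the abstract (a scenario written as a probabilistic program, inference that yields a distributional target, then fine-tuning on that soft label) can be illustrated with a very small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the jar scenario, the names `PRIOR`, `P_RED`, `posterior`, and `soft_label_cross_entropy`, and the use of exact enumeration as the inference method. The paper's actual programs, inference procedure, and training loss may differ.

```python
import math

# Hypothetical toy scenario (not from the paper): a jar is either mostly red
# or mostly blue, and we observe a few draws with replacement.
PRIOR = {"mostly_red": 0.5, "mostly_blue": 0.5}
P_RED = {"mostly_red": 0.8, "mostly_blue": 0.2}  # P(draw is red | hypothesis)


def posterior(observations):
    """Exact posterior over hypotheses by enumeration (Bayes' rule)."""
    unnormalized = {}
    for hypothesis, prior in PRIOR.items():
        likelihood = 1.0
        for obs in observations:
            p_red = P_RED[hypothesis]
            likelihood *= p_red if obs == "red" else 1.0 - p_red
        unnormalized[hypothesis] = prior * likelihood
    total = sum(unnormalized.values())
    return {h: w / total for h, w in unnormalized.items()}


def soft_label_cross_entropy(target, predicted):
    """Cross-entropy between a soft target distribution and a prediction."""
    return -sum(p * math.log(predicted[h]) for h, p in target.items() if p > 0)


# The posterior plays the role of the distributional training target.
target = posterior(["red", "red", "blue"])

# Compare a maximally uncertain prediction with one that matches the target.
uniform_guess = {"mostly_red": 0.5, "mostly_blue": 0.5}
loss_uniform = soft_label_cross_entropy(target, uniform_guess)
loss_matched = soft_label_cross_entropy(target, target)

print(target)
print(loss_uniform, loss_matched)
```

In this toy setting the loss is smallest when the predicted distribution equals the posterior, which is the sense in which a soft label rewards calibrated uncertainty rather than a single hard answer.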


Related papers