Fugu-MT Paper Translation (Abstract): Property-Guided LLM Program Synthesis for Planning

Paper overview: Property-Guided LLM Program Synthesis for Planning

arxiv url: http://arxiv.org/abs/2605.16142v2
Date: Mon, 18 May 2026 14:26:59 GMT
Status: Translation complete
Last updated in system: 2026-05-19 17:57:46.197721
Title: Property-Guided LLM Program Synthesis for Planning
Title (reference translation): Property-Guided LLM Program Synthesis for Planning
Authors: André G. Pereira, Augusto B. Corrêa, Jendrik Seipp
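Abstract summary: Instead of scoring programs after evaluation, the authors check whether a candidate satisfies a formally defined property. The resulting counterexample feedback drastically reduces both the number of program generations and the evaluation cost.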
Reference score (independently computed attention score): 6.88204255655161
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: LLMs have shown impressive success in program synthesis, discovering programs that surpass prior solutions. However, these approaches rely on simple numeric scores to signal program quality, such as the value of the solution or the number of passed tests. Because a score offers no guidance on why a program failed, the system must generate and evaluate many candidates hoping some succeed, increasing LLM inference and evaluation costs. We study a different approach: property-guided LLM program synthesis. Instead of scoring programs after evaluation, we check whether a candidate satisfies a formally defined property. When the property is violated, we stop the evaluation early and provide the LLM with a concrete counterexample showing exactly how the program failed. This feedback drastically reduces both the number of program generations and the evaluation cost, and can guide the LLM to generate stronger programs. We evaluate this approach on PDDL planning domains, asking the LLM to synthesize direct heuristic functions: every state reachable by strictly improving transitions has a strictly improving successor. A heuristic with this property leads a hill-climbing algorithm directly to a goal state. A counterexample-guided repair loop generates one candidate program, checks the property over a training set, and returns the first case that violates the property. We evaluate our approach on ten planning domains with an out-of-distribution test set. The synthesized heuristics are effectively direct on virtually all test tasks, and compared to the best prior generation method our approach generates seven times fewer programs per domain on average, solves more tasks without using search, and requires several orders of magnitude less computation to evaluate candidates. Whenever a problem admits a verifiable property, property-guided LLM synthesis can reduce cost and improve program quality.
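Abstract (reference translation): LLMs have achieved remarkable success in program synthesis, discovering programs that surpass earlier solutions. However, these approaches rely on simple numeric scores, such as the value of a solution or the number of tests passed, to indicate program quality. Because a score gives no guidance on why a program failed, the system has to generate and evaluate many candidates in the hope that some succeed, which raises both LLM inference cost and evaluation cost. The authors study a different approach, property-guided LLM program synthesis. Instead of scoring programs after evaluation, they check whether a candidate satisfies a formally defined property. When the property is violated, evaluation stops early and the LLM receives a concrete counterexample showing exactly how the program failed. This feedback greatly reduces both the number of generated programs and the evaluation cost, and can steer the LLM toward stronger programs. The approach is evaluated on PDDL planning domains, where the LLM is asked to synthesize direct heuristic functions: every state reachable by strictly improving transitions has a strictly improving successor. A heuristic with this property leads a hill-climbing algorithm directly to a goal state. A counterexample-guided repair loop generates one candidate program, checks the property over a training set, and returns the first case that violates it. The approach is evaluated on ten planning domains with an out-of-distribution test set. The synthesized heuristics are effectively direct on virtually all test tasks. Compared to the best prior generation method, the approach generates seven times fewer programs per domain on average, solves more tasks without search, and needs several orders of magnitude less computation to evaluate candidates. Whenever a problem admits a verifiable property, property-guided LLM synthesis can reduce cost and improve program quality.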
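The property check and repair loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`find_counterexample`, `hill_climb`, `repair_loop`), the toy number-line task, and the stub proposer standing in for the LLM are all hypothetical. The sketch also assumes that goal states are exempt from the "strictly improving successor" requirement and that the state space is finite.

```python
from typing import Callable, Hashable, Iterable, List, Optional, Tuple

State = Hashable
Heuristic = Callable[[State], float]
# A task is (initial state, successor function, goal test); hypothetical layout.
Task = Tuple[State, Callable[[State], Iterable[State]], Callable[[State], bool]]


def find_counterexample(init, successors, is_goal, h) -> Optional[State]:
    """Return a non-goal state, reachable via strictly improving transitions,
    that has no strictly improving successor; None if the property holds."""
    stack, seen = [init], {init}
    while stack:
        s = stack.pop()
        if is_goal(s):  # assumption: goal states need no improving successor
            continue
        improving = [t for t in successors(s) if h(t) < h(s)]
        if not improving:
            return s  # stop early: this state violates the property
        for t in improving:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return None


def hill_climb(init, successors, is_goal, h, max_steps=10_000) -> Optional[List[State]]:
    """Greedy descent on h without backtracking; returns the path or None."""
    s, path = init, [init]
    for _ in range(max_steps):
        if is_goal(s):
            return path
        better = [t for t in successors(s) if h(t) < h(s)]
        if not better:
            return None  # stuck in a local minimum
        s = min(better, key=h)
        path.append(s)
    return None


def repair_loop(propose, tasks: List[Task], max_rounds=5) -> Optional[Heuristic]:
    """Ask `propose` (a stand-in for the LLM) for one candidate per round,
    feeding back the first violating (task index, state) pair, if any."""
    feedback = None
    for _ in range(max_rounds):
        h = propose(feedback)
        feedback = None
        for task_id, (init, successors, is_goal) in enumerate(tasks):
            bad = find_counterexample(init, successors, is_goal, h)
            if bad is not None:
                feedback = (task_id, bad)
                break  # first violation only
        if feedback is None:
            return h
    return None


# Toy task (hypothetical): walk on the integers 0..10 from 3 to the goal 7.
def successors_line(n: int) -> List[int]:
    return [m for m in (n - 1, n + 1) if 0 <= m <= 10]


def is_goal_line(n: int) -> bool:
    return n == 7


def h_bad(n: int) -> float:
    return n  # descends toward 0 and gets stuck there


def h_good(n: int) -> float:
    return abs(n - 7)  # distance to the goal


def make_stub_proposer(candidates):
    """Replay fixed candidates and record the feedback each call receives."""
    it, log = iter(candidates), []

    def propose(feedback):
        log.append(feedback)
        return next(it)

    return propose, log
```

With the toy task, `h_bad` is rejected at state 0, and that state is what the next proposal call receives as feedback. `h_good` passes the check, so hill climbing with it reaches the goal without search.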
