Fugu-MT Paper Translation (Abstract): Zero-Shot Goal Recognition with Large Language Models

Paper overview: Zero-Shot Goal Recognition with Large Language Models

arxiv url: http://arxiv.org/abs/2605.15333v1
Date: Thu, 14 May 2026 18:56:06 GMT
Status: Translation complete
Last updated in system: 2026-05-18 21:22:26.06341
Title: Zero-Shot Goal Recognition with Large Language Models
Title (reference translation): Zero-Shot Goal Recognition Using Large Language Models
Authors: Kin Max Piamolini Gusmão, Nathan Gavenski, Nir Oren, Felipe Meneguzzi
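Abstract summary: Large language models have reached near-parity with classical planners on well-known planning domains. Goal recognition is a complementary abductive task that is well suited to LLM strengths. This paper presents a zero-shot evaluation of frontier LLMs as goal recognisers on key PDDL benchmarks.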
Reference score (independently computed attention score): 6.023276947115864
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models have recently reached near-parity with classical planners on well-known planning domains, yet this competence relies on world-knowledge exploitation rather than genuine symbolic reasoning. Goal recognition is a complementary abductive task structurally better suited to LLM strengths: it consists of evaluating consistency with world knowledge rather than generating novel action sequences. This paper provides the first systematic zero-shot evaluation of frontier LLMs as goal recognisers on key classical PDDL benchmarks. Our results show that LLM competence on goal recognition is uneven: some models scale with evidence and approach landmark-based accuracy at full observations, while others remain anchored to world-knowledge priors regardless of how much evidence accumulates. Qualitative analysis of model reasoning traces reveals that this divergence reflects a fundamental difference in evidence integration rather than domain familiarity. These findings position goal recognition as a principled benchmark for the foundational planning knowledge of LLMs.
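Abstract (reference translation): Large language models have reached near-parity with classical planners on well-known planning domains, but this competence relies on exploiting world knowledge rather than on genuine symbolic reasoning. Goal recognition is a complementary abductive task that fits LLM strengths well: it evaluates consistency with world knowledge instead of generating new action sequences. This paper presents the first systematic zero-shot evaluation of frontier LLMs as goal recognisers on classical PDDL benchmarks. The results indicate that LLM goal recognition ability is uneven: some models scale with evidence and approach landmark-based accuracy at full observations, while others remain anchored to world-knowledge priors regardless of how much evidence accumulates. Qualitative analysis of model reasoning traces shows that this divergence reflects a fundamental difference in evidence integration rather than domain familiarity. These findings position goal recognition as a principled benchmark for the foundational planning knowledge of LLMs.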

Related papers