Fugu-MT Paper Translation (Abstract): Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives

Paper summary: Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives

arxiv url: http://arxiv.org/abs/2510.26606v1
Date: Thu, 30 Oct 2025 15:35:13 GMT
Status: Translation complete
Last updated in system: 2025-10-31 16:05:09.884496
Title: Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives
Title (reference translation): Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives
Authors: Kentaro Ozeki, Risako Ando, Takanobu Morishita, Hirohiko Abe, Koji Mineshima, Mitsuhiro Okada
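Abstract summary: We evaluate the reasoning capabilities of large language models in the normative domain from both logical and modal perspectives. The results indicate that, although LLMs generally adhere to valid reasoning patterns, they exhibit notable inconsistencies in specific types of normative reasoning.
Reference score (independently computed attention score): 5.120890045747202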
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Normative reasoning is a type of reasoning that involves normative or deontic modality, such as obligation and permission. While large language models (LLMs) have demonstrated remarkable performance across various reasoning tasks, their ability to handle normative reasoning remains underexplored. In this paper, we systematically evaluate LLMs' reasoning capabilities in the normative domain from both logical and modal perspectives. Specifically, to assess how well LLMs reason with normative modals, we make a comparison between their reasoning with normative modals and their reasoning with epistemic modals, which share a common formal structure. To this end, we introduce a new dataset covering a wide range of formal patterns of reasoning in both normative and epistemic domains, while also incorporating non-formal cognitive factors that influence human reasoning. Our results indicate that, although LLMs generally adhere to valid reasoning patterns, they exhibit notable inconsistencies in specific types of normative reasoning and display cognitive biases similar to those observed in psychological studies of human reasoning. These findings highlight challenges in achieving logical consistency in LLMs' normative reasoning and provide insights for enhancing their reliability. All data and code are released publicly at https://github.com/kmineshima/NeuBAROCO.
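Abstract (reference translation): Normative reasoning is a type of reasoning that involves normative or deontic modality, such as obligation and permission. Large language models (LLMs) have shown remarkable performance across various reasoning tasks, but their ability to handle normative reasoning remains underexplored. In this paper, we systematically evaluate the reasoning capabilities of LLMs in the normative domain from both logical and modal perspectives. Specifically, to assess how well LLMs reason with normative modals, we compare their reasoning with normative modals against their reasoning with epistemic modals, which share a common formal structure. To this end, we introduce a new dataset that covers a wide range of formal reasoning patterns in both the normative and epistemic domains while incorporating non-formal cognitive factors that influence human reasoning. The results indicate that, although LLMs generally adhere to valid reasoning patterns, they show notable inconsistencies in specific types of normative reasoning and display cognitive biases similar to those observed in psychological studies of human reasoning. These findings highlight the challenges of achieving logical consistency in LLMs' normative reasoning and provide insights for improving their reliability. All data and code are publicly available at https://github.com/kmineshima/NeuBAROCO.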

Related papers