Fugu-MT Paper Translation (Abstract): ProMoral-Bench: Evaluating Prompting Strategies for Moral Reasoning and Safety in LLMs

Paper overview: ProMoral-Bench: Evaluating Prompting Strategies for Moral Reasoning and Safety in LLMs

arxiv url: http://arxiv.org/abs/2602.13274v1
Date: Thu, 05 Feb 2026 10:07:39 GMT
Status: Translation complete
Last updated in system: 2026-02-23 12:01:13.607197
Title: ProMoral-Bench: Evaluating Prompting Strategies for Moral Reasoning and Safety in LLMs
Title (reference translation): ProMoral-Bench: Evaluation of Prompting Strategies for Moral Reasoning and Safety in LLMs
Authors: Rohan Subramanian Thomas, Shikhar Shiromani, Abdullah Chaudhry, Ruizhe Li, Vasu Sharma, Kevin Zhu, Sunishchal Dev
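Abstract summary: ProMoral-Bench is a unified benchmark that evaluates 11 prompting paradigms across four large language model (LLM) families. Using ETHICS, Scruples, WildJailbreak, and a new robustness test, ETHICS-Contrast, it measures performance with the proposed Unified Moral Safety Score (UMSS). The results show that compact, exemplar-guided scaffolds outperform complex multi-stage reasoning, yielding higher UMSS scores and greater robustness at a lower token cost.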
Reference score (independently computed attention score): 8.459191693233148
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Prompt design significantly impacts the moral competence and safety alignment of large language models (LLMs), yet empirical comparisons remain fragmented across datasets and models. We introduce ProMoral-Bench, a unified benchmark evaluating 11 prompting paradigms across four LLM families. Using ETHICS, Scruples, WildJailbreak, and our new robustness test, ETHICS-Contrast, we measure performance via our proposed Unified Moral Safety Score (UMSS), a metric balancing accuracy and safety. Our results show that compact, exemplar-guided scaffolds outperform complex multi-stage reasoning, providing higher UMSS scores and greater robustness at a lower token cost. While multi-turn reasoning proves fragile under perturbations, few-shot exemplars consistently enhance moral stability and jailbreak resistance. ProMoral-Bench establishes a standardized framework for principled, cost-effective prompt engineering.
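Abstract (reference translation): ProMoral-Bench is a unified benchmark that evaluates 11 prompting paradigms across four LLM families. Using ETHICS, Scruples, WildJailbreak, and a new robustness test, ETHICS-Contrast, performance is measured with the Unified Moral Safety Score (UMSS), a metric that balances accuracy and safety. The results show that compact, exemplar-guided scaffolds outperform complex multi-stage reasoning, yielding higher UMSS scores and greater robustness at a lower token cost. Multi-turn reasoning proves fragile under perturbations, whereas few-shot exemplars consistently improve moral stability and jailbreak resistance. ProMoral-Bench establishes a standardized framework for principled, cost-effective prompt engineering.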

Related papers