Fugu-MT paper translation (abstract): SciIntBench: Measuring LLM Compliance with Research Integrity Norms Under Adversarial Framing

Paper overview: SciIntBench: Measuring LLM Compliance with Research Integrity Norms Under Adversarial Framing

arxiv url: http://arxiv.org/abs/2605.29468v1
Date: Thu, 28 May 2026 07:00:01 GMT
Status: Translation complete
Last updated in system: 2026-05-30 00:00:30.947837
Title: SciIntBench: Measuring LLM Compliance with Research Integrity Norms Under Adversarial Framing
Title (reference translation): SciIntBench: Measuring LLM Compliance with Research Integrity Norms Under Adversarial Framing
Authors: Almene De Meran Meguimtsop, Maria Leonor Pacheco, Daniel E. Acuna
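Abstract summary: Large language models (LLMs) are increasingly used to support scientific work. It is unclear whether they uphold responsible conduct of research (RCR) norms or help undermine them. SciIntBench is an adversarial benchmark of 810 prompts across ten RCR categories and three scientific domains.
Reference score (independently computed attention score): 6.108996188955891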
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) are increasingly used to support scientific work, but it is unclear whether they uphold responsible conduct of research (RCR) norms or help undermine them. We introduce SciIntBench, an adversarial benchmark of 810 prompts across ten RCR categories and three scientific domains. Each scenario appears as an Overt Adversarial, Covert Adversarial, and Benign version, allowing us to jointly measure framing-sensitive refusal of misconduct and helpfulness on legitimate requests. We evaluate 16 commercial and open-weight LLMs from six providers (2024-2026), producing 12,960 responses. We find that scientific integrity alignment is strongly framing-sensitive: models refuse explicit misconduct far more reliably than covert violations, especially failing when misconduct is presented as a pressure-driven shortcut. Refusals vary by RCR category, with weaker boundaries around transparency, plagiarism, and fabrication.
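Abstract (reference translation): Large language models (LLMs) are increasingly used to support scientific work, but it is unclear whether they uphold responsible conduct of research (RCR) norms. SciIntBench is an adversarial benchmark of 810 prompts across ten RCR categories and three scientific domains. Each scenario appears as an Overt Adversarial, Covert Adversarial, and Benign version. We evaluate 16 commercial and open-weight LLMs from six providers (2024-2026), producing 12,960 responses. Models refuse explicit misconduct far more reliably than covert violations, failing especially when misconduct is presented as a pressure-driven shortcut. Refusals vary by RCR category, with weaker boundaries around transparency, plagiarism, and fabrication.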
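
The counts in the abstract can be cross-checked with a few lines of Python. This is only a sketch of the arithmetic, not the authors' evaluation code: the even spread of prompts over the category, domain, and framing grid is an assumption, the helper `refusal_rate_by_framing` is hypothetical, and the `toy` labels are invented for illustration rather than taken from the paper.

```python
# Counts stated in the abstract.
N_PROMPTS = 810
N_MODELS = 16
N_CATEGORIES = 10
N_DOMAINS = 3
FRAMINGS = ("Overt Adversarial", "Covert Adversarial", "Benign")

# Every model answers every prompt once.
total_responses = N_PROMPTS * N_MODELS

# Each scenario appears in all three framings, so scenarios = prompts / 3.
scenarios = N_PROMPTS // len(FRAMINGS)

# Assumption: prompts are spread evenly over category x domain x framing.
cells = N_CATEGORIES * N_DOMAINS * len(FRAMINGS)
prompts_per_cell, remainder = divmod(N_PROMPTS, cells)


def refusal_rate_by_framing(records):
    """Hypothetical helper: records is an iterable of (framing, refused) pairs."""
    counts = {framing: [0, 0] for framing in FRAMINGS}  # [refusals, total]
    for framing, refused in records:
        counts[framing][0] += int(refused)
        counts[framing][1] += 1
    return {f: r / n for f, (r, n) in counts.items() if n}


# Invented toy labels, not results from the paper.
toy = [
    ("Overt Adversarial", True), ("Overt Adversarial", True),
    ("Covert Adversarial", True), ("Covert Adversarial", False),
    ("Benign", False), ("Benign", False),
]

print(total_responses)                         # 12960
print(scenarios, prompts_per_cell, remainder)  # 270 9 0
print(refusal_rate_by_framing(toy))
```

The product 810 x 16 matches the 12,960 responses reported in the abstract. Under the even-spread assumption, each cell of the grid would hold 9 prompts, but the abstract does not confirm that layout.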

Related papers