Fugu-MT Paper Translation (Abstract): Can Factual Opinions Be Edited (Manipulated) in Large Language Models?

Paper summary: Can Factual Opinions Be Edited (Manipulated) in Large Language Models?

arxiv url: http://arxiv.org/abs/2606.03096v1
Date: Tue, 02 Jun 2026 03:35:00 GMT
Status: Translation complete
Last updated in system: 2026-06-03 22:00:04.732082
Title: Can Factual Opinions Be Edited (Manipulated) in Large Language Models?
Title (reference translation): Is It Possible to Edit (Manipulate) Factual Opinions in Large Language Models?
Authors: Yuanpu Cao, Ziyi Yin, Fenglong Ma, Jinghui Chen
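Abstract summary: Large Language Models (LLMs) are increasingly integrated into various domains, making knowledge editing techniques crucial yet potentially hazardous. Current editing methods primarily target atomic facts, overlooking the significant risks associated with manipulating factual opinions, e.g., documented stances of public figures on societal issues. This paper introduces the Factual Opinion Editing with Evidence benchmark, which encompasses 261 public figures, 19 issue categories, and 2,178 complete opinion records. The evaluations show that current editing techniques struggle significantly with factual opinions, often achieving only superficial changes while failing to preserve consistency between the edited opinion and the supporting evidence generated by the model.
Reference score (independently calculated attention level): 49.225790715935204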
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) are increasingly integrated into various domains, making knowledge editing techniques crucial yet potentially hazardous. Current editing methods primarily target atomic facts, overlooking the significant risks associated with manipulating factual opinions, e.g., documented stances of public figures on societal issues. Such manipulation could reshape public images, influence elections, and alter societal views. To systematically assess this threat, we introduce the Factual Opinion Editing with Evidence (FOE) benchmark, which encompasses 261 public figures, 19 issue categories, and 2,178 complete opinion records. Our evaluations demonstrate that current editing techniques struggle significantly with factual opinions, often achieving only superficial changes while failing to preserve consistency between the edited opinion and the supporting evidence generated by the model. To address this limitation, we further propose a simple yet effective Self-Generated Evidence-Aligned method that achieves opinion-evidence alignment without relying on explicit instructions. Together, our benchmark and method provide a foundation for understanding the emerging security implications of factual opinion editing in LLMs.
