Fugu-MT Paper Translation (Abstract): Benchmarking Knowledge Editing using Logical Rules

Paper abstract: Benchmarking Knowledge Editing using Logical Rules

arxiv url: http://arxiv.org/abs/2606.10554v1
Date: Tue, 09 Jun 2026 08:21:56 GMT
Status: Translation complete
Last updated in system: 2026-06-10 15:40:58.388573
Title: Benchmarking Knowledge Editing using Logical Rules
Authors: Tatiana Moteu Ngoli, NDah Jean Kouagou, Hamada M. Zahera, Axel-Cyrille Ngonga Ngomo
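Abstract summary: This paper proposes a new benchmark for evaluating how knowledge editing methods handle the logical consequences of a single fact edit. The findings indicate that existing knowledge editing methods can accurately insert direct assertions into LLMs but frequently fail to inject entailed knowledge. This highlights the importance of semantics-aware evaluation frameworks in knowledge editing.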
Reference score (attention score computed independently by the site): 6.42566059684319
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) are increasingly deployed in real-world applications that require access to up-to-date knowledge. However, retraining LLMs is computationally expensive. Therefore, knowledge editing techniques are crucial for maintaining current information and correcting erroneous assertions within pre-trained models. Current benchmarks for knowledge editing primarily focus on recalling edited facts, often neglecting their logical consequences. To address this limitation, we introduce a new benchmark designed to evaluate how knowledge editing methods handle the logical consequences of a single fact edit. Our benchmark extracts relevant logical rules from a knowledge graph for a given edit. Then, it generates multi-hop questions based on these rules to assess the impact on logical consequences. Our findings indicate that while existing knowledge editing approaches can accurately insert direct assertions into LLMs, they frequently fail to inject entailed knowledge. Specifically, experiments with popular methods like ROME and FT reveal a substantial performance gap, up to 24%, between evaluations on directly edited knowledge and on entailed knowledge. This highlights the critical need for semantics-aware evaluation frameworks in knowledge editing.
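The pipeline described in the abstract (take a single fact edit, find logical rules in a knowledge graph that apply to it, then ask a multi-hop question about the entailed fact) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy triples, the `CompositionRule` class, the relation names, and the question template are all hypothetical, and the rule is supplied by hand rather than extracted from a knowledge graph as in the paper.

```python
from dataclasses import dataclass

# A fact is a (subject, relation, object) triple. All data below is made up.
Triple = tuple[str, str, str]


@dataclass(frozen=True)
class CompositionRule:
    """Hypothetical Horn rule: body_first(x, y) AND body_second(y, z) -> head(x, z)."""
    body_first: str
    body_second: str
    head: str
    question_template: str  # multi-hop question with a {subject} placeholder


def apply_edit(graph: set[Triple], edit: Triple) -> set[Triple]:
    """Return a new graph where the edit replaces any (subject, relation, *) triple."""
    subject, relation, _ = edit
    kept = {t for t in graph if not (t[0] == subject and t[1] == relation)}
    return kept | {edit}


def entailed_facts(graph: set[Triple], edit: Triple,
                   rules: list[CompositionRule]) -> list[Triple]:
    """Derive facts that follow from the edit via rules whose first body atom matches it."""
    subject, relation, new_object = edit
    derived = []
    for rule in rules:
        if rule.body_first != relation:
            continue  # this rule is not triggered by the edited relation
        for s, r, o in graph:
            if s == new_object and r == rule.body_second:
                derived.append((subject, rule.head, o))
    return sorted(derived)


# Toy knowledge graph (assumed for illustration only).
graph = {
    ("Alice", "works_for", "AcmeCorp"),
    ("AcmeCorp", "headquartered_in", "Berlin"),
    ("GlobexInc", "headquartered_in", "Paris"),
}

rule = CompositionRule(
    body_first="works_for",
    body_second="headquartered_in",
    head="works_in_city",
    question_template="In which city is the employer of {subject} headquartered?",
)

# Single fact edit: Alice now works for GlobexInc.
edit = ("Alice", "works_for", "GlobexInc")
edited_graph = apply_edit(graph, edit)

# The direct question checks the edit itself; the multi-hop question
# checks whether the logical consequence was updated as well.
entailed = entailed_facts(edited_graph, edit, [rule])
question = rule.question_template.format(subject=edit[0])

print(question)
print(entailed)
```

An edited model that answers the direct question ("Who does Alice work for?") correctly but still answers the multi-hop question with the old city would show the kind of gap between directly edited and entailed knowledge that the abstract reports.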
