Fugu-MT Paper Translation (Abstract): Untying the Reversal Curse via Bidirectional Language Model Editing

Paper overview: Untying the Reversal Curse via Bidirectional Language Model Editing

arxiv url: http://arxiv.org/abs/2310.10322v2
Date: Sat, 12 Oct 2024 03:31:13 GMT
Status: Translation complete
Last updated in system: 2024-12-05 07:15:11.771671
Title: Untying the Reversal Curse via Bidirectional Language Model Editing
Title (reference translation): Resolving the Reversal Curse via Bidirectional Language Model Editing
Authors: Jun-Yu Ma, Jia-Chen Gu, Zhen-Hua Ling, Quan Liu, Cong Liu
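Abstract summary: Large language models (LLMs) store massive factual knowledge in their parameters. LLMs are prone to hallucinating unintended text because of false or outdated knowledge. This work studies bidirectional language model editing and evaluates whether LLMs can recall edited knowledge in both directions.
Reference score (independently computed attention measure): 41.040662400025184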
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent studies have demonstrated that large language models (LLMs) store massive factual knowledge within their parameters. But existing LLMs are prone to hallucinate unintended text due to false or outdated knowledge. Since retraining LLMs is resource intensive, there has been a growing interest in the concept of model editing. Despite the emergence of benchmarks and approaches, these unidirectional editing and evaluation have failed to explore the reversal curse. Intuitively, if "The capital of France is" is edited to be a counterfact "London" within a model, then it should be able to naturally reason and recall the reverse fact, i.e., "London is the capital of" followed by "France" instead of "England". In this paper, we study bidirectional language model editing, aiming to provide rigorous model editing evaluation to assess if edited LLMs can recall the editing knowledge bidirectionally. A new evaluation metric of reversibility is introduced, and a benchmark dubbed as Bidirectional Assessment for Knowledge Editing (BAKE) is constructed to evaluate the reversibility of edited models in recalling knowledge in the reverse direction of editing. We surprisingly observe that while current editing methods and LLMs can effectively recall editing facts in the direction of editing, they suffer serious deficiencies when evaluated in the reverse direction. To mitigate the reversal curse, a method named Bidirectionally Inversible Relationship moDeling (BIRD) is proposed. A set of editing objectives that incorporate bidirectional relationships between subject and object into the updated model weights are designed. Experiments show that BIRD improves the performance of four representative LLMs of different sizes via question answering and judgement.
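Abstract (reference translation): Recent studies have shown that large language models (LLMs) store a vast amount of factual knowledge in their parameters. However, existing LLMs tend to hallucinate unintended text because of false or outdated knowledge. Because retraining LLMs is resource intensive, interest in model editing has grown. Despite the emergence of benchmarks and approaches, existing unidirectional editing and evaluation have not explored the reversal curse. Intuitively, if "The capital of France is" is edited within a model to yield the counterfactual "London", the model should also be able to reason about and recall the reverse fact, that is, "London is the capital of" should be followed by "France" rather than "England". This paper studies bidirectional language model editing, aiming to provide rigorous model editing evaluation that assesses whether edited LLMs can recall the edited knowledge bidirectionally. A new evaluation metric, reversibility, is introduced, and a benchmark called Bidirectional Assessment for Knowledge Editing (BAKE) is constructed to evaluate how well edited models recall knowledge in the direction opposite to the edit. Current editing methods and LLMs can effectively recall edited facts in the direction of editing, but they show serious deficiencies when evaluated in the reverse direction. To mitigate the reversal curse, a method named Bidirectionally Inversible Relationship moDeling (BIRD) is proposed. It designs a set of editing objectives that incorporate the bidirectional relationships between subject and object into the updated model weights. Experiments show that BIRD improves the performance of four representative LLMs of different sizes via question answering and judgement.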
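
The France/London example in the abstract can be made concrete with a toy sketch. The abstract does not define the reversibility metric or how BAKE scores it, so the code below uses exact-match recall as a stand-in. The dictionary-backed "model" and the names make_toy_model and recall_rate are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy "model": a plain lookup from prompt to completion.
# A real LLM would generate text; this stand-in only illustrates the idea.
def make_toy_model(facts: Dict[str, str]) -> Callable[[str], str]:
    def complete(prompt: str) -> str:
        return facts.get(prompt, "")
    return complete

def recall_rate(model: Callable[[str], str],
                probes: List[Tuple[str, str]]) -> float:
    """Fraction of (prompt, expected) probes answered exactly as expected.

    Assumption: exact match stands in for the paper's metric, which the
    abstract does not specify.
    """
    if not probes:
        return 0.0
    hits = sum(1 for prompt, expected in probes if model(prompt) == expected)
    return hits / len(probes)

# Knowledge before editing.
original_facts = {
    "The capital of France is": "Paris",
    "Paris is the capital of": "France",
    "London is the capital of": "England",
}

# Probes in the direction of the edit and in the reverse direction.
forward_probes = [("The capital of France is", "London")]
reverse_probes = [("London is the capital of", "France")]

# Unidirectional edit: only the forward prompt is rewritten.
unidirectional = dict(original_facts)
unidirectional["The capital of France is"] = "London"
uni_model = make_toy_model(unidirectional)
uni_forward = recall_rate(uni_model, forward_probes)
uni_reverse = recall_rate(uni_model, reverse_probes)

# Bidirectional edit: the reverse fact is updated as well.
bidirectional = dict(unidirectional)
bidirectional["London is the capital of"] = "France"
bi_model = make_toy_model(bidirectional)
bi_forward = recall_rate(bi_model, forward_probes)
bi_reverse = recall_rate(bi_model, reverse_probes)

print(f"unidirectional edit: forward={uni_forward}, reverse={uni_reverse}")
print(f"bidirectional edit:  forward={bi_forward}, reverse={bi_reverse}")
```

In this toy setting the unidirectional edit is recalled in the forward direction but fails the reverse probe, which mirrors the gap the abstract describes. The bidirectional edit passes both probes only because the reverse entry was written by hand, whereas BIRD is described as building the bidirectional relationship into the weight update itself.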

Related papers