Fugu-MT Paper Translation (Abstract): Concept Unlearning in Large Language Models via Self-Constructed Knowledge Triplets

Paper summary: Concept Unlearning in Large Language Models via Self-Constructed Knowledge Triplets

arxiv url: http://arxiv.org/abs/2509.15621v1
Date: Fri, 19 Sep 2025 05:34:45 GMT
Status: Translation complete
Last updated in system: 2025-09-22 18:18:11.01395
Title: Concept Unlearning in Large Language Models via Self-Constructed Knowledge Triplets
Authors: Tomoya Yamashita, Yuuki Yamanaka, Masanori Yamada, Takayuki Miura, Toshiki Shibahara, Tomoharu Iwata
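Abstract summary: This work introduces Concept Unlearning (CU) as a new requirement for unlearning in large language models (LLMs). The authors use knowledge graphs to represent the LLM's internal knowledge and define CU as removing the forgetting target nodes and their associated edges. The method enables more precise and comprehensive concept removal by aligning the unlearning process with the LLM's internal knowledge representations.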
Reference score (independently computed attention metric): 20.968820590988333
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Machine Unlearning (MU) has recently attracted considerable attention as a solution to privacy and copyright issues in large language models (LLMs). Existing MU methods aim to remove specific target sentences from an LLM while minimizing damage to unrelated knowledge. However, these approaches require explicit target sentences and do not support removing broader concepts, such as persons or events. To address this limitation, we introduce Concept Unlearning (CU) as a new requirement for LLM unlearning. We leverage knowledge graphs to represent the LLM's internal knowledge and define CU as removing the forgetting target nodes and associated edges. This graph-based formulation enables a more intuitive unlearning and facilitates the design of more effective methods. We propose a novel method that prompts the LLM to generate knowledge triplets and explanatory sentences about the forgetting target and applies the unlearning process to these representations. Our approach enables more precise and comprehensive concept removal by aligning the unlearning process with the LLM's internal knowledge representations. Experiments on real-world and synthetic datasets demonstrate that our method effectively achieves concept-level unlearning while preserving unrelated knowledge.
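The graph-based definition in the abstract (CU as removing the forgetting target nodes and their associated edges) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Triplet` type, the `split_by_concept` helper, and the toy entities are hypothetical names introduced here, and the sketch only covers the graph bookkeeping, not the unlearning step applied to the LLM.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    """One knowledge-graph edge: (head entity, relation, tail entity)."""
    head: str
    relation: str
    tail: str


def split_by_concept(
    triplets: set[Triplet], target: str
) -> tuple[set[Triplet], set[Triplet]]:
    """Partition edges into those touching the target node and the rest.

    Hypothetical helper: 'forget' holds every edge incident to the target,
    'retain' holds the unrelated knowledge that should be preserved.
    """
    forget = {t for t in triplets if target in (t.head, t.tail)}
    retain = triplets - forget
    return forget, retain


# Toy graph; the entities and relations are made up for illustration.
graph = {
    Triplet("Alice Example", "born_in", "Springfield"),
    Triplet("Alice Example", "works_at", "Acme Corp"),
    Triplet("Bob Sample", "colleague_of", "Alice Example"),
    Triplet("Bob Sample", "born_in", "Springfield"),
}

forget, retain = split_by_concept(graph, "Alice Example")
print(len(forget), len(retain))
```

Under this reading, the edges in `forget` correspond to what concept unlearning should remove, while `retain` is the unrelated knowledge the abstract says should be preserved.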
