Fugu-MT Paper Translation (Summary): Behemoth: Benchmarking Unlearning in LLMs Using Fully Synthetic Data

Paper summary: Behemoth: Benchmarking Unlearning in LLMs Using Fully Synthetic Data

arxiv url: http://arxiv.org/abs/2601.23153v1
Date: Fri, 30 Jan 2026 16:39:42 GMT
Status: Translation complete
Last updated in system: 2026-02-02 18:28:15.562955
Title: Behemoth: Benchmarking Unlearning in LLMs Using Fully Synthetic Data
Authors: Eugenia Iofinova, Dan Alistarh
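Abstract summary: We propose Behemoth, a fully synthetic data generation framework for understanding the effects of model editing on large language models. For example, we show that, echoing real-world results, restricting the update rank yields a more effective update in some cases.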
Reference score (independently computed attention measure): 43.026389128544594
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As artificial neural networks, and specifically large language models, have improved rapidly in capabilities and quality, they have increasingly been deployed in real-world applications, from customer service to Google search, despite the fact that they frequently make factually incorrect or undesirable statements. This trend has inspired practical and academic interest in model editing, that is, in adjusting the weights of the model to modify its likely outputs for queries relating to a specific fact or set of facts. This may be done either to amend a fact or set of facts, for instance, to fix a frequent error in the training data, or to suppress a fact or set of facts entirely, for instance, in case of dangerous knowledge. Multiple methods have been proposed to do such edits. However, at the same time, it has been shown that such model editing can be brittle and incomplete. Moreover the effectiveness of any model editing method necessarily depends on the data on which the model is trained, and, therefore, a good understanding of the interaction of the training data distribution and the way it is stored in the network is necessary and helpful to reliably perform model editing. However, working with large language models trained on real-world data does not allow us to understand this relationship or fully measure the effects of model editing. We therefore propose Behemoth, a fully synthetic data generation framework. To demonstrate the practical insights from the framework, we explore model editing in the context of simple tabular data, demonstrating surprising findings that, in some cases, echo real-world results, for instance, that in some cases restricting the update rank results in a more effective update. The code is available at https://github.com/IST-DASLab/behemoth.git.
