Fugu-MT Paper Translation (Abstract): Transformer-Patcher: One Mistake worth One Neuron

Paper summary: Transformer-Patcher: One Mistake worth One Neuron

arxiv url: http://arxiv.org/abs/2301.09785v1
Date: Tue, 24 Jan 2023 02:12:42 GMT
Status: Translation complete
System update date: 2023-01-25 14:40:11.263396
Title: Transformer-Patcher: One Mistake worth One Neuron
Title (reference translation): Transformer-Patcher: A Mistake Worth One Neuron
Authors: Zeyu Huang, Yikang Shen, Xiaofeng Zhang, Jie Zhou, Wenge Rong, Zhang Xiong
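Abstract summary: In the deployment of AI services, mistakes keep emerging, and the same mistake may recur if it is not corrected in time. Transformer-Patcher is a novel model editor that can change the behavior of Transformer-based models by adding and training a few neurons. The proposed method outperforms previous fine-tuning and HyperNetwork-based methods and achieves state-of-the-art performance for Sequential Model Editing (SME).
Reference score (independently computed attention metric): 40.04159325505842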
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Transformer-based Pretrained Language Models (PLMs) dominate almost all Natural Language Processing (NLP) tasks. Nevertheless, they still make mistakes from time to time. For a model deployed in an industrial environment, fixing these mistakes quickly and robustly is vital to improve user experiences. Previous works formalize such problems as Model Editing (ME) and mostly focus on fixing one mistake. However, the one-mistake-fixing scenario is not an accurate abstraction of the real-world challenge. In the deployment of AI services, there are ever-emerging mistakes, and the same mistake may recur if not corrected in time. Thus a preferable solution is to rectify the mistakes as soon as they appear nonstop. Therefore, we extend the existing ME into Sequential Model Editing (SME) to help develop more practical editing methods. Our study shows that most current ME methods could yield unsatisfying results in this scenario. We then introduce Transformer-Patcher, a novel model editor that can shift the behavior of transformer-based models by simply adding and training a few neurons in the last Feed-Forward Network layer. Experimental results on both classification and generation tasks show that Transformer-Patcher can successively correct up to thousands of errors (Reliability) and generalize to their equivalent inputs (Generality) while retaining the model's accuracy on irrelevant inputs (Locality). Our method outperforms previous fine-tuning and HyperNetwork-based methods and achieves state-of-the-art performance for Sequential Model Editing (SME). The code is available at https://github.com/ZeroYuHuang/Transformer-Patcher.
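Abstract (reference translation): Large Transformer-based pretrained language models (PLMs) dominate almost all natural language processing (NLP) tasks. Even so, they make mistakes from time to time. For a model deployed in an industrial environment, fixing these mistakes quickly and robustly is essential to improving the user experience. Previous work formalizes such problems as Model Editing (ME) and focuses mainly on fixing a single mistake. However, the single-mistake scenario is not an accurate abstraction of the real-world challenge. In the deployment of AI services, mistakes keep emerging, and the same mistake may recur if it is not corrected in time. A preferable solution is therefore to correct mistakes continuously, as soon as they appear. We thus extend existing ME to Sequential Model Editing (SME) to support the development of more practical editing methods. Our study shows that most current ME methods yield unsatisfactory results in this scenario. We then introduce Transformer-Patcher, a new model editor that can shift the behavior of Transformer-based models simply by adding and training a few neurons in the last feed-forward network layer. Experimental results on both classification and generation tasks show that Transformer-Patcher can successively correct up to thousands of errors (Reliability) and generalize to their equivalent inputs (Generality) while retaining the model's accuracy on irrelevant inputs (Locality). The proposed method outperforms previous fine-tuning and HyperNetwork-based methods and achieves state-of-the-art performance for Sequential Model Editing (SME). The code is available at https://github.com/ZeroYuHuang/Transformer-Patcher.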
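
The abstract's core idea, correcting one mistake by appending one neuron to the last feed-forward layer, can be illustrated with a toy sketch. This is not the authors' implementation: the `TinyFFN` class, the ReLU activation, the two-dimensional hidden states, and the hand-set patch weights (the paper trains them) are all assumptions made for illustration. The patch neuron's key is the hidden state of the mistaken input and its bias is negative, so it fires only for inputs close to that hidden state.

```python
# Toy illustration only: hypothetical names, hand-set weights, no training.

def relu(x):
    return x if x > 0.0 else 0.0

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def argmax(xs):
    return max(range(len(xs)), key=lambda i: xs[i])

class TinyFFN:
    """Key-value view of a feed-forward layer:
    out = sum_i relu(key_i . h + bias_i) * value_i
    """

    def __init__(self, keys, biases, values):
        self.keys = [list(k) for k in keys]
        self.biases = list(biases)
        self.values = [list(v) for v in values]

    def forward(self, h):
        out = [0.0] * len(self.values[0])
        for key, bias, value in zip(self.keys, self.biases, self.values):
            activation = relu(dot(key, h) + bias)
            for j, v in enumerate(value):
                out[j] += activation * v
        return out

    def add_patch(self, key, bias, value):
        # Append one neuron; the original weights are left untouched.
        self.keys.append(list(key))
        self.biases.append(bias)
        self.values.append(list(value))

# Original layer: acts as the identity on non-negative hidden states.
ffn = TinyFFN(keys=[[1.0, 0.0], [0.0, 1.0]],
              biases=[0.0, 0.0],
              values=[[1.0, 0.0], [0.0, 1.0]])

h_err = [0.6, 0.8]    # unit-norm hidden state of the mistaken input
h_other = [0.0, 1.0]  # unit-norm hidden state of an unrelated input

before = argmax(ffn.forward(h_err))   # model predicts class 1
other_before = ffn.forward(h_other)

# Assume the correct class for h_err is 0. The patch neuron uses h_err as
# its key and a bias of -0.9, so it activates only when the cosine
# similarity between a unit-norm input and h_err exceeds 0.9.
ffn.add_patch(key=h_err, bias=-0.9, value=[5.0, 0.0])

after = argmax(ffn.forward(h_err))    # prediction shifts to class 0
other_after = ffn.forward(h_other)    # patch stays inactive here
```

In this sketch the corrected input changes class (Reliability) while the unrelated input produces exactly the same output as before (Locality), because the patch neuron's activation is zero for it. Generality would correspond to inputs whose hidden states lie close enough to `h_err` to activate the patch.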
