Fugu-MT Paper Translation (Summary): Data Poisoning Attacks Against Multimodal Encoders

Paper summary: Data Poisoning Attacks Against Multimodal Encoders

arxiv url: http://arxiv.org/abs/2209.15266v2
Date: Mon, 5 Jun 2023 13:52:24 GMT
Status: Translation complete
Last updated in system: 2023-06-07 04:32:52.029662
Title: Data Poisoning Attacks Against Multimodal Encoders
Authors: Ziqing Yang and Xinlei He and Zheng Li and Michael Backes and Mathias Humbert and Pascal Berrang and Yang Zhang
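Abstract summary: We study poisoning attacks against multimodal models in both the visual and linguistic modalities. To mitigate the attacks, we propose both pre-training and post-training defenses.
Reference score (site-computed attention metric): 24.02062380303139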
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recently, newly emerged multimodal models, which leverage both visual and linguistic modalities to train powerful encoders, have gained increasing attention. However, learning from a large-scale unlabeled dataset also exposes the model to the risk of poisoning attacks, whereby the adversary perturbs the model's training data to trigger malicious behaviors in it. In contrast to previous work, which poisons only the visual modality, we take the first step toward studying poisoning attacks against multimodal models in both visual and linguistic modalities. Specifically, we focus on answering two questions: (1) Is the linguistic modality also vulnerable to poisoning attacks? and (2) Which modality is most vulnerable? To answer these two questions, we propose three types of poisoning attacks against multimodal models. Extensive evaluations on different datasets and model architectures show that all three attacks can achieve significant attack performance while maintaining model utility in both visual and linguistic modalities. Furthermore, we observe that the poisoning effect differs between modalities. To mitigate the attacks, we propose both pre-training and post-training defenses. We empirically show that both defenses can significantly reduce the attack performance while preserving the model's utility.

Related papers

DUMB and DUMBer: Is Adversarial Training Worth It in the Real World? [15.469010487781931]
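Adversarial examples are small, often imperceptible perturbations crafted to fool machine learning models. Evasion attacks, a type of adversarial attack in which the input is modified at test time to cause misclassification, are particularly troublesome because of their transferability. This paper introduces DUMBer, an attack framework built on the foundation of the DUMB methodology, to evaluate the resilience of adversarially trained models.
Paper reference translation (metadata) (2025-06-23T11:16:21Z)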
Deferred Poisoning: Making the Model More Vulnerable via Hessian Singularization [39.37308843208039]
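We introduce a more threatening type of poisoning attack, the Deferred Poisoning Attack. Under this new attack, the model functions normally during the training and validation phases but becomes highly sensitive to evasion attacks and natural noise. We provide theoretical and empirical analyses of the proposed method and validate its effectiveness through experiments on image classification tasks.
Paper reference translation (metadata) (2024-11-06T08:27:49Z)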
PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning [32.508939142492004]
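We introduce PoisonBench, a benchmark for evaluating the susceptibility of large language models to data poisoning during preference learning. Data poisoning attacks can manipulate large language model responses to include hidden malicious content or biases. We deploy two distinct attack types across eight realistic scenarios and evaluate 21 widely used models.
Paper reference translation (metadata) (2024-10-11T13:50:50Z)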
Universal Vulnerabilities in Large Language Models: Backdoor Attacks for In-context Learning [14.011140902511135]
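In-context learning, a paradigm that bridges the gap between pre-training and fine-tuning, has shown high efficacy on several NLP tasks. Despite being widely applied, in-context learning is vulnerable to malicious attacks. We design a new backdoor attack method, ICLAttack, that targets large language models based on in-context learning.
Paper reference translation (metadata) (2024-01-11T14:38:19Z)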
SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation [56.622250514119294]
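In contrast to white-box adversarial attacks, transfer attacks better reflect real-world scenarios. This paper proposes a self-augmentation-based transfer attack method called SA-Attack.
Paper reference translation (metadata) (2023-12-08T09:08:50Z)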
Investigating Human-Identifiable Features Hidden in Adversarial Perturbations [54.39726653562144]
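Our study explores up to five attack algorithms across three datasets. We identify human-identifiable features in adversarial perturbations. Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models.
Paper reference translation (metadata) (2023-09-28T22:31:29Z)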
Rethinking Model Ensemble in Transfer-based Adversarial Attacks [46.82830479910875]
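An effective strategy for improving transferability is to attack an ensemble of models. Previous work simply averages the outputs of different models. We propose the Common Weakness Attack (CWA) to generate more transferable adversarial examples.
Paper reference translation (metadata) (2023-03-16T06:37:16Z)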
Can Adversarial Examples Be Parsed to Reveal Victim Model Information? [62.814751479749695]
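This work asks whether data-agnostic victim model (VM) information can be inferred from data-specific adversarial instances. We collect a dataset of adversarial attacks spanning seven attack types generated from 135 victim models. We show that a simple supervised model parsing network (MPN) can infer VM attributes from unseen adversarial attacks.
Paper reference translation (metadata) (2023-03-13T21:21:49Z)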
Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
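Adversarial attacks aim to fool deep neural networks with adversarial examples. This paper proposes a reinforcement-learning-based attack model that learns from attack history and launches attacks more efficiently.
Paper reference translation (metadata) (2020-09-19T09:12:24Z)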
Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer Learning [60.784641458579124]
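We show that model robustness against white-box FGSM attacks can be effectively improved. We also propose a black-box attack method for transfer learning models. To systematically evaluate the effect of both white-box and black-box attacks, we propose a method for measuring transferability from a source model to a target model.
Paper reference translation (metadata) (2020-08-25T15:04:32Z)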

The related papers list is generated automatically from the titles and abstracts of papers on this site.

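This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from its use.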