Fugu-MT Paper Translation (Abstract): Repurposing and Evaluating the (In)Feasibility of Dataset Poisoning enabled Watermarking for Contrastive Learning

Paper overview: Repurposing and Evaluating the (In)Feasibility of Dataset Poisoning enabled Watermarking for Contrastive Learning

arxiv url: http://arxiv.org/abs/2605.01834v1
Date: Sun, 03 May 2026 12:01:26 GMT
Status: Translation complete
Last updated in system: 2026-05-05 20:33:49.958256
Title: Repurposing and Evaluating the (In)Feasibility of Dataset Poisoning enabled Watermarking for Contrastive Learning
Title (reference translation): Repurposing and Evaluating the (In)Feasibility of Dataset-Poisoning-Enabled Watermarking for Contrastive Learning
Authors: Zhiyang Dai, Yansong Gao, Boyu Kuang, Haodong Li, Qi Chang, Gaurav Varshney, Derek Abbott, Anmin Fu
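Abstract summary: Contrastive learning (CL) reduces annotation cost via automatically derived supervisory signals. Recent studies show that CL models are vulnerable to data-poisoning backdoor attacks, but their generalization and robustness are underexplored. We evaluate existing data-poisoning backdoor attacks on CL and reveal poor dataset adaptability, low success rates, limited portability, and restrictive assumptions.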
Reference score (independently computed attention score): 14.983235053435138
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Contrastive learning (CL) reduces annotation cost via auto-derived supervisory signals. Since large-scale in-house CL datasets are infeasible, reliance on third-party or internet data is common. Recent studies show CL models are vulnerable to data-poisoning backdoor attacks, but their generalization and robustness are underexplored. We systematically evaluate existing data-poisoning backdoor attacks on CL, revealing limitations: poor dataset adaptability, low success rates, limited portability, and restrictive assumptions (e.g., downstream task knowledge). Interestingly, trigger samples exhibit distinguishable statistical divergence from clean samples, which inspires repurposing it as a watermark for dataset IP protection. Direct repurposing is challenging due to low success rates; we overcome this by statistical verification using a unified density metric. We further propose a multi-level watermarking scheme adapting to feature-level, soft-label, or hard-label outputs in CL. Experiments show some backdoor attacks can be repurposed as effective watermarks with trade-offs among fidelity, verifiability, and robustness. This work demonstrates weak backdoor effects become reliable signals for dataset IP protection in challenging CL settings.
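Abstract (reference translation): Contrastive learning (CL) reduces annotation cost through automatically derived supervisory signals. Because building large-scale in-house CL datasets is infeasible, reliance on third-party or internet data is common. Recent studies show that CL models are vulnerable to data-poisoning backdoor attacks, but their generalization and robustness are underexplored. We systematically evaluate existing data-poisoning backdoor attacks on CL and reveal limitations: poor dataset adaptability, low success rates, limited portability, and restrictive assumptions (such as knowledge of the downstream task). Interestingly, trigger samples show a statistical divergence that distinguishes them from clean samples, which motivates repurposing this effect as a watermark for dataset IP protection. Direct repurposing is difficult because of the low success rates; we overcome this with statistical verification based on a unified density metric. We further propose a multi-level watermarking scheme that adapts to feature-level, soft-label, or hard-label outputs in CL. Experiments show that some backdoor attacks can be repurposed as effective watermarks, with trade-offs among fidelity, verifiability, and robustness. This work demonstrates that weak backdoor effects can become reliable signals for dataset IP protection in challenging CL settings.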

Related papers