Fugu-MT paper translation (abstract): Compositional Adversarial Training for Robust Visual Watermarking

Paper abstract: Compositional Adversarial Training for Robust Visual Watermarking

arxiv url: http://arxiv.org/abs/2605.16720v1
Date: Sat, 16 May 2026 00:07:49 GMT
Status: translation complete
Last updated in system: 2026-05-19 17:57:46.988075
Title: Compositional Adversarial Training for Robust Visual Watermarking
Authors: Anirudh Satheesh, Michael-Andrei Panaitescu-Liess, Andrew Xu, Georgios Milis, Heng Huang, Zikui Cai, Furong Huang,
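Abstract summary: We formulate watermark robustness as a min-max problem over a structured space of compositional transformations. We propose Compositional Adversarial Training (CAT), a plug-in framework that learns a sequential differentiable adversary. CAT consistently outperforms random-augmentation baselines trained with the same augmentation budget.
Reference score (independently computed attention metric): 74.59088755185307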
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Robust watermarking is typically trained with random post-processing augmentation, but random sampling under-covers the combinatorial space of realistic attack pipelines and rarely encounters the rare compositions that actually break detection. This leads to unstable training and poor sample efficiency. We instead formulate watermark robustness as a min-max problem over a structured space of compositional transformations. We propose Compositional Adversarial Training (CAT), a plug-in framework that learns a sequential differentiable adversary that observes the current watermarked image and selects an attack family at each step to maximally disrupt message recovery. CAT combines straight-through Gumbel-Softmax attack selection with entropy regularization, which makes the backward pass end-to-end differentiable and lets it aggregate gradient information across attack families, yielding faster, smoother convergence without collapsing to a single attack mode. We evaluate CAT on the post-generation watermarks VideoSeal 0.0, VideoSeal 1.0, and PixelSeal and on the in-generation watermark WMAR, under both single-step and two-step attack suites, on in-distribution and multiple out-of-distribution (OOD) image and video benchmarks. CAT consistently outperforms random-augmentation baselines trained with the same augmentation budget, with the largest gains on hard composed attacks and OOD evaluations, improving overall watermark capacity by up to $63.5\%$ in the single-step attack setting and $13.0\%$ in the compositional setting. In the autoregressive setting, CAT improves the TPR@FPR$=1\%$ by $12\%$ on average on difficult geometric transformations. These results show that robust visual watermarking benefits from training against adaptive compositional adversaries rather than independent random corruptions.
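
The attack-selection mechanism named in the abstract (straight-through Gumbel-Softmax with entropy regularization) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a fixed, hypothetical decoder loss per attack family (`attack_losses`) instead of a learned, image-conditioned adversary and a real watermark decoder, and all function names, step sizes, and the weight `lam` are illustrative. It shows only the adversary side of the min-max problem: the forward pass picks one attack as a hard one-hot choice, while the backward pass differentiates through the soft relaxation, so every attack family contributes to the gradient.

```python
import math
import random

def softmax(logits, tau=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / tau for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def st_gumbel_softmax(logits, tau, rng):
    """Forward pass: perturb logits with Gumbel(0, 1) noise, then return the
    hard one-hot choice together with the soft relaxation used for gradients."""
    noisy = []
    for z in logits:
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append(z - math.log(-math.log(u)))  # z + Gumbel(0, 1) sample
    soft = softmax(noisy, tau)
    k = max(range(len(soft)), key=lambda i: soft[i])
    hard = [1.0 if i == k else 0.0 for i in range(len(soft))]
    return hard, soft

def st_logit_grad(soft, attack_losses, tau):
    """Straight-through backward pass for loss = sum_k y_k * attack_losses[k]:
    differentiate as if y were `soft`, so every attack family contributes."""
    mean_loss = sum(p * l for p, l in zip(soft, attack_losses))
    return [p * (l - mean_loss) / tau for p, l in zip(soft, attack_losses)]

def entropy_logit_grad(probs):
    """Gradient of the entropy of softmax(logits) with respect to the logits."""
    h = entropy(probs)
    return [-p * (math.log(max(p, 1e-12)) + h) for p in probs]

def train_adversary(attack_losses, steps=200, lr=0.5, tau=1.0, lam=0.01, seed=0):
    """Gradient ascent on (decoder loss + lam * entropy) over attack logits.
    Assumption: losses are fixed; in a real setup they change as the
    watermark model is trained against the adversary."""
    rng = random.Random(seed)
    logits = [0.0] * len(attack_losses)
    for _ in range(steps):
        _, soft = st_gumbel_softmax(logits, tau, rng)
        g_loss = st_logit_grad(soft, attack_losses, tau)
        g_ent = entropy_logit_grad(softmax(logits))
        logits = [z + lr * (gl + lam * ge)
                  for z, gl, ge in zip(logits, g_loss, g_ent)]
    return softmax(logits)

if __name__ == "__main__":
    # Hypothetical decoder losses for three attack families (e.g. blur, crop, JPEG).
    probs = train_adversary([0.2, 0.9, 0.5])
    print([round(p, 3) for p in probs], "entropy:", round(entropy(probs), 3))
```

In this sketch the adversary shifts probability toward the attack with the highest decoder loss, while the entropy term pulls its distribution back toward uniform, which is the role the abstract assigns to entropy regularization (avoiding collapse to a single attack mode). A full implementation would presumably apply the same selection at each step of a multi-step attack pipeline and alternate these updates with watermark-model updates.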