Fugu-MT Paper Translation (Abstract): The Courtroom Trial of Pixels: Robust Image Manipulation Localization via Adversarial Evidence and Reinforcement Learning Judgment

Paper overview: The Courtroom Trial of Pixels: Robust Image Manipulation Localization via Adversarial Evidence and Reinforcement Learning Judgment

arxiv url: http://arxiv.org/abs/2604.14703v1
Date: Thu, 16 Apr 2026 07:09:59 GMT
Status: Translation complete
Last updated in system: 2026-04-17 21:29:31.770968
Title: The Courtroom Trial of Pixels: Robust Image Manipulation Localization via Adversarial Evidence and Reinforcement Learning Judgment
Title (reference translation): The Courtroom Trial of Pixels: Robust Image Manipulation Localization via Adversarial Evidence and Reinforcement Learning Judgment
Authors: Songlin Li, Zhiqing Guo, Dan Ma, Changtao Miao, Gaobo Yang
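Abstract summary: We propose a courtroom-style IML framework that treats the IML task as a confrontation of evidence. We develop a reinforcement learning model that performs strategic re-inference and refinement on uncertain regions. Experimental results show that the model achieves superior average performance compared with SOTA IML methods.
Reference score (attention score computed independently by this site): 15.520850734569564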
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Although some existing image manipulation localization (IML) methods incorporate authenticity-related supervision, this information is typically utilized merely as an auxiliary training signal to enhance the model's sensitivity to manipulation artifacts, rather than being explicitly modeled as localization evidence opposing the manipulated regions. Consequently, when manipulation traces are subtle or degraded by post-processing and noise, these methods struggle to explicitly compare manipulated and authentic evidence, resulting in unreliable predictions in ambiguous areas. To address these issues, we propose a courtroom-style adjudication framework that regards IML task as the confrontation of evidence followed by judgment. The framework comprises a prosecution stream, a defense stream, and a judge model. We first build a dual-hypothesis segmentation architecture on a shared multi-scale encoder, in which the prosecution stream asserts manipulation and the defense stream asserts authenticity. Guided by edge priors, it produces evidence for manipulated and authentic regions through cascaded multi-level fusion, bidirectional disagreement suppression, and dynamic debate refinement. We further develop a reinforcement learning judge model that performs strategic re-inference and refinement on uncertain regions, yielding a manipulated-region mask. The judge model is trained with advantage-based rewards and a soft-IoU objective, and reliability is calibrated via entropy and cross-hypothesis consistency. Experimental results show that our model achieves superior average performance compared with SOTA IML methods.
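Abstract (reference translation): Although existing image manipulation localization (IML) methods incorporate authenticity-related supervision, this information is generally used only as an auxiliary training signal rather than being explicitly modeled as localization evidence that opposes the manipulated regions. As a result, when manipulation traces are subtle or degraded by post-processing and noise, these methods struggle to explicitly compare manipulated and authentic evidence, which leads to unreliable predictions in ambiguous areas. To address these issues, we propose a courtroom-style adjudication framework that regards the IML task as a confrontation of evidence followed by judgment. The framework comprises a prosecution stream, a defense stream, and a judge model. We first build a dual-hypothesis segmentation architecture on a shared multi-scale encoder, in which the prosecution stream asserts manipulation and the defense stream asserts authenticity. Guided by edge priors, it produces evidence for manipulated and authentic regions through cascaded multi-level fusion, bidirectional disagreement suppression, and dynamic debate refinement. We further build a reinforcement learning judge model that performs strategic re-inference and refinement on uncertain regions and yields a manipulated-region mask. The judge model is trained with advantage-based rewards and a soft-IoU objective, and reliability is calibrated via entropy and cross-hypothesis consistency. Experimental results show that the model achieves superior average performance compared with SOTA IML methods.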
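
The abstract names a soft-IoU objective and a reliability calibration based on entropy and cross-hypothesis consistency, but gives no formulas. The sketch below shows one common way such quantities are defined, for illustration only. It assumes per-pixel probabilities from the two streams: `p_manip` from the prosecution stream and `p_auth` from the defense stream. The function names, the equal weighting of the two terms, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
import math

def soft_iou(pred, target, eps=1e-6):
    """Soft IoU between predicted probabilities and binary labels (flat lists).

    A soft-IoU loss is commonly taken as 1 - soft_iou(pred, target).
    """
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(p + t - p * t for p, t in zip(pred, target))
    return (inter + eps) / (union + eps)

def binary_entropy(p, eps=1e-12):
    """Entropy of a Bernoulli(p) variable in bits, so the value lies in [0, 1]."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def uncertainty(p_manip, p_auth):
    """Hypothetical per-pixel uncertainty score (an assumption, not the paper's).

    Averages two terms: the entropy of the fused manipulation probability,
    and the cross-hypothesis inconsistency |p_manip + p_auth - 1|, which is
    zero when the two streams give complementary answers.
    """
    fused = 0.5 * (p_manip + (1.0 - p_auth))
    inconsistency = abs(p_manip + p_auth - 1.0)
    return 0.5 * (binary_entropy(fused) + inconsistency)

def uncertain_pixels(prosecution, defense, threshold=0.5):
    """Indices of pixels whose uncertainty exceeds the (assumed) threshold."""
    return [
        i
        for i, (pm, pa) in enumerate(zip(prosecution, defense))
        if uncertainty(pm, pa) > threshold
    ]
```

In this sketch, a pixel on which both streams are confident in contradictory claims (for example `p_manip = 0.9` and `p_auth = 0.9`) scores high and would be a candidate for the judge's re-inference, while a pixel with complementary, confident predictions scores low.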

Related papers