Fugu-MT paper translation (abstract): MC-LCR: Multi-modal contrastive classification by locally correlated representations for effective face forgery detection

Paper overview: MC-LCR: Multi-modal contrastive classification by locally correlated representations for effective face forgery detection

arxiv url: http://arxiv.org/abs/2110.03290v1
Date: Thu, 7 Oct 2021 09:24:12 GMT
Status: Translation complete
Last updated in system: 2021-10-08 15:42:22.762646
Title: MC-LCR: Multi-modal contrastive classification by locally correlated representations for effective face forgery detection
Title (reference translation): MC-LCR: Multi-modal contrastive classification by locally correlated representations for face forgery detection
Authors: Gaojian Wang, Qian Jiang, Xin Jin, Wei Li and Xiaohui Cui
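Abstract summary: We propose a multi-modal contrastive classification method using locally correlated representations. Our MC-LCR aims to amplify implicit local discrepancies between authentic and forged faces from both the spatial and frequency domains. We achieve state-of-the-art performance and demonstrate the robustness and generalization of our method.
Reference score (independently computed attention score): 11.124150983521158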
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As the remarkable development of facial manipulation technologies is accompanied by severe security concerns, face forgery detection has become a recent research hotspot. Most existing detection methods train a binary classifier under global supervision to judge real or fake. However, advanced manipulations only perform small-scale tampering, posing challenges to comprehensively capture subtle and local forgery artifacts, especially in high compression settings and cross-dataset scenarios. To address such limitations, we propose a novel framework named Multi-modal Contrastive Classification by Locally Correlated Representations (MC-LCR), for effective face forgery detection. Instead of specific appearance features, our MC-LCR aims to amplify implicit local discrepancies between authentic and forged faces from both spatial and frequency domains. Specifically, we design the shallow style representation block that measures the pairwise correlation of shallow feature maps, which encodes local style information to extract more discriminative features in the spatial domain. Moreover, we make a key observation that subtle forgery artifacts can be further exposed in the patch-wise phase and amplitude spectrum and exhibit different clues. According to the complementarity of amplitude and phase information, we develop a patch-wise amplitude and phase dual attention module to capture locally correlated inconsistencies with each other in the frequency domain. Besides the above two modules, we further introduce the collaboration of supervised contrastive loss with cross-entropy loss. It helps the network learn more discriminative and generalized representations. Through extensive experiments and comprehensive studies, we achieve state-of-the-art performance and demonstrate the robustness and generalization of our method.
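Abstract (reference translation): As the development of facial manipulation technologies brings serious security concerns, face forgery detection has recently become a research hotspot. Most existing detection methods train a binary classifier under global supervision to decide whether a face is real or fake. However, advanced manipulations perform only small-scale tampering, which makes it difficult to comprehensively capture subtle, local forgery artifacts, especially under high compression and in cross-dataset scenarios. To address these limitations, we propose Multi-modal Contrastive Classification by Locally Correlated Representations (MC-LCR) for face forgery detection. Rather than relying on specific appearance features, MC-LCR aims to amplify implicit local discrepancies between authentic and forged faces in both the spatial and frequency domains. Specifically, we design a shallow style representation block that measures the pairwise correlation of shallow feature maps, encoding local style information to extract more discriminative features in the spatial domain. We also observe that subtle forgery artifacts are further exposed in the patch-wise phase and amplitude spectra, where they show different clues. Based on the complementarity of amplitude and phase information, we develop a patch-wise amplitude and phase dual attention module that captures locally correlated inconsistencies in the frequency domain. In addition to these two modules, we combine a supervised contrastive loss with the cross-entropy loss, which helps the network learn more discriminative and generalized representations. Through extensive experiments and comprehensive studies, we achieve state-of-the-art performance and demonstrate the robustness and generalization of our method.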
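
The abstract gives no implementation details, so the following is only a minimal NumPy sketch of two building blocks it names: a pairwise correlation of shallow feature maps, and patch-wise amplitude and phase spectra. The function names `gram_matrix` and `patchwise_spectra`, the Gram-matrix reading of "pairwise correlation", the normalization, and the non-overlapping patch layout are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def gram_matrix(features):
    """Channel-by-channel correlation of a (C, H, W) feature map.

    Assumption: "pairwise correlation" is read here as a Gram matrix
    normalized by the number of spatial positions.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)


def patchwise_spectra(image, patch):
    """Amplitude and phase of the 2-D FFT of each non-overlapping patch.

    `image` is a single-channel (H, W) array; `patch` (hypothetical
    parameter) must divide both H and W.
    Returns two arrays of shape (H // patch, W // patch, patch, patch).
    """
    h, w = image.shape
    if h % patch or w % patch:
        raise ValueError("patch size must divide both image dimensions")
    # Rearrange into a grid of (patch, patch) tiles.
    tiles = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    spectrum = np.fft.fft2(tiles, axes=(-2, -1))
    return np.abs(spectrum), np.angle(spectrum)


# Usage on random data standing in for real features and a face crop.
rng = np.random.default_rng(0)
style = gram_matrix(rng.standard_normal((3, 4, 4)))
amplitude, phase = patchwise_spectra(rng.standard_normal((8, 8)), patch=4)
print(style.shape, amplitude.shape, phase.shape)
```

In this sketch the amplitude and phase arrays keep one spectrum per patch, so a downstream module could weigh the two views separately per patch; how the paper's dual attention module does this is not specified in the abstract.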

Related papers