Fugu-MT Paper Translation (Summary): Generalizable Face Forgery Detection via Separable Prompt Learning

Paper summary: Generalizable Face Forgery Detection via Separable Prompt Learning

arxiv url: http://arxiv.org/abs/2604.17307v1
Date: Sun, 19 Apr 2026 07:51:34 GMT
Status: Translation complete
Last updated in system: 2026-04-21 21:52:52.453488
Title: Generalizable Face Forgery Detection via Separable Prompt Learning
Title (reference translation): Generalizable Face Forgery Detection via Separable Prompt Learning
Authors: Enrui Yang, Yuezun Li
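Abstract summary: The authors propose SePL (Separable Prompt Learning), a strategy that enables CLIP to serve as an effective face forgery detector. To disentangle forgery-specific from forgery-irrelevant information, they describe a cross-modality alignment strategy and a set of dedicated objectives. Under both cross-dataset and cross-method evaluation, the method achieves competitive and even superior performance compared to other methods.
Reference score (independently computed attention score): 9.467877750964588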
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Detecting face forgeries using CLIP has recently emerged as a promising and increasingly popular research direction. Because CLIP acquires rich visual knowledge through large-scale pretraining, most existing methods rely on its visual encoder while paying limited attention to the text modality. Given the instructive nature of the text modality, we posit that, with meticulous design, it can be leveraged to instruct Deepfake detection. Accordingly, we shift the focus from the visual modality to the text modality and propose a new Separable Prompt Learning strategy (SePL) that enables CLIP to serve as an effective face forgery detector. The core idea of SePL is to disentangle forgery-specific and forgery-irrelevant information in images via two types of prompt learning, with the former enhancing detection. To achieve this disentanglement, we describe a cross-modality alignment strategy and a set of dedicated objectives. Extensive experiments demonstrate that, with this simple adaptation, our method achieves competitive and even superior performance compared to other methods under both cross-dataset and cross-method evaluation, highlighting its strong generalizability. The code has been released at https://github.com/OUC-YER/SePL-DeepfakeDetection
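Abstract (reference translation): Detecting face forgeries with CLIP has recently emerged as a promising and popular research direction. Because of the rich visual knowledge gained through large-scale pretraining, most existing methods typically rely on CLIP's visual encoder and pay limited attention to the text modality. Considering the instructive nature of the text modality, we hypothesize that, with careful design, it can guide Deepfake detection. We therefore shift the focus from the visual modality to the text modality and propose a new Separable Prompt Learning strategy (SePL) that allows CLIP to function as an effective face forgery detector. The core idea of SePL is to disentangle forgery-specific information from forgery-irrelevant information in images through two types of prompt learning, with the former strengthening detection. To achieve this disentanglement, we describe a cross-modality alignment strategy and a set of dedicated objectives. Extensive experiments show that the method achieves competitive and even superior performance under cross-dataset and cross-method evaluation, underscoring its strong generalizability. The code is released at https://github.com/OUC-YER/SePL-DeepfakeDetection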
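
Illustrative sketch (not from the paper): the abstract says that text prompts are aligned with image features so that CLIP can act as a forgery detector, and that forgery-specific and forgery-irrelevant information are kept apart. The snippet below is a minimal, stdlib-only illustration of the general CLIP-style mechanism, assuming that features are plain lists of floats. The function names, the temperature value, and the squared-cosine separation penalty are all hypothetical choices made for illustration; the paper's actual alignment strategy and objectives are not specified in the abstract and may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def forgery_probability(image_feat, real_prompt, fake_prompt, temperature=0.07):
    """CLIP-style scoring: softmax over image-to-prompt cosine similarities.

    `real_prompt` and `fake_prompt` stand in for text-encoder outputs of
    learned prompts (hypothetical); the temperature is an assumed value.
    """
    logits = [
        cosine(image_feat, real_prompt) / temperature,
        cosine(image_feat, fake_prompt) / temperature,
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return exps[1] / sum(exps)  # probability assigned to the "fake" prompt


def separation_penalty(specific_prompt, irrelevant_prompt):
    """Hypothetical objective: squared cosine similarity, zero when the
    forgery-specific and forgery-irrelevant embeddings are orthogonal."""
    return cosine(specific_prompt, irrelevant_prompt) ** 2


# Toy usage with hand-made 3-dimensional features (illustrative only).
image_feat = [0.9, 0.1, 0.0]
p_fake = forgery_probability(image_feat, real_prompt=[0.0, 1.0, 0.0],
                             fake_prompt=[1.0, 0.0, 0.0])
print(f"p(fake) = {p_fake:.4f}")
```

In a real setup the prompt embeddings would be learned parameters passed through CLIP's text encoder, and a penalty of this kind would be one term among several in the training loss; the released repository is the authoritative reference for the actual method.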

List of related papers