Related papers: GLFF: Global and Local Feature Fusion for AI-synthesized Image Detection

URL: http://arxiv.org/abs/2211.08615v7
Date: Mon, 4 Sep 2023 22:28:46 GMT
Title: GLFF: Global and Local Feature Fusion for AI-synthesized Image Detection
Authors: Yan Ju, Shan Jia, Jialing Cai, Haiying Guan, Siwei Lyu
Abstract summary: We propose a framework to learn rich and discriminative representations by combining multi-scale global features from the whole image with refined local features from informative patches for AI-synthesized image detection. GLFF fuses information from two branches: the global branch to extract multi-scale semantic features and the local branch to select informative patches for detailed local artifact extraction.
Score: 29.118321046339656
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the rapid development of deep generative models (such as Generative Adversarial Networks and Diffusion models), AI-synthesized images are now of such high quality that humans can hardly distinguish them from pristine ones. Although existing detection methods have shown high performance in specific evaluation settings, e.g., on images from seen models or on images without real-world post-processing, they tend to suffer serious performance degradation in real-world scenarios where testing images can be generated by more powerful generation models or combined with various post-processing operations. To address this issue, we propose a Global and Local Feature Fusion (GLFF) framework to learn rich and discriminative representations by combining multi-scale global features from the whole image with refined local features from informative patches for AI-synthesized image detection. GLFF fuses information from two branches: the global branch to extract multi-scale semantic features and the local branch to select informative patches for detailed local artifact extraction. Due to the lack of a synthesized image dataset simulating real-world applications for evaluation, we further create a challenging fake image dataset, named DeepFakeFaceForensics (DF³), which contains 6 state-of-the-art generation models and a variety of post-processing techniques to approach real-world scenarios. Experimental results demonstrate the superiority of our method over state-of-the-art methods on the proposed DF³ dataset and three other open-source datasets.

Related papers

MLEP: Multi-granularity Local Entropy Patterns for Universal AI-generated Image Detection [44.40575446607237]
There is an urgent need for effective methods to detect AI-generated images (AIGI). We propose Multi-granularity Local Entropy Patterns (MLEP), a set of entropy feature maps computed across shuffled small patches over multiple image scales. MLEP comprehensively captures pixel relationships across dimensions and scales while significantly disrupting image semantics, reducing potential content bias.
arXiv Detail & Related papers (2025-04-18T14:50:23Z)
LEGION: Learning to Ground and Explain for Synthetic Image Detection [49.958951540410816]
We introduce SynthScars, a high-quality and diverse dataset consisting of 12,236 fully synthetic images with human-expert annotations. It features 4 distinct image content types, 3 categories of artifacts, and fine-grained annotations covering pixel-level segmentation, detailed textual explanations, and artifact category labels. We propose LEGION, a multimodal large language model (MLLM)-based image forgery analysis framework that integrates artifact detection, segmentation, and explanation.
arXiv Detail & Related papers (2025-03-19T14:37:21Z)
HFMF: Hierarchical Fusion Meets Multi-Stream Models for Deepfake Detection [4.908389661988192]
HFMF is a comprehensive two-stage deepfake detection framework. It integrates vision Transformers and convolutional nets through a hierarchical feature fusion mechanism. We demonstrate that our architecture achieves superior performance across diverse dataset benchmarks.
arXiv Detail & Related papers (2025-01-10T00:20:29Z)
MFCLIP: Multi-modal Fine-grained CLIP for Generalizable Diffusion Face Forgery Detection [64.29452783056253]
The rapid development of photo-realistic face generation methods has raised significant concerns in society and academia. Although existing approaches mainly capture face forgery patterns using image modality, other modalities like fine-grained noises and texts are not fully explored. We propose a novel multi-modal fine-grained CLIP (MFCLIP) model, which mines comprehensive and fine-grained forgery traces across image-noise modalities.
arXiv Detail & Related papers (2024-09-15T13:08:59Z)
Swin Transformer for Robust Differentiation of Real and Synthetic Images: Intra- and Inter-Dataset Analysis [0.0]
This study proposes a Swin Transformer-based model for accurate differentiation between natural and synthetic images. The model's performance was evaluated through intra-dataset and inter-dataset testing across three distinct datasets.
arXiv Detail & Related papers (2024-09-07T06:43:17Z)
Detecting the Undetectable: Combining Kolmogorov-Arnold Networks and MLP for AI-Generated Image Detection [0.0]
This paper presents a novel detection framework adept at robustly identifying images produced by cutting-edge generative AI models. We propose a classification system that integrates semantic image embeddings with a traditional Multilayer Perceptron (MLP).
arXiv Detail & Related papers (2024-08-18T06:00:36Z)
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities [88.398085358514]
Contrastive Deepfake Embeddings (CoDE) is a novel embedding space specifically designed for deepfake detection. CoDE is trained via contrastive learning by additionally enforcing global-local similarities.
arXiv Detail & Related papers (2024-07-29T18:00:10Z)
GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning [50.7702397913573]
The rapid advancement of photorealistic generators has reached a critical juncture where the discrepancy between authentic and manipulated images is increasingly indistinguishable. Although there have been a number of publicly available face forgery datasets, the forgery faces are mostly generated using GAN-based synthesis technology. We propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection.
arXiv Detail & Related papers (2024-02-03T03:13:50Z)
DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection. It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to Stable Diffusion's denoising network, and a feature-space pre-trained feature extractor. Experiments on the MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization. This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts. Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language. We pioneer a systematic study on the detection of deepfakes generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z)
Fusing Global and Local Features for Generalized AI-Synthesized Image Detection [31.35052580048599]
We design a two-branch model to combine global spatial information from the whole image and local informative features from patches selected by a novel patch selection module. We collect a highly diverse dataset synthesized by 19 models with various objects and resolutions to evaluate our model.
arXiv Detail & Related papers (2022-03-26T01:55:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.