On the Holistic Approach for Detecting Human Image Forgery
- URL: http://arxiv.org/abs/2601.04715v1
- Date: Thu, 08 Jan 2026 08:33:22 GMT
- Title: On the Holistic Approach for Detecting Human Image Forgery
- Authors: Xiao Guo, Jie Zhu, Anil Jain, Xiaoming Liu
- Abstract summary: We introduce HuForDet, a holistic framework for human image forgery detection. A contextualized forgery detection branch leverages a Multi-Modal Large Language Model (MLLM) to analyze full-body semantic consistency. Our HuForDet achieves state-of-the-art forgery detection performance and superior robustness across diverse human image forgeries.
- Score: 20.765860380888057
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid advancement of AI-generated content (AIGC) has escalated the threat of deepfakes, from facial manipulations to the synthesis of entire photorealistic human bodies. However, existing detection methods remain fragmented, specializing either in facial-region forgeries or full-body synthetic images, and consequently fail to generalize across the full spectrum of human image manipulations. We introduce HuForDet, a holistic framework for human image forgery detection, which features a dual-branch architecture comprising: (1) a face forgery detection branch that employs heterogeneous experts operating in both RGB and frequency domains, including an adaptive Laplacian-of-Gaussian (LoG) module designed to capture artifacts ranging from fine-grained blending boundaries to coarse-scale texture irregularities; and (2) a contextualized forgery detection branch that leverages a Multi-Modal Large Language Model (MLLM) to analyze full-body semantic consistency, enhanced with a confidence estimation mechanism that dynamically weights its contribution during feature fusion. We curate a human image forgery (HuFor) dataset that unifies existing face forgery data with a new corpus of full-body synthetic humans. Extensive experiments show that our HuForDet achieves state-of-the-art forgery detection performance and superior robustness across diverse human image forgeries.
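The dual-branch design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a fixed 5x5 discrete Laplacian-of-Gaussian kernel stands in for the paper's adaptive LoG module, and a plain concatenation stands in for the learned feature fusion, with the MLLM branch's features simply scaled by its estimated confidence. All function names and the fixed kernel are illustrative assumptions.

```python
import numpy as np

# Classic 5x5 discrete Laplacian-of-Gaussian kernel (sigma ~ 1.0).
# The paper's adaptive LoG module would tune the scale; here it is fixed.
LOG_5X5 = np.array([
    [ 0,  0, -1,  0,  0],
    [ 0, -1, -2, -1,  0],
    [-1, -2, 16, -2, -1],
    [ 0, -1, -2, -1,  0],
    [ 0,  0, -1,  0,  0],
], dtype=np.float64)

def log_response(gray):
    """Valid-mode 2-D convolution of a grayscale image with the LoG kernel.

    High responses mark edge-like artifacts such as blending boundaries.
    """
    h, w = gray.shape
    out = np.zeros((h - 4, w - 4))
    for i in range(h - 4):
        for j in range(w - 4):
            out[i, j] = np.sum(gray[i:i + 5, j:j + 5] * LOG_5X5)
    return out

def confidence_weighted_fusion(f_face, f_context, confidence):
    """Concatenate branch features, scaling the MLLM context branch
    by its estimated confidence in [0, 1]."""
    return np.concatenate([np.asarray(f_face),
                           confidence * np.asarray(f_context)])
```

Because the kernel's coefficients sum to zero, a constant (artifact-free) region produces a zero response, which is the property that makes LoG maps useful for exposing local blending irregularities.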
Related papers
- InpaintHuman: Reconstructing Occluded Humans with Multi-Scale UV Mapping and Identity-Preserving Diffusion Inpainting [64.42884719282323]
InpaintHuman is a novel method for generating high-fidelity, complete, and animatable avatars from occluded monocular videos. Our approach employs direct pixel-level supervision to ensure identity fidelity.
arXiv Detail & Related papers (2026-01-05T13:26:02Z)
- OmniDFA: A Unified Framework for Open Set Synthesis Image Detection and Few-Shot Attribution [21.62979058692505]
OmniDFA is a novel framework for AI-generated image (AIGI) detection that assesses the authenticity of images and determines their origins in a few-shot manner. We construct OmniFake, a large class-aware synthetic image dataset that curates 1.17M images from 45 distinct generative models. Experiments demonstrate that OmniDFA exhibits excellent capability in open-set attribution and achieves state-of-the-art generalization performance on AIGI detection.
arXiv Detail & Related papers (2025-09-30T02:36:40Z)
- Bi-Level Optimization for Self-Supervised AI-Generated Face Detection [56.57881725223548]
We introduce a self-supervised method for training AI-generated face detectors based on bi-level optimization. Our detectors significantly outperform existing approaches in both one-class and binary classification settings.
arXiv Detail & Related papers (2025-07-30T16:38:29Z)
- MLEP: Multi-granularity Local Entropy Patterns for Universal AI-generated Image Detection [44.40575446607237]
There is an urgent need for effective methods to detect AI-generated images (AIGI). We propose Multi-granularity Local Entropy Patterns (MLEP), a set of entropy feature maps computed across shuffled small patches over multiple image scales. MLEP comprehensively captures pixel relationships across dimensions and scales while significantly disrupting image semantics, reducing potential content bias.
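The MLEP idea summarized above can be sketched roughly as follows. This is an illustration of the described technique, not the authors' code: patch sizes, histogram bin count, and the shuffling scheme are all arbitrary assumptions chosen for brevity.

```python
import numpy as np

def local_entropy_map(gray, patch=8, bins=16, seed=0):
    """Shannon entropy of each non-overlapping patch, with patch order
    shuffled to disrupt image semantics, as MLEP's description suggests."""
    rng = np.random.default_rng(seed)
    h, w = gray.shape
    gh, gw = h // patch, w // patch
    patches = [gray[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
               for i in range(gh) for j in range(gw)]
    rng.shuffle(patches)  # Generator.shuffle accepts a mutable sequence
    ents = []
    for p in patches:
        hist, _ = np.histogram(p, bins=bins, range=(0, 256))
        prob = hist / hist.sum()
        prob = prob[prob > 0]  # drop empty bins so log2 is defined
        ents.append(-np.sum(prob * np.log2(prob)))
    return np.array(ents).reshape(gh, gw)

def mlep_features(gray, patch_sizes=(4, 8, 16)):
    """Multi-granularity: concatenate entropy maps at several patch sizes."""
    return np.concatenate([local_entropy_map(gray, p).ravel()
                           for p in patch_sizes])
```

A flat region yields zero entropy in every patch, while richly textured or noisy regions yield high entropy, so the feature vector encodes pixel-statistics structure while the shuffling removes most semantic content.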
arXiv Detail & Related papers (2025-04-18T14:50:23Z)
- Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing enabled by generative models pose serious risks. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise-type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z) - MFCLIP: Multi-modal Fine-grained CLIP for Generalizable Diffusion Face Forgery Detection [64.29452783056253]
The rapid development of photo-realistic face generation methods has raised significant concerns in society and academia. Although existing approaches mainly capture face forgery patterns using the image modality, other modalities such as fine-grained noise and text are not fully explored. We propose a novel multi-modal fine-grained CLIP (MFCLIP) model, which mines comprehensive and fine-grained forgery traces across image-noise modalities.
arXiv Detail & Related papers (2024-09-15T13:08:59Z) - Towards the Detection of AI-Synthesized Human Face Images [12.090322373964124]
This paper presents a benchmark including human face images produced by Generative Adversarial Networks (GANs) and a variety of diffusion models (DMs).
Then, the forgery traces introduced by different generative models have been analyzed in the frequency domain to draw various insights.
The paper further demonstrates that a detector trained with frequency representation can generalize well to other unseen generative models.
arXiv Detail & Related papers (2024-02-13T19:37:44Z)
- GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning [50.7702397913573]
The rapid advancement of photorealistic generators has reached a critical juncture where the discrepancy between authentic and manipulated images is increasingly indistinguishable.
Although there are a number of publicly available face forgery datasets, the forged faces are mostly generated using GAN-based synthesis technology.
We propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection.
arXiv Detail & Related papers (2024-02-03T03:13:50Z)
- Exploring the Robustness of Human Parsers Towards Common Corruptions [99.89886010550836]
We construct three corruption robustness benchmarks, termed LIP-C, ATR-C, and Pascal-Person-Part-C, to assist us in evaluating the risk tolerance of human parsing models.
Inspired by the data augmentation strategy, we propose a novel heterogeneous augmentation-enhanced mechanism to bolster robustness under commonly corrupted conditions.
arXiv Detail & Related papers (2023-09-02T13:32:14Z)
- Fighting deepfakes by detecting GAN DCT anomalies [0.0]
State-of-the-art algorithms employ deep neural networks to detect fake contents.
A new fast detection method that discriminates deepfake images with high precision is proposed.
The method exceeds the state-of-the-art and also offers many insights into explainability.
arXiv Detail & Related papers (2021-01-24T19:45:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.