UniShield: An Adaptive Multi-Agent Framework for Unified Forgery Image Detection and Localization
- URL: http://arxiv.org/abs/2510.03161v1
- Date: Fri, 03 Oct 2025 16:33:05 GMT
- Title: UniShield: An Adaptive Multi-Agent Framework for Unified Forgery Image Detection and Localization
- Authors: Qing Huang, Zhipei Xu, Xuanyu Zhang, Jian Zhang,
- Abstract summary: UniShield is a novel unified system capable of detecting and localizing image forgeries across diverse domains.<n>We show that UniShield achieves state-of-the-art results, surpassing both existing unified approaches and domain-specific detectors.
- Score: 20.95924601497973
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid advancements in image generation, synthetic images have become increasingly realistic, posing significant societal risks, such as misinformation and fraud. Forgery Image Detection and Localization (FIDL) thus emerges as essential for maintaining information integrity and societal security. Despite impressive performances by existing domain-specific detection methods, their practical applicability remains limited, primarily due to their narrow specialization, poor cross-domain generalization, and the absence of an integrated adaptive framework. To address these issues, we propose UniShield, the novel multi-agent-based unified system capable of detecting and localizing image forgeries across diverse domains, including image manipulation, document manipulation, DeepFake, and AI-generated images. UniShield innovatively integrates a perception agent with a detection agent. The perception agent intelligently analyzes image features to dynamically select suitable detection models, while the detection agent consolidates various expert detectors into a unified framework and generates interpretable reports. Extensive experiments show that UniShield achieves state-of-the-art results, surpassing both existing unified approaches and domain-specific detectors, highlighting its superior practicality, adaptiveness, and scalability.
Related papers
- INSIGHT: An Interpretable Neural Vision-Language Framework for Reasoning of Generative Artifacts [0.0]
Current forensic systems degrade sharply under real-world conditions.<n>Most detectors operate as opaques, offering little insight into why an image is flagged as synthetic.<n>We introduce INSIGHT, a unified framework for robust detection and transparent explanation of AI-generated images.
arXiv Detail & Related papers (2025-11-27T11:43:50Z) - From Evidence to Verdict: An Agent-Based Forensic Framework for AI-Generated Image Detection [19.240335260177382]
We introduce AIFo (Agent-based Image Forensics), a training-free framework that emulates human forensic investigation through multi-agent collaboration.<n>Unlike conventional methods, our framework employs a set of forensic tools, including reverse image search, metadata extraction, pre-trained classifiers, and VLM analysis.<n>Our comprehensive evaluation spans 6,000 images and challenges real-world scenarios, including images from modern generative platforms and diverse online sources.
arXiv Detail & Related papers (2025-10-31T18:36:49Z) - Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios [54.07895223545793]
This paper introduces the Real-World Robustness dataset (RRDataset) for comprehensive evaluation of detection models across three dimensions.<n>RRDataset includes high-quality images from seven major scenarios.<n>We benchmarked 17 detectors and 10 vision-language models (VLMs) on RRDataset and conducted a large-scale human study.
arXiv Detail & Related papers (2025-09-11T06:15:52Z) - NS-Net: Decoupling CLIP Semantic Information through NULL-Space for Generalizable AI-Generated Image Detection [14.7077339945096]
NS-Net is a novel framework that decouples semantic information from CLIP's visual features, followed by contrastive learning to capture intrinsic distributional differences between real and generated images.<n>Experiments show that NS-Net outperforms existing state-of-the-art methods, achieving a 7.4% improvement in detection accuracy.
arXiv Detail & Related papers (2025-08-02T07:58:15Z) - FakeScope: Large Multimodal Expert Model for Transparent AI-Generated Image Forensics [66.14786900470158]
We propose FakeScope, an expert multimodal model (LMM) tailored for AI-generated image forensics.<n>FakeScope identifies AI-synthetic images with high accuracy and provides rich, interpretable, and query-driven forensic insights.<n>FakeScope achieves state-of-the-art performance in both closed-ended and open-ended forensic scenarios.
arXiv Detail & Related papers (2025-03-31T16:12:48Z) - Unveiling Zero-Space Detection: A Novel Framework for Autonomous Ransomware Identification in High-Velocity Environments [0.0]
The proposed Zero-Space Detection framework identifies latent behavioral patterns through unsupervised clustering and advanced deep learning techniques.<n>It operates effectively in high-velocity environments by integrating multi-phase filtering and ensemble learning for refined decision-making.<n> Experimental evaluation reveals high detection rates across diverse ransomware families, including LockBit, Conti, REvil, and BlackMatter.
arXiv Detail & Related papers (2025-01-22T11:41:44Z) - SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model [48.547599530927926]
Synthetic images, when shared on social media, can mislead extensive audiences and erode trust in digital content.<n>We introduce the Social media Image Detection dataSet (SID-Set), which offers three key advantages.<n>We propose a new image deepfake detection, localization, and explanation framework, named SIDA.
arXiv Detail & Related papers (2024-12-05T16:12:25Z) - Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models.<n>In this paper, we investigate how detection performance varies across model backbones, types, and datasets.<n>We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z) - Present and Future Generalization of Synthetic Image Detectors [0.6144680854063939]
This work conducts a systematic analysis and uses its insights to develop practical guidelines for training robust synthetic image detectors.
Model generalization capabilities are evaluated across different setups including real-world deployment conditions.
We show that while current approaches excel in specific scenarios, no single detector achieves universal effectiveness.
arXiv Detail & Related papers (2024-09-21T12:46:17Z) - Unity in Diversity: Multi-expert Knowledge Confrontation and Collaboration for Generalizable Vehicle Re-identification [60.20318058777603]
Generalizable vehicle re-identification (ReID) seeks to develop models that can adapt to unknown target domains without the need for fine-tuning or retraining.<n>Previous works have mainly focused on extracting domain-invariant features by aligning data distributions between source domains.<n>We propose a two-stage Multi-expert Knowledge Confrontation and Collaboration (MiKeCoCo) method to solve this unique problem.
arXiv Detail & Related papers (2024-07-10T04:06:39Z) - Towards Robust GAN-generated Image Detection: a Multi-view Completion
Representation [27.483031588071942]
GAN-generated image detection now becomes the first line of defense against the malicious uses of machine-synthesized image manipulations such as deepfakes.
We propose a robust detection framework based on a novel multi-view image completion representation.
We evaluate the generalization ability of our framework across six popular GANs at different resolutions and its robustness against a broad range of perturbation attacks.
arXiv Detail & Related papers (2023-06-02T08:38:02Z) - Hierarchical Forgery Classifier On Multi-modality Face Forgery Clues [61.37306431455152]
We propose a novel Hierarchical Forgery for Multi-modality Face Forgery Detection (HFC-MFFD)
The HFC-MFFD learns robust patches-based hybrid representation to enhance forgery authentication in multiple-modality scenarios.
The specific hierarchical face forgery is proposed to alleviate the class imbalance problem and further boost detection performance.
arXiv Detail & Related papers (2022-12-30T10:54:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.