NeXT-IMDL: Build Benchmark for NeXT-Generation Image Manipulation Detection & Localization
- URL: http://arxiv.org/abs/2512.23374v1
- Date: Mon, 29 Dec 2025 11:09:35 GMT
- Title: NeXT-IMDL: Build Benchmark for NeXT-Generation Image Manipulation Detection & Localization
- Authors: Yifei Li, Haoyuan He, Yu Zheng, Bingyao Yu, Wenzhao Zheng, Lei Chen, Jie Zhou, Jiwen Lu,
- Abstract summary: NeXT-IMDL is a large-scale diagnostic benchmark designed to probe the boundaries of current detectors.<n>NeXT-IMDL categorizes AIGC-based manipulations along four fundamental axes: editing models, manipulation types, content semantics, and forgery.<n>Our experiments on 11 representative models reveal a critical insight: while these models perform well in their original settings, they exhibit systemic failures and significant performance degradation.
- Score: 67.84497768987023
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The accessibility surge and abuse risks of user-friendly image editing models have created an urgent need for generalizable, up-to-date methods for Image Manipulation Detection and Localization (IMDL). Current IMDL research typically uses cross-dataset evaluation, where models trained on one benchmark are tested on others. However, this simplified evaluation approach conceals the fragility of existing methods when handling diverse AI-generated content, leading to misleading impressions of progress. This paper challenges this illusion by proposing NeXT-IMDL, a large-scale diagnostic benchmark designed not just to collect data, but to probe the generalization boundaries of current detectors systematically. Specifically, NeXT-IMDL categorizes AIGC-based manipulations along four fundamental axes: editing models, manipulation types, content semantics, and forgery granularity. Built upon this, NeXT-IMDL implements five rigorous cross-dimension evaluation protocols. Our extensive experiments on 11 representative models reveal a critical insight: while these models perform well in their original settings, they exhibit systemic failures and significant performance degradation when evaluated under our designed protocols that simulate real-world, various generalization scenarios. By providing this diagnostic toolkit and the new findings, we aim to advance the development towards building truly robust, next-generation IMDL models.
Related papers
- From Evidence to Verdict: An Agent-Based Forensic Framework for AI-Generated Image Detection [19.240335260177382]
We introduce AIFo (Agent-based Image Forensics), a training-free framework that emulates human forensic investigation through multi-agent collaboration.<n>Unlike conventional methods, our framework employs a set of forensic tools, including reverse image search, metadata extraction, pre-trained classifiers, and VLM analysis.<n>Our comprehensive evaluation spans 6,000 images and challenges real-world scenarios, including images from modern generative platforms and diverse online sources.
arXiv Detail & Related papers (2025-10-31T18:36:49Z) - Rethinking Evaluation of Infrared Small Target Detection [105.59753496831739]
This paper introduces a hybrid-level metric incorporating pixel- and target-level performance, proposing a systematic error analysis method, and emphasizing the importance of cross-dataset evaluation.<n>An open-source toolkit has be released to facilitate standardized benchmarking.
arXiv Detail & Related papers (2025-09-21T02:45:07Z) - ForenX: Towards Explainable AI-Generated Image Detection with Multimodal Large Language Models [82.04858317800097]
We present ForenX, a novel method that not only identifies the authenticity of images but also provides explanations that resonate with human thoughts.<n>ForenX employs the powerful multimodal large language models (MLLMs) to analyze and interpret forensic cues.<n>We introduce ForgReason, a dataset dedicated to descriptions of forgery evidences in AI-generated images.
arXiv Detail & Related papers (2025-08-02T15:21:26Z) - Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z) - IMDL-BenCo: A Comprehensive Benchmark and Codebase for Image Manipulation Detection & Localization [58.32394109377374]
IMDL-BenCo is the first comprehensive IMDL benchmark and modular framework.
It decomposes the IMDL framework into standardized, reusable components and revises the model construction pipeline.
It includes 8 state-of-the-art IMDL models (1 of which are reproduced from scratch), 2 sets of standard training and evaluation protocols, 15 GPU-accelerated evaluation metrics, and 3 kinds of robustness evaluation.
arXiv Detail & Related papers (2024-06-15T09:44:54Z) - On quantifying and improving realism of images generated with diffusion [50.37578424163951]
We propose a metric, called Image Realism Score (IRS), computed from five statistical measures of a given image.
IRS is easily usable as a measure to classify a given image as real or fake.
We experimentally establish the model- and data-agnostic nature of the proposed IRS by successfully detecting fake images generated by Stable Diffusion Model (SDM), Dalle2, Midjourney and BigGAN.
Our efforts have also led to Gen-100 dataset, which provides 1,000 samples for 100 classes generated by four high-quality models.
arXiv Detail & Related papers (2023-09-26T08:32:55Z) - Sampling for Deep Learning Model Diagnosis (Technical Report) [5.8057675678464555]
Black-box nature of deep neural networks is a barrier to adoption in applications such as medical diagnosis.
We develop a novel data sampling technique that produce approximate but accurate results for these model debug queries.
We evaluate our techniques on one standard computer vision and one scientific data set and demonstrate that our sampling technique outperforms a variety of state-of-the-art alternatives in terms of query accuracy.
arXiv Detail & Related papers (2020-02-22T19:24:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.