Related papers: FerretNet: Efficient Synthetic Image Detection via Local Pixel Dependencies

FerretNet: Efficient Synthetic Image Detection via Local Pixel Dependencies

URL: http://arxiv.org/abs/2509.20890v2
Date: Thu, 23 Oct 2025 05:06:59 GMT
Title: FerretNet: Efficient Synthetic Image Detection via Local Pixel Dependencies
Authors: Shuqiao Liang, Jian Liu, Renzhang Chen, Quanlong Guan,
Abstract summary: FerretNet is a lightweight neural network with only 1.1M parameters that delivers efficient and robust synthetic image detection.<n>FerretNet, trained exclusively on the 4-class ProGAN dataset, achieves an average accuracy of 97.1% on an open-world benchmark comprising 22 generative models.
Score: 10.597504007889063
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The increasing realism of synthetic images generated by advanced models such as VAEs, GANs, and LDMs poses significant challenges for synthetic image detection. To address this issue, we explore two artifact types introduced during the generation process: (1) latent distribution deviations and (2) decoding-induced smoothing effects, which manifest as inconsistencies in local textures, edges, and color transitions. Leveraging local pixel dependencies (LPD) properties rooted in Markov Random Fields, we reconstruct synthetic images using neighboring pixel information to expose disruptions in texture continuity and edge coherence. Building upon LPD, we propose FerretNet, a lightweight neural network with only 1.1M parameters that delivers efficient and robust synthetic image detection. Extensive experiments demonstrate that FerretNet, trained exclusively on the 4-class ProGAN dataset, achieves an average accuracy of 97.1% on an open-world benchmark comprising 22 generative models. Our code and datasets are publicly available at https://github.com/xigua7105/FerretNet.

Related papers

CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI [58.35348718345307]
Current efforts to distinguish between real and AI-generated images may lack generalization.<n>We propose a novel framework, Co-Spy, that first enhances existing semantic features.<n>We also create Co-Spy-Bench, a comprehensive dataset comprising 5 real image datasets and 22 state-of-the-art generative models.
arXiv Detail & Related papers (2025-03-24T01:59:29Z)
bit2bit: 1-bit quanta video reconstruction via self-supervised photon prediction [57.199618102578576]
We propose bit2bit, a new method for reconstructing high-quality image stacks at original resolution from sparse binary quantatemporal image data. Inspired by recent work on Poisson denoising, we developed an algorithm that creates a dense image sequence from sparse binary photon data. We present a novel dataset containing a wide range of real SPAD high-speed videos under various challenging imaging conditions.
arXiv Detail & Related papers (2024-10-30T17:30:35Z)
Improving Synthetic Image Detection Towards Generalization: An Image Transformation Perspective [45.210030086193775]
Current synthetic image detection (SID) pipelines are primarily dedicated to crafting universal artifact features.<n>We propose SAFE, a lightweight and effective detector with three simple image transformations.<n>Our pipeline achieves a new state-of-the-art performance, with remarkable improvements of 4.5% in accuracy and 2.9% in average precision against existing methods.
arXiv Detail & Related papers (2024-08-13T09:01:12Z)
ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object [78.58860252442045]
We introduce generative model as a data source for hard images that benchmark deep models' robustness. We are able to generate images with more diversified backgrounds, textures, and materials than any prior work, where we term this benchmark as ImageNet-D. Our work suggests that diffusion models can be an effective source to test vision models.
arXiv Detail & Related papers (2024-03-27T17:23:39Z)
Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization. This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts. Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
Randomize to Generalize: Domain Randomization for Runway FOD Detection [1.4249472316161877]
Tiny Object Detection is challenging due to small size, low resolution, occlusion, background clutter, lighting conditions and small object-to-image ratio. We propose a novel two-stage methodology Synthetic Image Augmentation (SRIA) to enhance generalization capabilities of models encountering 2D datasets. We report that detection accuracy improved from an initial 41% to 92% for OOD test set.
arXiv Detail & Related papers (2023-09-23T05:02:31Z)
Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language. We pioneer a systematic study on deepfake detection generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z)
Explore the Power of Synthetic Data on Few-shot Object Detection [27.26215175101865]
Few-shot object detection (FSOD) aims to expand an object detector for novel categories given only a few instances for training. Recent text-to-image generation models have shown promising results in generating high-quality images. This work extensively studies how synthetic images generated from state-of-the-art text-to-image generators benefit FSOD tasks.
arXiv Detail & Related papers (2023-03-23T12:34:52Z)
Enhanced Sharp-GAN For Histopathology Image Synthesis [63.845552349914186]
Histopathology image synthesis aims to address the data shortage issue in training deep learning approaches for accurate cancer detection. We propose a novel approach that enhances the quality of synthetic images by using nuclei topology and contour regularization. The proposed approach outperforms Sharp-GAN in all four image quality metrics on two datasets.
arXiv Detail & Related papers (2023-01-24T17:54:01Z)
RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis [104.53930611219654]
We present a large-scale synthetic dataset for novel view synthesis consisting of 300k images rendered from nearly 2000 complex scenes. The dataset is orders of magnitude larger than existing synthetic datasets for novel view synthesis. Using 4 distinct sources of high-quality 3D meshes, the scenes of our dataset exhibit challenging variations in camera views, lighting, shape, materials, and textures.
arXiv Detail & Related papers (2022-05-14T13:15:32Z)
Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields. To better train this efficient generator, except for frequently-used VGG feature matching loss, we design a novel self-guided regression loss. We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.