A Robust Image Forensic Framework Utilizing Multi-Colorspace Enriched Vision Transformer for Distinguishing Natural and Computer-Generated Images
- URL: http://arxiv.org/abs/2308.07279v2
- Date: Sat, 16 Nov 2024 17:13:38 GMT
- Title: A Robust Image Forensic Framework Utilizing Multi-Colorspace Enriched Vision Transformer for Distinguishing Natural and Computer-Generated Images
- Authors: Manjary P. Gangan, Anoop Kadan, Lajish V L
- Abstract summary: We propose a robust forensic classifier framework leveraging enriched vision transformers to distinguish between natural and generated images.
Our approach outperforms baselines, demonstrating 94.25% test accuracy with significant performance gains in individual class accuracies.
This work advances the state-of-the-art in image forensics by providing a generalized and resilient solution to distinguish between natural and generated images.
- Score: 0.0
- License:
- Abstract: Digital image forensics research on classifying natural and computer-generated images has primarily focused on binary tasks. These tasks typically involve classifying natural images versus computer graphics images only, or natural images versus GAN-generated images only, but not natural images versus both types of generated images simultaneously. Furthermore, although advanced convolutional neural networks and transformer-based architectures can achieve impressive accuracies on this forensic classification task of distinguishing natural and computer-generated images, these models fail on images that have undergone post-processing operations intended to deceive forensic algorithms, such as JPEG compression and Gaussian noise addition. In this digital image forensics work, to distinguish between natural and computer-generated images, encompassing both computer graphics and GAN-generated images, we propose a robust forensic classifier framework leveraging enriched vision transformers. By employing a fusion approach for the networks operating in the RGB and YCbCr color spaces, we achieve higher classification accuracy and robustness against the post-processing operations of JPEG compression and Gaussian noise addition. Our approach outperforms baselines, demonstrating 94.25% test accuracy with significant performance gains in individual class accuracies. Visualizations of feature representations and attention maps reveal improved separability as well as improved capture of information relevant to the forensic task. This work advances the state of the art in image forensics by providing a generalized and resilient solution for distinguishing natural and generated images.
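The abstract's key ingredient is feeding each image to two branches, one in RGB and one in YCbCr, and fusing their outputs. A minimal sketch of the colorspace conversion and a generic late-fusion step is shown below; the JPEG-style full-range BT.601 conversion is standard, but the fusion operator (here a plain average of class probabilities) and the function names are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an HxWx3 uint8 RGB image to full-range YCbCr
    (ITU-R BT.601, as used by JPEG). Returns a float array."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def fuse_logits(logits_rgb, logits_ycbcr):
    """Late fusion of the two colorspace branches: average the class
    probabilities from each network. This is a common fusion baseline;
    the paper's exact fusion operator may differ."""
    def softmax(z):
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)
    return 0.5 * (softmax(logits_rgb) + softmax(logits_ycbcr))
```

In a full pipeline, `logits_rgb` and `logits_ycbcr` would come from the two vision-transformer branches run on the RGB image and its YCbCr conversion, respectively.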
Related papers
- Novel computational workflows for natural and biomedical image processing based on hypercomplex algebras [49.81327385913137]
Hypercomplex image processing extends conventional techniques in a unified paradigm encompassing algebraic and geometric principles.
This work leverages quaternions and the two-dimensional planes split framework (splitting a quaternion, representing a pixel, into pairs of 2D planes) for natural/biomedical image analysis.
The proposed framework can regulate color appearance (e.g., with alternative renditions and grayscale conversion) and image contrast, and can be part of automated image processing pipelines.
arXiv Detail & Related papers (2025-02-11T18:38:02Z)
- Is JPEG AI going to change image forensics? [50.92778618091496]
We investigate the counter-forensic effects of the forthcoming JPEG AI standard based on neural image compression.
We show that genuine content processed through JPEG AI triggers increased false alarms, impairing the performance of leading forensic detectors.
arXiv Detail & Related papers (2024-12-04T12:07:20Z)
- Image-GS: Content-Adaptive Image Representation via 2D Gaussians [55.15950594752051]
We propose Image-GS, a content-adaptive image representation.
Using anisotropic 2D Gaussians as the basis, Image-GS shows high memory efficiency, supports fast random access, and offers a natural level of detail stack.
General efficiency and fidelity of Image-GS are validated against several recent neural image representations and industry-standard texture compressors.
We hope this research offers insights for developing new applications that require adaptive quality and resource control, such as machine perception, asset streaming, and content generation.
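The basis of Image-GS, an anisotropic 2D Gaussian, can be illustrated by splatting a single Gaussian onto a pixel grid. The sketch below is a generic rendering of one Gaussian with a mean, covariance, and color; the function name and parameterization are assumptions for illustration, not the paper's actual representation, which fits many such Gaussians to an image.

```python
import numpy as np

def render_gaussian(h, w, mean, cov, color):
    """Splat one anisotropic 2D Gaussian onto an h x w canvas.
    mean: (x, y) center; cov: 2x2 covariance controlling shape and
    orientation; color: RGB tuple scaled by the Gaussian weight."""
    yy, xx = np.mgrid[:h, :w]
    pts = np.stack([xx - mean[0], yy - mean[1]], axis=-1)
    inv = np.linalg.inv(cov)
    # Mahalanobis distance of every pixel from the Gaussian center.
    expo = np.einsum('...i,ij,...j->...', pts, inv, pts)
    weight = np.exp(-0.5 * expo)
    return weight[..., None] * np.asarray(color, dtype=np.float64)
```

An off-diagonal covariance yields a rotated, elongated footprint, which is what makes the representation content-adaptive: few elongated Gaussians can cover smooth edges that would otherwise need many isotropic ones.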
arXiv Detail & Related papers (2024-07-02T00:45:21Z)
- Towards Exploring Fairness in Visual Transformer based Natural and GAN Image Detection Systems [0.0]
This study explores bias in visual transformer based image forensic algorithms that classify natural and GAN images.
The proposed study procures bias evaluation corpora to analyze bias in gender, racial, affective, and intersectional domains.
It also analyzes the impact of image compression on model bias.
arXiv Detail & Related papers (2023-10-18T16:13:22Z)
- Joint Learning of Deep Texture and High-Frequency Features for Computer-Generated Image Detection [24.098604827919203]
We propose a joint learning strategy with deep texture and high-frequency features for CG image detection.
A semantic segmentation map is generated to guide the affine transformation operation.
The combination of the original image and the high-frequency components of the original and rendered images are fed into a multi-branch neural network equipped with attention mechanisms.
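Extracting the high-frequency components mentioned above can be done with a simple Fourier-domain high-pass filter. The sketch below is a generic version of that step, with the cutoff parameter and function name chosen here for illustration; the paper's exact extraction procedure may differ.

```python
import numpy as np

def high_frequency_component(gray, cutoff=0.1):
    """Extract the high-frequency component of a 2-D grayscale image
    by zeroing low frequencies in the Fourier domain.
    cutoff: fraction of the half-spectrum radius treated as low
    frequency (larger cutoff removes more of the image)."""
    f = np.fft.fftshift(np.fft.fft2(gray))
    h, w = gray.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2)
    mask = radius > cutoff * min(h, w) / 2  # keep only high frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))
```

The result highlights edges and rendering artifacts while suppressing smooth shading, which is why high-frequency residues are a common cue for separating rendered from photographed content.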
arXiv Detail & Related papers (2022-09-07T17:30:40Z)
- Distinguishing Natural and Computer-Generated Images using Multi-Colorspace fused EfficientNet [0.0]
In a real-world image forensic scenario, it is essential to consider all categories of image generation.
We propose a Multi-Colorspace fused EfficientNet model that fuses three EfficientNet networks in parallel.
Our model outperforms the baselines in terms of accuracy, robustness towards post-processing, and generalizability towards other datasets.
arXiv Detail & Related papers (2021-10-18T15:55:45Z)
- Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
arXiv Detail & Related papers (2021-04-29T17:58:35Z)
- Generative Hierarchical Features from Synthesizing Images [65.66756821069124]
We show that learning to synthesize images can bring remarkable hierarchical visual features that are generalizable across a wide range of applications.
The visual feature produced by our encoder, termed as Generative Hierarchical Feature (GH-Feat), has strong transferability to both generative and discriminative tasks.
arXiv Detail & Related papers (2020-07-20T18:04:14Z)
- Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency.
Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images.
Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
arXiv Detail & Related papers (2020-02-17T07:35:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.