Related papers: A Robust Approach Towards Distinguishing Natural and Computer Generated Images using Multi-Colorspace fused and Enriched Vision Transformer

URL: http://arxiv.org/abs/2308.07279v1
Date: Mon, 14 Aug 2023 17:11:17 GMT
Title: A Robust Approach Towards Distinguishing Natural and Computer Generated Images using Multi-Colorspace fused and Enriched Vision Transformer
Authors: Manjary P Gangan, Anoop Kadan, and Lajish V L
Abstract summary: This work proposes a robust approach towards distinguishing natural and computer generated images. The proposed approach achieves high performance gain when compared to a set of baselines.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Existing work on classifying natural and computer generated images is mostly designed as a binary task, considering either natural images versus computer graphics images only or natural images versus GAN generated images only, but not natural images versus both classes of generated images. Also, even though this forensic classification task benefits from new convolutional neural network and transformer based architectures that give remarkable classification accuracies, these models are seen to fail on images that have undergone post-processing operations commonly performed to deceive forensic algorithms, such as JPEG compression and Gaussian noise. This work proposes a robust approach towards distinguishing natural and computer generated images, including both computer graphics and GAN generated images, using a fusion of two vision transformers, each operating in a different color space: one in RGB and the other in YCbCr. The proposed approach achieves a high performance gain when compared to a set of baselines, and also achieves higher robustness and generalizability than the baselines. When visualized, the features of the proposed model show higher class separability than the input image features and the baseline features. This work also studies the attention map visualizations of the networks of the fused model and observes that the proposed methodology can capture more image information relevant to the forensic task of classifying natural and generated images.
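
The two-colorspace idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which YCbCr variant is used or how the two transformer outputs are fused, so the full-range JFIF/BT.601 conversion, the `toy_encoder` stand-in for a vision transformer, and fusion by concatenation are all assumptions made here for illustration.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr.

    Assumption: JFIF-style conversion with BT.601 coefficients.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (y, cb, cr)


def convert_image(rgb_image):
    """Apply the pixel conversion to an image given as rows of (r, g, b) tuples."""
    return [[rgb_to_ycbcr(*pixel) for pixel in row] for row in rgb_image]


def toy_encoder(image):
    """Hypothetical stand-in for a vision transformer branch.

    Returns the per-channel mean as a 3-element feature vector; a real
    branch would return a learned embedding instead.
    """
    pixels = [pixel for row in image for pixel in row]
    count = len(pixels)
    return [sum(pixel[c] for pixel in pixels) / count for c in range(3)]


def fused_features(rgb_image):
    """Run one branch per colorspace and fuse the two feature vectors.

    Assumption: fusion is plain concatenation of the branch outputs.
    """
    rgb_features = toy_encoder(rgb_image)
    ycbcr_features = toy_encoder(convert_image(rgb_image))
    return rgb_features + ycbcr_features


if __name__ == "__main__":
    image = [
        [(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)],
    ]
    print(fused_features(image))
```

In a real pipeline each branch would be a trained transformer and the fused vector would feed a classifier head; the sketch only shows that both branches see the same image in different color representations before their outputs are combined.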

Related papers

Novel computational workflows for natural and biomedical image processing based on hypercomplex algebras [49.81327385913137]
Hypercomplex image processing extends conventional techniques in a unified paradigm encompassing algebraic and geometric principles. This work leverages quaternions and the two-dimensional planes split framework (splitting of a quaternion, representing a pixel, into pairs of 2D planes) for natural/biomedical image analysis. The proposed workflows can regulate color appearance (e.g. with alternative renditions and grayscale conversion) and image contrast, and can be part of automated image processing pipelines.
arXiv Detail & Related papers (2025-02-11T18:38:02Z)
Is JPEG AI going to change image forensics? [50.92778618091496]
We investigate the counter-forensic effects of the new JPEG AI standard based on neural image compression. Our results demonstrate a reduction in the performance of leading forensic detectors when analyzing content processed through JPEG AI.
arXiv Detail & Related papers (2024-12-04T12:07:20Z)
Image-GS: Content-Adaptive Image Representation via 2D Gaussians [55.15950594752051]
We propose Image-GS, a content-adaptive image representation. Using anisotropic 2D Gaussians as the basis, Image-GS shows high memory efficiency, supports fast random access, and offers a natural level of detail stack. General efficiency and fidelity of Image-GS are validated against several recent neural image representations and industry-standard texture compressors. We hope this research offers insights for developing new applications that require adaptive quality and resource control, such as machine perception, asset streaming, and content generation.
arXiv Detail & Related papers (2024-07-02T00:45:21Z)
Transferable Learned Image Compression-Resistant Adversarial Perturbations [66.46470251521947]
Adversarial attacks can readily disrupt the image classification system, revealing the vulnerability of DNN-based recognition tasks. We introduce a new pipeline that targets image classification models that utilize learned image compressors as pre-processing modules.
arXiv Detail & Related papers (2024-01-06T03:03:28Z)
Towards Exploring Fairness in Visual Transformer based Natural and GAN Image Detection Systems [0.0]
This study explores bias in visual transformer based image forensic algorithms that classify natural and GAN images. The proposed study procures bias evaluation corpora to analyze bias in gender, racial, affective, and intersectional domains. It also analyzes the impact of image compression on model bias.
arXiv Detail & Related papers (2023-10-18T16:13:22Z)
Joint Learning of Deep Texture and High-Frequency Features for Computer-Generated Image Detection [24.098604827919203]
We propose a joint learning strategy with deep texture and high-frequency features for CG image detection. A semantic segmentation map is generated to guide the affine transformation operation. The combination of the original image and the high-frequency components of the original and rendered images are fed into a multi-branch neural network equipped with attention mechanisms.
arXiv Detail & Related papers (2022-09-07T17:30:40Z)
Distinguishing Natural and Computer-Generated Images using Multi-Colorspace fused EfficientNet [0.0]
In a real-world image forensic scenario, it is essential to consider all categories of image generation. We propose a Multi-Colorspace fused EfficientNet model that fuses three EfficientNet networks in parallel. Our model outperforms the baselines in terms of accuracy, robustness towards post-processing, and generalizability towards other datasets.
arXiv Detail & Related papers (2021-10-18T15:55:45Z)
Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose. Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification. We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
arXiv Detail & Related papers (2021-04-29T17:58:35Z)
CNN Detection of GAN-Generated Face Images based on Cross-Band Co-occurrences Analysis [34.41021278275805]
Last-generation GAN models make it possible to generate synthetic images that are visually indistinguishable from natural ones. We propose a method for distinguishing GAN-generated from natural images by exploiting inconsistencies among spectral bands.
arXiv Detail & Related papers (2020-07-25T10:55:04Z)
Generative Hierarchical Features from Synthesizing Images [65.66756821069124]
We show that learning to synthesize images can bring remarkable hierarchical visual features that are generalizable across a wide range of applications. The visual feature produced by our encoder, termed as Generative Hierarchical Feature (GH-Feat), has strong transferability to both generative and discriminative tasks.
arXiv Detail & Related papers (2020-07-20T18:04:14Z)
Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape. The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset having images captured from different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)
Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency. Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images. Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
arXiv Detail & Related papers (2020-02-17T07:35:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.