A Robust Approach Towards Distinguishing Natural and Computer Generated
Images using Multi-Colorspace fused and Enriched Vision Transformer
- URL: http://arxiv.org/abs/2308.07279v1
- Date: Mon, 14 Aug 2023 17:11:17 GMT
- Title: A Robust Approach Towards Distinguishing Natural and Computer Generated
Images using Multi-Colorspace fused and Enriched Vision Transformer
- Authors: Manjary P Gangan, Anoop Kadan, and Lajish V L
- Abstract summary: This work proposes a robust approach towards distinguishing natural and computer generated images.
The proposed approach achieves high performance gain when compared to a set of baselines.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Works in the literature classifying natural and computer generated images are mostly designed as binary tasks, considering either natural images versus computer graphics images only, or natural images versus GAN generated images only, but not natural images versus both classes of generated images. Moreover, even though this forensic task of distinguishing natural and computer generated images benefits from modern convolutional neural networks and transformer based architectures that achieve remarkable classification accuracies, these models are seen to fail on images that have undergone post-processing operations commonly used to deceive forensic algorithms, such as JPEG compression, Gaussian noise, etc. This work proposes a robust approach towards distinguishing natural and computer generated images, including both computer graphics and GAN generated images, using a fusion of two vision transformers, where each transformer network operates in a different color space, one in RGB and the other in YCbCr. The proposed approach achieves a high performance gain compared to a set of baselines, and also achieves higher robustness and generalizability than the baselines. When visualized, the features of the proposed model show higher separability between the classes than the input image features and the baseline features. This work also studies the attention map visualizations of the networks in the fused model and observes that the proposed methodology captures more image information relevant to the forensic task of classifying natural and generated images.
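The two-branch design described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the full-range ITU-R BT.601 (JPEG-style) conversion matrix and the plain concatenation-based late fusion are assumptions, and the `backbone_rgb` / `backbone_ycbcr` callables stand in for the two vision transformer networks.

```python
import numpy as np

def rgb_to_ycbcr(img):
    """Convert an RGB image (H, W, 3, values in 0..255) to YCbCr using
    the full-range ITU-R BT.601 / JPEG convention (an assumed choice)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def fused_features(img, backbone_rgb, backbone_ycbcr):
    """Late fusion sketch: run one backbone on the RGB input and another
    on the YCbCr version, then concatenate the two feature vectors for a
    downstream natural-vs-generated classifier head."""
    f_rgb = backbone_rgb(img)
    f_ycc = backbone_ycbcr(rgb_to_ycbcr(img))
    return np.concatenate([f_rgb, f_ycc], axis=-1)
```

Any feature extractor with a compatible signature can stand in for the two transformer branches when experimenting with this kind of fusion.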
Related papers
- Unlocking Pre-trained Image Backbones for Semantic Image Synthesis [29.688029979801577]
We propose a new class of GAN discriminators for semantic image synthesis that yields highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
arXiv Detail & Related papers (2023-12-20T09:39:19Z)
- T-former: An Efficient Transformer for Image Inpainting [50.43302925662507]
A class of attention-based network architectures, called transformers, has shown significant performance in natural language processing.
In this paper, we design a novel attention mechanism whose cost is linearly related to the resolution, derived via a Taylor expansion; based on this attention, a network called $T$-former is designed for image inpainting.
Experiments on several benchmark datasets demonstrate that our proposed method achieves state-of-the-art accuracy while maintaining a relatively low number of parameters and computational complexity.
arXiv Detail & Related papers (2023-05-12T04:10:42Z)
- Using a Conditional Generative Adversarial Network to Control the Statistical Characteristics of Generated Images for IACT Data Analysis [55.41644538483948]
We divide images into several classes according to the value of some property of the image, and then specify the required class when generating new images.
In the case of images from Imaging Atmospheric Cherenkov Telescopes (IACTs), an important property is the total brightness of all image pixels (image size).
We used a cGAN technique to generate images similar to those obtained in the TAIGA-IACT experiment.
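The class-conditioning scheme above can be sketched as follows; the bin edges are hypothetical placeholders that would in practice be chosen from the observed IACT image-size distribution, and the resulting class index is what would be fed to the cGAN as its conditioning label.

```python
import numpy as np

def image_size(img):
    """Total brightness of all pixels -- the 'image size' property
    used to group IACT images into classes."""
    return float(np.sum(img))

def size_class(img, bin_edges):
    """Assign an image to a brightness class by binning its image size;
    the class index serves as the cGAN conditioning label."""
    return int(np.digitize(image_size(img), bin_edges))
```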
arXiv Detail & Related papers (2022-11-28T22:30:33Z)
- Joint Learning of Deep Texture and High-Frequency Features for Computer-Generated Image Detection [24.098604827919203]
We propose a joint learning strategy with deep texture and high-frequency features for CG image detection.
A semantic segmentation map is generated to guide the affine transformation operation.
The combination of the original image and the high-frequency components of the original and rendered images is fed into a multi-branch neural network equipped with attention mechanisms.
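One simple way to obtain high-frequency components like those mentioned above is to subtract a low-pass (blurred) version of the image from the original; the box-filter blur here is an illustrative stand-in, not necessarily the filter used by the paper's authors.

```python
import numpy as np

def box_blur(img, k=3):
    """Low-pass filter: average over a k x k neighborhood,
    with edge padding so the output keeps the input shape."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def high_frequency(img):
    """High-frequency residual: original minus its low-pass version."""
    return img.astype(float) - box_blur(img)
```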
arXiv Detail & Related papers (2022-09-07T17:30:40Z)
- Restormer: Efficient Transformer for High-Resolution Image Restoration [118.9617735769827]
Convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data.
Transformers have shown significant performance gains on natural language and high-level vision tasks.
Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks.
arXiv Detail & Related papers (2021-11-18T18:59:10Z)
- Distinguishing Natural and Computer-Generated Images using Multi-Colorspace fused EfficientNet [0.0]
In a real-world image forensic scenario, it is highly essential to consider all categories of image generation.
We propose a Multi-Colorspace fused EfficientNet model by parallelly fusing three EfficientNet networks.
Our model outperforms the baselines in terms of accuracy, robustness towards post-processing, and generalizability towards other datasets.
arXiv Detail & Related papers (2021-10-18T15:55:45Z)
- Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
arXiv Detail & Related papers (2021-04-29T17:58:35Z)
- CNN Detection of GAN-Generated Face Images based on Cross-Band Co-occurrences Analysis [34.41021278275805]
Last-generation GAN models can generate synthetic images that are visually indistinguishable from natural ones.
We propose a method for distinguishing GAN-generated from natural images by exploiting inconsistencies among spectral bands.
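The cross-band co-occurrence statistics referred to above can be sketched as a joint histogram of quantized pixel values drawn from two color bands at the same (or offset) location; the quantization level and offset below are illustrative choices, not the paper's exact parameters.

```python
import numpy as np

def cross_band_cooccurrence(band_a, band_b, levels=8, offset=(0, 0)):
    """Joint (levels x levels) histogram of quantized values of band_a
    at (y, x) against band_b at (y + dy, x + dx). Inconsistencies in
    these cross-band statistics can betray GAN-generated images."""
    dy, dx = offset
    h, w = band_a.shape
    a = (band_a[: h - dy, : w - dx] * levels // 256).astype(int)
    b = (band_b[dy:, dx:] * levels // 256).astype(int)
    mat = np.zeros((levels, levels), dtype=int)
    np.add.at(mat, (a.ravel(), b.ravel()), 1)  # count each value pair
    return mat
```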
arXiv Detail & Related papers (2020-07-25T10:55:04Z)
- Generative Hierarchical Features from Synthesizing Images [65.66756821069124]
We show that learning to synthesize images can bring remarkable hierarchical visual features that are generalizable across a wide range of applications.
The visual feature produced by our encoder, termed Generative Hierarchical Feature (GH-Feat), has strong transferability to both generative and discriminative tasks.
arXiv Detail & Related papers (2020-07-20T18:04:14Z)
- Fine-grained Image-to-Image Transformation towards Visual Recognition [102.51124181873101]
We aim at transforming an image with a fine-grained category to synthesize new images that preserve the identity of the input image.
We adopt a model based on generative adversarial networks to disentangle the identity related and unrelated factors of an image.
Experiments on the CompCars and Multi-PIE datasets demonstrate that our model preserves the identity of the generated images much better than the state-of-the-art image-to-image transformation models.
arXiv Detail & Related papers (2020-01-12T05:26:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.