Related papers: Training CNNs in Presence of JPEG Compression: Multimedia Forensics vs Computer Vision

URL: http://arxiv.org/abs/2009.12088v1
Date: Fri, 25 Sep 2020 08:47:21 GMT
Title: Training CNNs in Presence of JPEG Compression: Multimedia Forensics vs Computer Vision
Authors: Sara Mandelli, Nicolò Bonettini, Paolo Bestagini, Stefano Tubaro
Abstract summary: We focus on the effect that JPEG has on CNN training considering different computer vision and forensic image classification problems. We show that it is necessary to consider these effects when generating a training dataset in order to properly train a forensic detector.
Score: 18.3198215837364
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Convolutional Neural Networks (CNNs) have proved very accurate in multiple computer vision image classification tasks that required visual inspection in the past (e.g., object recognition, face detection, etc.). Motivated by these astonishing results, researchers have also started using CNNs to cope with image forensic problems (e.g., camera model identification, tampering detection, etc.). However, in computer vision, image classification methods typically rely on visual cues easily detectable by human eyes. Conversely, forensic solutions rely on almost invisible traces that are often very subtle and lie in the fine details of the image under analysis. For this reason, training a CNN to solve a forensic task requires some special care, as common processing operations (e.g., resampling, compression, etc.) can strongly hinder forensic traces. In this work, we focus on the effect that JPEG has on CNN training considering different computer vision and forensic image classification problems. Specifically, we consider the issues that arise from JPEG compression and misalignment of the JPEG grid. We show that it is necessary to consider these effects when generating a training dataset in order to properly train a forensic detector without losing generalization capability, whereas it is possible to almost ignore these effects for computer vision tasks.
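
The abstract's point about JPEG grid misalignment can be made concrete with a small sketch. It is not the authors' pipeline: the helper names `is_grid_aligned` and `random_crop_origin` are hypothetical, and the 8x8 block size is an assumption that matches baseline JPEG luminance coding (chroma subsampling can make the effective unit larger). The sketch only shows how training patches could be cropped either on the block grid or at arbitrary offsets.

```python
import random

JPEG_BLOCK = 8  # assumption: baseline JPEG codes luminance in 8x8 pixel blocks


def is_grid_aligned(top, left, block=JPEG_BLOCK):
    """Return True when a crop's top-left corner lies on a block boundary."""
    return top % block == 0 and left % block == 0


def random_crop_origin(height, width, patch, aligned, rng, block=JPEG_BLOCK):
    """Pick the top-left corner of a patch x patch crop in a height x width image.

    aligned=True snaps the corner to the block grid;
    aligned=False allows any pixel offset, so the crop is usually misaligned.
    """
    max_top, max_left = height - patch, width - patch
    if max_top < 0 or max_left < 0:
        raise ValueError("patch is larger than the image")
    if aligned:
        # Draw a block index, then scale back to pixel coordinates.
        top = block * rng.randint(0, max_top // block)
        left = block * rng.randint(0, max_left // block)
    else:
        top = rng.randint(0, max_top)
        left = rng.randint(0, max_left)
    return top, left


if __name__ == "__main__":
    rng = random.Random(0)
    print(random_crop_origin(256, 256, 64, aligned=True, rng=rng))
    print(random_crop_origin(256, 256, 64, aligned=False, rng=rng))
```

A training set built only with `aligned=True` would never show the network a shifted block grid, which is the kind of dataset choice the abstract says matters for forensic detectors.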

Related papers

Is JPEG AI going to change image forensics? [50.92778618091496]
We investigate the counter-forensic effects of the new JPEG AI standard based on neural image compression. Our results demonstrate a reduction in the performance of leading forensic detectors when analyzing content processed through JPEG AI.
arXiv Detail & Related papers (2024-12-04T12:07:20Z)
Visual Context-Aware Person Fall Detection [52.49277799455569]
We present a segmentation pipeline to semi-automatically separate individuals and objects in images. Background objects such as beds, chairs, or wheelchairs can challenge fall detection systems, leading to false positive alarms. We demonstrate that object-specific contextual transformations during training effectively mitigate this challenge.
arXiv Detail & Related papers (2024-04-11T19:06:36Z)
Human-imperceptible, Machine-recognizable Images [76.01951148048603]
A major conflict is exposed relating to software engineers between better developing AI systems and distancing from the sensitive training data. This paper proposes an efficient privacy-preserving learning paradigm, where images are encrypted to become "human-imperceptible, machine-recognizable". We show that the proposed paradigm can ensure the encrypted images have become human-imperceptible while preserving machine-recognizable information.
arXiv Detail & Related papers (2023-06-06T13:41:37Z)
Pretrained ViTs Yield Versatile Representations For Medical Images [4.443013185089128]
Vision transformers (ViTs) have appeared as a competitive alternative to CNNs. We conduct a series of experiments on several standard 2D medical image benchmark datasets and tasks. Our findings show that, while CNNs perform better if trained from scratch, off-the-shelf vision transformers can perform on par with CNNs when pretrained on ImageNet.
arXiv Detail & Related papers (2023-03-13T11:53:40Z)
Forged Image Detection using SOTA Image Classification Deep Learning Methods for Image Forensics with Error Level Analysis [2.719418335747252]
Image forensics is one of the major areas of computer vision application. Forgery of images is a sub-category of image forensics and can be detected using Error Level Analysis. We perform transfer learning with state-of-the-art image classification models over error level analysis induced CASIA ITDE v.2 dataset.
arXiv Detail & Related papers (2022-11-28T10:10:42Z)
Saccade Mechanisms for Image Classification, Object Detection and Tracking [12.751552698602744]
We examine how the saccade mechanism from biological vision can be used to make deep neural networks more efficient for classification and object detection problems. Our proposed approach is based on the ideas of attention-driven visual processing and saccades, miniature eye movements influenced by attention.
arXiv Detail & Related papers (2022-06-10T13:50:34Z)
Follow My Eye: Using Gaze to Supervise Computer-Aided Diagnosis [54.60796004113496]
We demonstrate that the eye movement of radiologists reading medical images can be a new form of supervision to train the DNN-based computer-aided diagnosis (CAD) system. We record the tracks of the radiologists' gaze when they are reading images. The gaze information is processed and then used to supervise the DNN's attention via an Attention Consistency module.
arXiv Detail & Related papers (2022-04-06T08:31:05Z)
BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks. Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer. Our method uses a dual-objective activation and distance loss, without requiring a generator network nor modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z)
A Transferable Anti-Forensic Attack on Forensic CNNs Using A Generative Adversarial Network [24.032025811564814]
Convolutional neural networks (CNNs) have become widely used in multimedia forensics. Anti-forensic attacks have been developed to fool these CNN-based forensic algorithms. We propose a new anti-forensic attack framework designed to remove forensic traces left by a variety of manipulation operations.
arXiv Detail & Related papers (2021-01-23T19:31:59Z)
D-Unet: A Dual-encoder U-Net for Image Splicing Forgery Detection and Localization [108.8592577019391]
Image splicing forgery detection is a global binary classification task that distinguishes the tampered and non-tampered regions by image fingerprints. We propose a novel network called dual-encoder U-Net (D-Unet) for image splicing forgery detection, which employs an unfixed encoder and a fixed encoder. In an experimental comparison study of D-Unet and state-of-the-art methods, D-Unet outperformed the other methods in image-level and pixel-level detection.
arXiv Detail & Related papers (2020-12-03T10:54:02Z)
CNN-based fast source device identification [30.17213343080699]
We propose a fast and accurate solution using convolutional neural networks (CNNs). Specifically, we propose a 2-channel-based CNN that learns a way of comparing camera fingerprint and image noise at patch level. This makes the approach particularly suitable in scenarios where large databases of images are analyzed, like over social networks.
arXiv Detail & Related papers (2020-01-31T14:01:45Z)

This list is automatically generated from the titles and abstracts of the papers on this site.