Related papers: Are Visual Recognition Models Robust to Image Compression?

Are Visual Recognition Models Robust to Image Compression?

URL: http://arxiv.org/abs/2304.04518v1
Date: Mon, 10 Apr 2023 11:30:11 GMT
Title: Are Visual Recognition Models Robust to Image Compression?
Authors: Jo\~ao Maria Janeiro, Stanislav Frolov, Alaaeldin El-Nouby, Jakob Verbeek
Abstract summary: We analyze the impact of image compression on visual recognition tasks. We consider a wide range of compression levels, ranging from 0.1 to 2 bits-per-pixel (bpp) We find that for all three tasks, the recognition ability is significantly impacted when using strong compression.
Score: 23.280147529096908
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reducing the data footprint of visual content via image compression is essential to reduce storage requirements, but also to reduce the bandwidth and latency requirements for transmission. In particular, the use of compressed images allows for faster transfer of data, and faster response times for visual recognition in edge devices that rely on cloud-based services. In this paper, we first analyze the impact of image compression using traditional codecs, as well as recent state-of-the-art neural compression approaches, on three visual recognition tasks: image classification, object detection, and semantic segmentation. We consider a wide range of compression levels, ranging from 0.1 to 2 bits-per-pixel (bpp). We find that for all three tasks, the recognition ability is significantly impacted when using strong compression. For example, for segmentation mIoU is reduced from 44.5 to 30.5 mIoU when compressing to 0.1 bpp using the best compression model we evaluated. Second, we test to what extent this performance drop can be ascribed to a loss of relevant information in the compressed image, or to a lack of generalization of visual recognition models to images with compression artefacts. We find that to a large extent the performance loss is due to the latter: by finetuning the recognition models on compressed training images, most of the performance loss is recovered. For example, bringing segmentation accuracy back up to 42 mIoU, i.e. recovering 82% of the original drop in accuracy.

Related papers

Embedding Compression Distortion in Video Coding for Machines [67.97469042910855]
Currently, video transmission serves not only the Human Visual System (HVS) for viewing but also machine perception for analysis. We propose a Compression Distortion Embedding (CDRE) framework, which extracts machine-perception-related distortion representation and embeds it into downstream models. Our framework can effectively boost the rate-task performance of existing codecs with minimal overhead in terms of execution time, and number of parameters.
arXiv Detail & Related papers (2025-03-27T13:01:53Z)
The Effect of Lossy Compression on 3D Medical Images Segmentation with Deep Learning [39.97900702763419]
We show that lossy compression up to 20 times have no negative impact on segmentation quality with deep neural networks (DNN) In addition, we demonstrate the ability of DNN models trained on compressed data to predict on uncompressed data and vice versa with no quality deterioration.
arXiv Detail & Related papers (2024-09-25T08:31:37Z)
CompaCT: Fractal-Based Heuristic Pixel Segmentation for Lossless Compression of High-Color DICOM Medical Images [0.0]
Medical images require a high color depth of 12 bits per pixel component for accurate analysis by physicians. Standard-based compression of images via filtering is well-known; however, it remains suboptimal in the medical domain due to non-specialized implementations. This study proposes a medical image compression algorithm, CompaCT, that aims to target spatial features and patterns of pixel concentration for dynamically enhanced data processing.
arXiv Detail & Related papers (2023-08-24T21:43:04Z)
Crowd Counting on Heavily Compressed Images with Curriculum Pre-Training [90.76576712433595]
Applying lossy compression on images processed by deep neural networks can lead to significant accuracy degradation. Inspired by the curriculum learning paradigm, we present a novel training approach called curriculum pre-training (CPT) for crowd counting on compressed images.
arXiv Detail & Related papers (2022-08-15T08:43:21Z)
Analysis of the Effect of Low-Overhead Lossy Image Compression on the Performance of Visual Crowd Counting for Smart City Applications [78.55896581882595]
Lossy image compression techniques can reduce the quality of the images, leading to accuracy degradation. In this paper, we analyze the effect of applying low-overhead lossy image compression methods on the accuracy of visual crowd counting.
arXiv Detail & Related papers (2022-07-20T19:20:03Z)
Estimating the Resize Parameter in End-to-end Learned Image Compression [50.20567320015102]
We describe a search-free resizing framework that can further improve the rate-distortion tradeoff of recent learned image compression models. Our results show that our new resizing parameter estimation framework can provide Bjontegaard-Delta rate (BD-rate) improvement of about 10% against leading perceptual quality engines.
arXiv Detail & Related papers (2022-04-26T01:35:02Z)
Identity Preserving Loss for Learned Image Compression [0.0]
This work proposes an end-to-end image compression framework that learns domain-specific features to achieve higher compression ratios. We present a novel Identity Preserving Reconstruction (IPR) loss function which achieves Bits-Per-Pixel (BPP) values that are 38% and 42% of CRF-23 HEVC compression. We show at-par recognition performance on the LFW dataset with an unseen recognition model while retaining a lower BPP value of 38% of CRF-23 HEVC compression.
arXiv Detail & Related papers (2022-04-22T18:01:01Z)
Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme, which joints channel pruning and tensor decomposition to compress CNN models. We achieve 52.9% FLOPs reduction by removing 48.4% parameters on ResNet-50 with only a Top-1 accuracy drop of 0.56% on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z)
Analyzing and Mitigating JPEG Compression Defects in Deep Learning [69.04777875711646]
We present a unified study of the effects of JPEG compression on a range of common tasks and datasets. We show that there is a significant penalty on common performance metrics for high compression.
arXiv Detail & Related papers (2020-11-17T20:32:57Z)
Distributed Learning and Inference with Compressed Images [40.07509530656681]
This paper focuses on vision-based perception for autonomous driving as a paradigmatic scenario. We propose dataset restoration, based on image restoration with generative adversarial networks (GANs) Our method is agnostic to both the particular image compression method and the downstream task.
arXiv Detail & Related papers (2020-04-22T11:20:53Z)
Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency. Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images. Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
arXiv Detail & Related papers (2020-02-17T07:35:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.