Related papers: Rate-Distortion in Image Coding for Machines

URL: http://arxiv.org/abs/2209.11694v1
Date: Wed, 21 Sep 2022 20:24:14 GMT
Title: Rate-Distortion in Image Coding for Machines
Authors: Alon Harell, Anderson De Andrade, and Ivan V. Bajic
Abstract summary: In many applications, such as surveillance, images are mostly transmitted for automated analysis, and rarely seen by humans. Traditional compression for this scenario has been shown to be inefficient in terms of bit-rate, likely due to the focus on human-based distortion metrics. One way to create the machine side of a codec for humans and machines is to perform feature matching of some intermediate layer in a Deep Neural Network performing the machine task.
Score: 26.32381277880991
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In recent years, there has been a sharp increase in transmission of images to remote servers specifically for the purpose of computer vision. In many applications, such as surveillance, images are mostly transmitted for automated analysis, and rarely seen by humans. Using traditional compression for this scenario has been shown to be inefficient in terms of bit-rate, likely due to the focus on human-based distortion metrics. Thus, it is important to create specific image coding methods for joint use by humans and machines. One way to create the machine side of such a codec is to perform feature matching of some intermediate layer in a Deep Neural Network performing the machine task. In this work, we explore the effects of the layer choice used in training a learnable codec for humans and machines. We prove, using the data processing inequality, that matching features from deeper layers is preferable in the sense of rate-distortion. Next, we confirm our findings empirically by re-training an existing model for scalable human-machine coding. In our experiments we show the trade-off between the human and machine sides of such a scalable model, and discuss the benefit of using deeper layers for training in that regard.
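Note: the data processing inequality mentioned in the abstract can be stated in its standard textbook form as below. This is only a sketch of the general inequality, not the paper's proof. The symbols X (input image), Y_1 (features from an earlier layer) and Y_2 (features from a deeper layer, computed from Y_1) are notation assumed here for illustration.

```latex
% Standard data processing inequality (textbook form), not the paper's theorem.
% Assumed notation: X = input image, Y_1 = earlier-layer features,
% Y_2 = g(Y_1) = deeper-layer features, I(.;.) = mutual information.
X \to Y_1 \to Y_2 \ \text{(Markov chain)}
\quad \Longrightarrow \quad
I(X; Y_2) \le I(X; Y_1)
```

Because deeper features are computed from earlier ones, they cannot carry more information about the input than the earlier features do. The precise rate-distortion statement and its conditions are given in the paper itself.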

Related papers

Guided Diffusion for the Extension of Machine Vision to Human Visual Perception [0.0]
We propose a method for extending machine vision to human visual perception using guided diffusion. Guided diffusion acts as a bridge between machine vision and human perception, enabling transitions between them without any additional overhead.
arXiv Detail & Related papers (2025-03-23T03:04:26Z)
Predicting Satisfied User and Machine Ratio for Compressed Images: A Unified Approach [58.71009078356928]
We create a deep learning-based model to predict Satisfied User Ratio (SUR) and Satisfied Machine Ratio (SMR) of compressed images simultaneously. Experimental results indicate that the proposed model significantly outperforms state-of-the-art SUR and SMR prediction methods.
arXiv Detail & Related papers (2024-12-23T11:09:30Z)
Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning [52.170253590364545]
Gen-SIS is a diffusion-based augmentation technique trained exclusively on unlabeled image data. We show that these 'self-augmentations', i.e. generative augmentations based on the vanilla SSL encoder embeddings, facilitate the training of a stronger SSL encoder.
arXiv Detail & Related papers (2024-12-02T16:20:59Z)
Exploring Compressed Image Representation as a Perceptual Proxy: A Study [1.0878040851638]
We propose an end-to-end learned image compression wherein the analysis transform is jointly trained with an object classification task. This study affirms that the compressed latent representation can predict human perceptual distance judgments with an accuracy comparable to a custom-tailored DNN-based quality metric.
arXiv Detail & Related papers (2024-01-14T04:37:17Z)
Human-imperceptible, Machine-recognizable Images [76.01951148048603]
Software engineers face a major conflict between developing better AI systems and keeping their distance from sensitive training data. This paper proposes an efficient privacy-preserving learning paradigm, where images are encrypted to become "human-imperceptible, machine-recognizable". We show that the proposed paradigm can ensure the encrypted images have become human-imperceptible while preserving machine-recognizable information.
arXiv Detail & Related papers (2023-06-06T13:41:37Z)
Synthetic Data for Object Classification in Industrial Applications [53.180678723280145]
In object classification, capturing a large number of images per object and in different conditions is not always possible. This work explores the creation of artificial images using a game engine to cope with limited data in the training dataset.
arXiv Detail & Related papers (2022-12-09T11:43:04Z)
Traditional Classification Neural Networks are Good Generators: They are Competitive with DDPMs and GANs [104.72108627191041]
We show that conventional neural network classifiers can generate high-quality images comparable to state-of-the-art generative models. We propose a mask-based reconstruction module to make semantic gradients-aware to synthesize plausible images. We show that our method is also applicable to text-to-image generation by regarding image-text foundation models.
arXiv Detail & Related papers (2022-11-27T11:25:35Z)
Preprocessing Enhanced Image Compression for Machine Vision [14.895698385236937]
We propose a preprocessing enhanced image compression method for machine vision tasks. Our framework is built upon the traditional non-differential codecs. Experimental results show our method achieves a better tradeoff between the coding bitrate and the performance of the downstream machine vision tasks by saving about 20% bitrate.
arXiv Detail & Related papers (2022-06-12T03:36:38Z)
A New Image Codec Paradigm for Human and Machine Uses [53.48873918537017]
A new scalable image codec paradigm for both human and machine uses is proposed in this work. The high-level instance segmentation map and the low-level signal features are extracted with neural networks. An image is designed and trained to achieve the general-quality image reconstruction with the 16-bit gray-scale profile and signal features.
arXiv Detail & Related papers (2021-12-19T06:17:38Z)
Image coding for machines: an end-to-end learned approach [23.92748892163087]
In this paper, we propose an image codec for machines which is neural network (NN) based and end-to-end learned. Our results show that our NN-based codec outperforms the state-of-the-art Versatile Video Coding (VVC) standard on the object detection and instance segmentation tasks. To the best of our knowledge, this is the first end-to-end learned machine-targeted image codec.
arXiv Detail & Related papers (2021-08-23T07:54:42Z)
Deep Multilabel CNN for Forensic Footwear Impression Descriptor Identification [0.9786690381850356]
We employ a deep learning approach to classify footwear impression features, known as descriptors, for forensic use cases. We develop and evaluate a technique for feeding downsampled greyscale impressions to a neural network pre-trained on data from a different domain.
arXiv Detail & Related papers (2021-02-09T19:39:28Z)
Swapping Autoencoder for Deep Image Manipulation [94.33114146172606]
We propose the Swapping Autoencoder, a deep model designed specifically for image manipulation. The key idea is to encode an image with two independent components and enforce that any swapped combination maps to a realistic image. Experiments on multiple datasets show that our model produces better results and is substantially more efficient compared to recent generative models.
arXiv Detail & Related papers (2020-07-01T17:59:57Z)
Towards Coding for Human and Machine Vision: A Scalable Image Coding Approach [104.02201472370801]
We come up with a novel image coding framework by leveraging both the compressive and the generative models. By introducing advanced generative models, we train a flexible network to reconstruct images from compact feature representations and the reference pixels. Experimental results demonstrate the superiority of our framework in both human visual quality and facial landmark detection.
arXiv Detail & Related papers (2020-01-09T10:37:17Z)
