CPIPS: Learning to Preserve Perceptual Distances in End-to-End Image
Compression
- URL: http://arxiv.org/abs/2310.00559v1
- Date: Sun, 1 Oct 2023 03:29:21 GMT
- Title: CPIPS: Learning to Preserve Perceptual Distances in End-to-End Image
Compression
- Authors: Chen-Hsiu Huang, Ja-Ling Wu
- Abstract summary: We propose an efficient compressed representation that caters not only to human vision but also to image processing and machine vision tasks.
Our proposed method, Compressed Perceptual Image Patch Similarity (CPIPS), can be derived at minimal cost from a learned neural codec and computed significantly faster than DNN-based perceptual metrics such as LPIPS and DISTS.
- Score: 1.0878040851638
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Lossy image coding standards such as JPEG and MPEG have successfully achieved
high compression rates for human consumption of multimedia data. However, with
the increasing prevalence of IoT devices, drones, and self-driving cars,
machines rather than humans are processing a greater portion of captured visual
content. Consequently, it is crucial to pursue an efficient compressed
representation that caters not only to human vision but also to image
processing and machine vision tasks. Drawing inspiration from the efficient
coding hypothesis in biological systems and the modeling of the sensory cortex
in neural science, we repurpose the compressed latent representation to
prioritize semantic relevance while preserving perceptual distance. Our
proposed method, Compressed Perceptual Image Patch Similarity (CPIPS), can be
derived at a minimal cost from a learned neural codec and computed
significantly faster than DNN-based perceptual metrics such as LPIPS and DISTS.
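The core idea of repurposing a codec's latent representation as an LPIPS-style perceptual distance can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the latent tensors are assumed to come from a learned codec's encoder, and the channel normalization and uniform default weights are illustrative, not the authors' exact formulation.

```python
import numpy as np

def cpips_distance(z_a, z_b, weights=None, eps=1e-10):
    """LPIPS-style distance computed on codec latents of shape (C, H, W).

    Each spatial position is unit-normalized across channels, then a
    weighted squared difference is averaged over space. The per-channel
    weights would be learned in practice; they default to uniform here.
    """
    def unit_normalize(z):
        norm = np.sqrt((z ** 2).sum(axis=0, keepdims=True)) + eps
        return z / norm

    za, zb = unit_normalize(z_a), unit_normalize(z_b)
    if weights is None:
        weights = np.ones(z_a.shape[0]) / z_a.shape[0]
    diff = (za - zb) ** 2                 # (C, H, W)
    per_channel = diff.mean(axis=(1, 2))  # spatial average per channel
    return float((weights * per_channel).sum())
```

Because the latents already exist as a by-product of encoding, such a distance avoids the extra forward pass through a large pre-trained network that LPIPS and DISTS require.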
Related papers
- Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [62.888755394395716]
This study presents an enhanced neural compression method designed for optimal visual fidelity.
We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss.
Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
arXiv Detail & Related papers (2024-01-25T08:11:27Z)
- Transferable Learned Image Compression-Resistant Adversarial Perturbations [69.79762292033553]
Adversarial attacks can readily disrupt the image classification system, revealing the vulnerability of DNN-based recognition tasks.
We introduce a new pipeline that targets image classification models that utilize learned image compressors as pre-processing modules.
arXiv Detail & Related papers (2024-01-06T03:03:28Z)
- VNVC: A Versatile Neural Video Coding Framework for Efficient Human-Machine Vision [59.632286735304156]
It is more efficient to enhance/analyze the coded representations directly without decoding them into pixels.
We propose a versatile neural video coding (VNVC) framework, which targets learning compact representations to support both reconstruction and direct enhancement/analysis.
arXiv Detail & Related papers (2023-06-19T03:04:57Z)
- A Deep Learning-based Compression and Classification Technique for Whole Slide Histopathology Images [0.31498833540989407]
We build an ensemble of neural networks that enables a compressive autoencoder in a supervised fashion to retain a denser and more meaningful representation of the input histology images.
We test the compressed images using transfer learning-based classifiers and show that they provide promising accuracy and classification performance.
arXiv Detail & Related papers (2023-05-11T22:20:05Z)
- Convolutional Neural Network (CNN) to reduce construction loss in JPEG compression caused by Discrete Fourier Transform (DFT) [0.0]
Convolutional Neural Networks (CNN) have received more attention than most other types of deep neural networks.
In this work, an effective image compression method is proposed using autoencoders.
arXiv Detail & Related papers (2022-08-26T12:46:16Z)
- Preprocessing Enhanced Image Compression for Machine Vision [14.895698385236937]
We propose a preprocessing enhanced image compression method for machine vision tasks.
Our framework is built upon the traditional non-differential codecs.
Experimental results show our method achieves a better trade-off between coding bitrate and the performance of the downstream machine vision tasks, saving about 20% bitrate.
arXiv Detail & Related papers (2022-06-12T03:36:38Z)
- Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder [73.48927855855219]
We propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends.
Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics.
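Rate-distortion performance, as reported here, is conventionally measured against the Lagrangian objective that learned codecs minimize. A minimal sketch with hypothetical operating points (the numbers and the choice of MSE as the distortion are illustrative, not taken from the paper):

```python
def rate_distortion_loss(rate_bpp, distortion_mse, lam):
    """Standard rate-distortion Lagrangian: R + lambda * D."""
    return rate_bpp + lam * distortion_mse

# Two hypothetical operating points of a codec: (bits per pixel, MSE).
points = {"low_rate": (0.3, 40.0), "high_rate": (0.8, 10.0)}

def best_point(lam):
    """Pick the operating point with the lowest Lagrangian cost."""
    return min(points, key=lambda k: rate_distortion_loss(*points[k], lam))
```

Sweeping `lam` shifts the preferred operating point: a small `lam` favors the cheap low-rate point, while a large `lam` favors the low-distortion one, which is exactly the trade-off a rate-distortion curve summarizes.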
arXiv Detail & Related papers (2022-01-27T20:20:03Z)
- A New Image Codec Paradigm for Human and Machine Uses [53.48873918537017]
A new scalable image codec paradigm for both human and machine uses is proposed in this work.
The high-level instance segmentation map and the low-level signal features are extracted with neural networks.
The codec is designed and trained to achieve general-quality image reconstruction with the 16-bit gray-scale profile and signal features.
arXiv Detail & Related papers (2021-12-19T06:17:38Z)
- Image coding for machines: an end-to-end learned approach [23.92748892163087]
In this paper, we propose an image codec for machines which is neural network (NN) based and end-to-end learned.
Our results show that our NN-based codec outperforms the state-of-the-art Versatile Video Coding (VVC) standard on the object detection and instance segmentation tasks.
To the best of our knowledge, this is the first end-to-end learned machine-targeted image codec.
arXiv Detail & Related papers (2021-08-23T07:54:42Z)
- Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency.
Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images.
Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
arXiv Detail & Related papers (2020-02-17T07:35:08Z)
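The feature-consistency idea behind Discernible Image Compression can be illustrated with a short sketch. Here `feat_fn` is a hypothetical stand-in for the pre-trained CNN feature extractor, and the weighting `alpha` and MSE form are illustrative assumptions rather than the paper's exact objective:

```python
import numpy as np

def discernible_loss(x, x_hat, feat_fn, alpha=0.1):
    """Appearance consistency (pixel MSE) plus perceptual consistency
    (MSE between features from a fixed, pre-trained extractor)."""
    appearance = float(((x - x_hat) ** 2).mean())
    f, f_hat = feat_fn(x), feat_fn(x_hat)
    perceptual = float(((f - f_hat) ** 2).mean())
    return appearance + alpha * perceptual
```

Minimizing both terms keeps the reconstruction close to the original in pixel space while also preserving the features that downstream recognition and detection models rely on.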