Machine Perceptual Quality: Evaluating the Impact of Severe Lossy
Compression on Audio and Image Models
- URL: http://arxiv.org/abs/2401.07957v1
- Date: Mon, 15 Jan 2024 20:47:24 GMT
- Title: Machine Perceptual Quality: Evaluating the Impact of Severe Lossy
Compression on Audio and Image Models
- Authors: Dan Jacobellis, Daniel Cummings, Neeraja J. Yadwadkar
- Abstract summary: We evaluate how different approaches to lossy compression affect machine perception tasks.
Using generative compression, it is feasible to leverage highly compressed data while incurring a negligible impact on machine perceptual quality.
Using lossy compressed datasets for pre-training can lead to counter-intuitive scenarios where compression improves rather than degrades machine perceptual quality.
- Score: 1.2584276673531931
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the field of neural data compression, the prevailing focus has been on
optimizing algorithms for either classical distortion metrics, such as PSNR or
SSIM, or human perceptual quality. With increasing amounts of data consumed by
machines rather than humans, a new paradigm of machine-oriented
compression has emerged, one that prioritizes the retention of features salient
for machine perception over traditional human-centric criteria, creating several new challenges to the
development, evaluation, and deployment of systems utilizing lossy compression.
In particular, it is unclear how different approaches to lossy compression will
affect the performance of downstream machine perception tasks. To address this
under-explored area, we evaluate various perception models (including image
classification, image segmentation, speech recognition, and music source
separation) under severe
lossy compression. We utilize several popular codecs spanning conventional,
neural, and generative compression architectures. Our results indicate three
key findings: (1) using generative compression, it is feasible to leverage
highly compressed data while incurring a negligible impact on machine
perceptual quality; (2) machine perceptual quality correlates strongly with
deep similarity metrics, indicating a crucial role of these metrics in the
development of machine-oriented codecs; and (3) using lossy compressed
datasets (e.g., ImageNet) for pre-training can lead to counter-intuitive
scenarios where lossy compression increases machine perceptual quality rather
than degrading it. To encourage engagement on this growing area of research,
our code and experiments are available at:
https://github.com/danjacobellis/MPQ.
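A minimal sketch of the kind of evaluation described above, assuming a pretrained torchvision classifier, Pillow's JPEG codec as the lossy compressor, and the `lpips` package as the deep similarity metric; the input file, quality levels, and model choice are illustrative placeholders, not the authors' MPQ setup, which spans more codecs, tasks, and datasets.
```python
# Sketch (not the authors' code): probe how a pretrained classifier's prediction
# changes as JPEG quality drops, and compare against a deep similarity metric.
import io

import lpips
import torch
from PIL import Image
from torchvision import transforms
from torchvision.models import resnet50, ResNet50_Weights

weights = ResNet50_Weights.IMAGENET1K_V2
model = resnet50(weights=weights).eval()
preprocess = weights.transforms()          # resize / crop / normalize for the model
to_tensor = transforms.ToTensor()          # [0,1] tensor for LPIPS
lpips_fn = lpips.LPIPS(net="alex")         # deep similarity metric

def jpeg_roundtrip(img: Image.Image, quality: int) -> Image.Image:
    """Encode and decode with JPEG at the given quality setting."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

original = Image.open("example.jpg").convert("RGB")   # placeholder input image

with torch.no_grad():
    ref_pred = model(preprocess(original).unsqueeze(0)).argmax(1).item()
    for quality in (75, 30, 10, 5, 2):                # increasingly severe compression
        degraded = jpeg_roundtrip(original, quality)
        pred = model(preprocess(degraded).unsqueeze(0)).argmax(1).item()
        # LPIPS expects (N, 3, H, W) tensors scaled to [-1, 1]
        d = lpips_fn(to_tensor(original).unsqueeze(0) * 2 - 1,
                     to_tensor(degraded).unsqueeze(0) * 2 - 1).item()
        print(f"quality={quality:3d}  prediction match={pred == ref_pred}  LPIPS={d:.3f}")
```
Sweeping the quality setting downward traces how task performance degrades with bitrate, which is the basic measurement underlying the paper's findings (1)-(3).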
Related papers
- Arbitrary Ratio Feature Compression via Next Token Prediction [52.10426317889982]
The Arbitrary Ratio Feature Compression (ARFC) framework supports any compression ratio with a single model. ARFC is an auto-regressive model that performs compression via next-token prediction. A MoS module refines the compressed tokens by utilizing multiple compression results, and ERGC is integrated into the training process to preserve semantic and structural relationships during compression.
arXiv Detail & Related papers (2026-02-12T02:38:57Z) - Machines Serve Human: A Novel Variable Human-machine Collaborative Compression Framework [54.49297832630979]
We present the first successful attempt at a collaborative compression method built on machine-vision-oriented compression. A plug-and-play variable bit-rate strategy is also developed for machine vision tasks. We propose to progressively aggregate the semantics from the machine-vision compression, while seamlessly tailoring the diffusion prior to restore high-fidelity details for human vision.
arXiv Detail & Related papers (2025-11-12T02:50:22Z) - GANCompress: GAN-Enhanced Neural Image Compression with Binary Spherical Quantization [0.0]
GANCompress is a novel neural compression framework that combines Binary Spherical Quantization (BSQ) with Generative Adversarial Networks (GANs). We show that GANCompress achieves substantial improvement in compression efficiency, reducing file sizes by up to 100x with minimal visual distortion.
arXiv Detail & Related papers (2025-05-19T00:18:27Z) - Embedding Compression Distortion in Video Coding for Machines [67.97469042910855]
Currently, video transmission serves not only the Human Visual System (HVS) for viewing but also machine perception for analysis.
We propose a Compression Distortion Embedding (CDRE) framework, which extracts machine-perception-related distortion representation and embeds it into downstream models.
Our framework can effectively boost the rate-task performance of existing codecs with minimal overhead in execution time and number of parameters.
arXiv Detail & Related papers (2025-03-27T13:01:53Z) - Unlocking the Potential of Digital Pathology: Novel Baselines for Compression [31.13721473800084]
Lossy compression can introduce color and texture disparities in pathological Whole Slide Images (WSI).
Deep learning models fine-tuned for perceptual quality outperform conventional compression schemes like JPEG-XL or WebP for further compression of WSI.
Our study provides novel insights for the assessment of lossy compression schemes for WSI and encourages a unified evaluation of lossy compression schemes to accelerate the clinical uptake of digital pathology.
arXiv Detail & Related papers (2024-12-17T18:04:33Z) - AlphaZip: Neural Network-Enhanced Lossless Text Compression [0.0]
This paper introduces a lossless text compression approach using a Large Language Model (LLM).
The method involves two key steps: first, prediction using a dense neural network architecture, such as a transformer block; second, compressing the predicted ranks with standard compression algorithms like Adaptive Huffman, LZ77, or Gzip.
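A toy, hedged illustration of this two-step idea (not AlphaZip's actual implementation): an adaptive character-frequency model stands in for the paper's transformer predictor, each symbol is replaced by its rank under that model, and the rank stream is handed to a standard compressor (zlib here). A decoder maintaining the same adaptive model can invert the ranks, so the scheme stays lossless.
```python
# Rank-then-compress sketch: better predictors yield smaller ranks, which the
# backend compressor can encode more cheaply.
import zlib
from collections import Counter

def ranks_from_model(text: str) -> bytes:
    """Replace each byte by its rank in the model's current prediction order."""
    counts = Counter()
    ranks = bytearray()
    for ch in text.encode("latin-1"):
        # symbols sorted by how often the model has seen them (most likely first)
        order = sorted(range(256), key=lambda s: (-counts[s], s))
        ranks.append(order.index(ch))
        counts[ch] += 1          # adaptive update after coding the symbol
    return bytes(ranks)

text = "the quick brown fox jumps over the lazy dog " * 50
rank_stream = ranks_from_model(text)
baseline = len(zlib.compress(text.encode("latin-1"), 9))
with_ranks = len(zlib.compress(rank_stream, 9))
print(f"plain zlib: {baseline} bytes, rank-coded + zlib: {with_ranks} bytes")
```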
arXiv Detail & Related papers (2024-09-23T14:21:06Z) - Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [58.618625678054826]
This study presents an enhanced neural compression method designed for optimal visual fidelity.
We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss.
Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
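A compact sketch of how such a weighted ensemble loss might be assembled, assuming PyTorch; only the Charbonnier term is implemented, while the perceptual, style, and adversarial terms are hypothetical callables and the weights are illustrative, not values from the paper.
```python
import torch

def charbonnier(x: torch.Tensor, y: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    """Smooth L1-like distortion term: sqrt((x - y)^2 + eps^2), averaged."""
    return torch.sqrt((x - y) ** 2 + eps ** 2).mean()

def ensemble_loss(recon, target, perceptual_fn, style_fn, adv_fn,
                  w=(1.0, 0.1, 0.05, 0.01)):
    """Weighted sum of distortion, perceptual, style, and adversarial terms."""
    return (w[0] * charbonnier(recon, target)
            + w[1] * perceptual_fn(recon, target)   # e.g. VGG-feature distance
            + w[2] * style_fn(recon, target)        # e.g. Gram-matrix distance
            + w[3] * adv_fn(recon))                 # discriminator-based term
```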
arXiv Detail & Related papers (2024-01-25T08:11:27Z) - Are Visual Recognition Models Robust to Image Compression? [23.280147529096908]
We analyze the impact of image compression on visual recognition tasks.
We consider a wide range of compression levels, from 0.1 to 2 bits per pixel (bpp).
We find that for all three tasks, the recognition ability is significantly impacted when using strong compression.
arXiv Detail & Related papers (2023-04-10T11:30:11Z) - Cross Modal Compression: Towards Human-comprehensible Semantic
Compression [73.89616626853913]
Cross modal compression is a semantic compression framework for visual data.
We show that our proposed CMC can achieve encouraging reconstruction results with an ultra-high compression ratio.
arXiv Detail & Related papers (2022-09-06T15:31:11Z) - Estimating the Resize Parameter in End-to-end Learned Image Compression [50.20567320015102]
We describe a search-free resizing framework that can further improve the rate-distortion tradeoff of recent learned image compression models.
Our results show that our new resizing parameter estimation framework can provide Bjontegaard-Delta rate (BD-rate) improvement of about 10% against leading perceptual quality engines.
arXiv Detail & Related papers (2022-04-26T01:35:02Z) - Implicit Neural Representations for Image Compression [103.78615661013623]
Implicit Neural Representations (INRs) have gained attention as a novel and effective representation for various data types.
We propose the first comprehensive compression pipeline based on INRs including quantization, quantization-aware retraining and entropy coding.
We find that our approach to source compression with INRs vastly outperforms similar prior work.
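An illustrative sketch of the basic INR idea referenced above (not the paper's full pipeline): overfit a small MLP that maps pixel coordinates to colors, so the network weights become the image's representation. Quantization, quantization-aware retraining, and entropy coding are omitted, and the target image here is synthetic.
```python
import torch
from torch import nn

H = W = 64
ys, xs = torch.meshgrid(torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)          # (H*W, 2) input coordinates
image = torch.stack([xs, ys, xs * ys], dim=-1).reshape(-1, 3)  # synthetic RGB target

# small coordinate MLP: its weights are the (uncompressed) image representation
inr = nn.Sequential(nn.Linear(2, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 3))
opt = torch.optim.Adam(inr.parameters(), lr=1e-3)

for step in range(2000):
    opt.zero_grad()
    loss = ((inr(coords) - image) ** 2).mean()
    loss.backward()
    opt.step()

n_params = sum(p.numel() for p in inr.parameters())
print(f"reconstruction MSE: {loss.item():.5f}, parameters stored: {n_params}")
```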
arXiv Detail & Related papers (2021-12-08T13:02:53Z) - Towards Compact CNNs via Collaborative Compression [166.86915086497433]
We propose a Collaborative Compression scheme, which jointly applies channel pruning and tensor decomposition to compress CNN models.
We achieve a 52.9% FLOPs reduction by removing 48.4% of the parameters of ResNet-50, with only a 0.56% Top-1 accuracy drop on ImageNet 2012.
arXiv Detail & Related papers (2021-05-24T12:07:38Z) - Analyzing and Mitigating JPEG Compression Defects in Deep Learning [69.04777875711646]
We present a unified study of the effects of JPEG compression on a range of common tasks and datasets.
We show that there is a significant penalty on common performance metrics for high compression.
arXiv Detail & Related papers (2020-11-17T20:32:57Z) - Improving Inference for Neural Image Compression [31.999462074510305]
State-of-the-art methods build on hierarchical variational autoencoders to predict a compressible latent representation of each data point.
We identify three approximation gaps which limit performance in the conventional approach.
We propose remedies for each of these three limitations based on ideas related to iterative inference.
arXiv Detail & Related papers (2020-06-07T19:26:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.