Predicting Satisfied User and Machine Ratio for Compressed Images: A Unified Approach
- URL: http://arxiv.org/abs/2412.17477v1
- Date: Mon, 23 Dec 2024 11:09:30 GMT
- Title: Predicting Satisfied User and Machine Ratio for Compressed Images: A Unified Approach
- Authors: Qi Zhang, Shanshe Wang, Xinfeng Zhang, Siwei Ma, Jingshan Pan, Wen Gao
- Abstract summary: We create a deep learning-based model to predict Satisfied User Ratio (SUR) and Satisfied Machine Ratio (SMR) of compressed images simultaneously.
Experimental results indicate that the proposed model significantly outperforms state-of-the-art SUR and SMR prediction methods.
- Score: 58.71009078356928
- Abstract: Nowadays, high-quality images are pursued by both humans for better viewing experience and by machines for more accurate visual analysis. However, images are usually compressed before being consumed, decreasing their quality. It is meaningful to predict the perceptual quality of compressed images for both humans and machines, which guides the optimization for compression. In this paper, we propose a unified approach to address this. Specifically, we create a deep learning-based model to predict Satisfied User Ratio (SUR) and Satisfied Machine Ratio (SMR) of compressed images simultaneously. We first pre-train a feature extractor network on a large-scale SMR-annotated dataset with human perception-related quality labels generated by diverse image quality models, which simulates the acquisition of SUR labels. Then, we propose an MLP-Mixer-based network to predict SUR and SMR by leveraging and fusing the extracted multi-layer features. We introduce a Difference Feature Residual Learning (DFRL) module to learn more discriminative difference features. We further use a Multi-Head Attention Aggregation and Pooling (MHAAP) layer to aggregate difference features and reduce their redundancy. Experimental results indicate that the proposed model significantly outperforms state-of-the-art SUR and SMR prediction methods. Moreover, our joint learning scheme of human and machine perceptual quality prediction tasks is effective at improving the performance of both.
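To make the two target quantities concrete, here is a minimal sketch of how a satisfied ratio could be computed empirically. It assumes, for illustration only, that an observer (human or machine) counts as "satisfied" when its perceived difference between the original and the compressed image stays within a threshold, and that a classification machine's difference is 0 if its predicted label is unchanged and 1 otherwise. The function names `satisfied_ratio` and `machine_differences` are hypothetical and do not come from the paper, whose model predicts these ratios with a network rather than counting them.

```python
from typing import Hashable, List, Sequence


def satisfied_ratio(differences: Sequence[float], threshold: float) -> float:
    """Fraction of observers whose perceived difference is within the threshold.

    Assumption: an observer is "satisfied" when difference <= threshold.
    """
    if not differences:
        raise ValueError("need at least one observer")
    satisfied = sum(1 for d in differences if d <= threshold)
    return satisfied / len(differences)


def machine_differences(
    original_labels: Sequence[Hashable],
    compressed_labels: Sequence[Hashable],
) -> List[float]:
    """Per-machine difference: 0.0 if the prediction is unchanged, else 1.0.

    Assumption: each position holds one machine's label for the original
    image and for the compressed image, respectively.
    """
    if len(original_labels) != len(compressed_labels):
        raise ValueError("label sequences must have the same length")
    return [
        0.0 if o == c else 1.0
        for o, c in zip(original_labels, compressed_labels)
    ]


# Four hypothetical classifiers; one changes its answer after compression.
orig = ["cat", "cat", "dog", "cat"]
comp = ["cat", "dog", "dog", "cat"]
smr = satisfied_ratio(machine_differences(orig, comp), threshold=0.0)
print(smr)  # 0.75
```

The same `satisfied_ratio` helper could be applied to per-subject human difference scores to obtain an SUR-style value, again under the threshold assumption stated above.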
Related papers
- CALLIC: Content Adaptive Learning for Lossless Image Compression [64.47244912937204]
CALLIC sets a new state-of-the-art (SOTA) for learned lossless image compression.
We propose a content-aware autoregressive self-attention mechanism by leveraging convolutional gating operations.
During encoding, we decompose pre-trained layers, including depth-wise convolutions, using low-rank matrices and then adapt the incremental weights on the test image by Rate-guided Progressive Fine-Tuning (RPFT).
RPFT fine-tunes with gradually increasing patches that are sorted in descending order by estimated entropy, optimizing learning process and reducing adaptation time.
arXiv Detail & Related papers (2024-12-23T10:41:18Z) - Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion [13.196774986841469]
We show that by focusing on modeling visual perception rather than the data distribution, we can achieve a good trade-off between visual quality and bit rate.
We do this by optimizing C3, an overfitted image codec, for Wasserstein Distortion (WD) and evaluating the image reconstructions with a human rater study.
arXiv Detail & Related papers (2024-11-30T15:05:01Z) - GAN-based Image Compression with Improved RDO Process [20.00340507091567]
We present a novel GAN-based image compression approach with improved rate-distortion optimization process.
To achieve this, we utilize the DISTS and MS-SSIM metrics to measure perceptual degeneration in color, texture, and structure.
The proposed method outperforms the existing GAN-based methods and the state-of-the-art hybrid codec (i.e., VVC).
arXiv Detail & Related papers (2023-06-18T03:21:11Z) - Deep Optimal Transport: A Practical Algorithm for Photo-realistic Image Restoration [31.58365182858562]
We propose an image restoration algorithm that can control the perceptual quality and/or the mean square error (MSE) of any pre-trained model.
Given about a dozen images restored by the model, it can significantly improve the perceptual quality and/or the MSE of the model for newly restored images without further training.
arXiv Detail & Related papers (2023-06-04T12:21:53Z) - Machine Perception-Driven Image Compression: A Layered Generative Approach [32.23554195427311]
A layered generative image compression model is proposed to achieve high reconstruction quality oriented to human vision.
A task-agnostic learning-based compression model is proposed, which effectively supports various compressed domain-based analytical tasks.
A joint optimization schedule is adopted to find the best balance point among compression ratio, reconstructed image quality, and downstream perception performance.
arXiv Detail & Related papers (2023-04-14T02:12:38Z) - Perceptual Video Coding for Machines via Satisfied Machine Ratio Modeling [66.56355316611598]
Satisfied Machine Ratio (SMR) is a metric that evaluates the perceptual quality of compressed images and videos for machines.
SMR enables perceptual coding for machines and propels Video Coding for Machines from specificity to generality.
arXiv Detail & Related papers (2022-11-13T03:16:36Z) - Estimating the Resize Parameter in End-to-end Learned Image Compression [50.20567320015102]
We describe a search-free resizing framework that can further improve the rate-distortion tradeoff of recent learned image compression models.
Our results show that our new resizing parameter estimation framework can provide Bjontegaard-Delta rate (BD-rate) improvement of about 10% against leading perceptual quality engines.
arXiv Detail & Related papers (2022-04-26T01:35:02Z) - Perceptually Optimizing Deep Image Compression [53.705543593594285]
Mean squared error (MSE) and $\ell_p$ norms have largely dominated the measurement of loss in neural networks.
We propose a different proxy approach to optimize image analysis networks against quantitative perceptual models.
arXiv Detail & Related papers (2020-07-03T14:33:28Z) - Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency.
Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images.
Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
arXiv Detail & Related papers (2020-02-17T07:35:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.