Bridging the gap between image coding for machines and humans
- URL: http://arxiv.org/abs/2401.10732v1
- Date: Fri, 19 Jan 2024 14:49:56 GMT
- Title: Bridging the gap between image coding for machines and humans
- Authors: Nam Le, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Hamed
Rezazadegan Tavakoli, Emre Aksu, Miska M. Hannuksela, Esa Rahtu
- Abstract summary: In many use cases, such as surveillance, it is important that the visual quality is not drastically deteriorated by the compression process.
Recent works on using neural network (NN) based ICM codecs have shown significant coding gains against traditional methods.
We propose an effective decoder finetuning scheme based on adversarial training to significantly enhance the visual quality of ICM.
- Score: 20.017766644567036
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image coding for machines (ICM) aims at reducing the bitrate required to
represent an image while minimizing the drop in machine vision analysis
accuracy. In many use cases, such as surveillance, it is also important that
the visual quality is not drastically deteriorated by the compression process.
Recent works on using neural network (NN) based ICM codecs have shown
significant coding gains against traditional methods; however, the decompressed
images, especially at low bitrates, often contain checkerboard artifacts. We
propose an effective decoder finetuning scheme based on adversarial training to
significantly enhance the visual quality of ICM codecs, while preserving the
machine analysis accuracy, without adding extra bitcost or parameters at the
inference phase. The results show complete removal of the checkerboard
artifacts at the negligible cost of -1.6% relative change in task performance
score. In the cases where some amount of artifacts is tolerable, such as when
machine consumption is the primary target, this technique can enhance both
pixel-fidelity and feature-fidelity scores without losing task performance.
Related papers
- Rate-Distortion-Cognition Controllable Versatile Neural Image Compression [47.72668401825835]
We propose a rate-distortion-cognition controllable versatile image compression method.
Our method yields satisfactory ICM performance and flexible Rate-Distortion-Cognition controlling.
arXiv Detail & Related papers (2024-07-16T13:17:51Z) - Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [58.618625678054826]
This study presents an enhanced neural compression method designed for optimal visual fidelity.
We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss.
Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
arXiv Detail & Related papers (2024-01-25T08:11:27Z) - You Can Mask More For Extremely Low-Bitrate Image Compression [80.7692466922499]
Learned image compression (LIC) methods have experienced significant progress during recent years.
LIC methods fail to explicitly explore the image structure and texture components crucial for image compression.
We present DA-Mask that samples visible patches based on the structure and texture of original images.
We propose a simple yet effective masked compression model (MCM), the first framework that unifies LIC and masked image modeling (MIM) end-to-end for extremely low-bitrate compression.
arXiv Detail & Related papers (2023-06-27T15:36:22Z) - Analysis of the Effect of Low-Overhead Lossy Image Compression on the
Performance of Visual Crowd Counting for Smart City Applications [78.55896581882595]
Lossy image compression techniques can reduce the quality of the images, leading to accuracy degradation.
In this paper, we analyze the effect of applying low-overhead lossy image compression methods on the accuracy of visual crowd counting.
arXiv Detail & Related papers (2022-07-20T19:20:03Z) - Preprocessing Enhanced Image Compression for Machine Vision [14.895698385236937]
We propose a preprocessing enhanced image compression method for machine vision tasks.
Our framework is built upon the traditional non-differential codecs.
Experimental results show our method achieves a better tradeoff between the coding bitrate and the performance of the downstream machine vision tasks by saving about 20% bitrate.
arXiv Detail & Related papers (2022-06-12T03:36:38Z) - Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural networks, which can be used to solve various tasks.
We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation.
We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion MNIST.
arXiv Detail & Related papers (2022-02-09T18:48:02Z) - A New Image Codec Paradigm for Human and Machine Uses [53.48873918537017]
A new scalable image codec paradigm for both human and machine uses is proposed in this work.
The high-level instance segmentation map and the low-level signal features are extracted with neural networks.
An image codec is designed and trained to achieve the general-quality image reconstruction with the 16-bit gray-scale profile and signal features.
arXiv Detail & Related papers (2021-12-19T06:17:38Z) - Image coding for machines: an end-to-end learned approach [23.92748892163087]
In this paper, we propose an image codec for machines which is neural network (NN) based and end-to-end learned.
Our results show that our NN-based codec outperforms the state-of-the-art Versatile Video Coding (VVC) standard on the object detection and instance segmentation tasks.
To the best of our knowledge, this is the first end-to-end learned machine-targeted image codec.
arXiv Detail & Related papers (2021-08-23T07:54:42Z) - End-to-end optimized image compression for machines, a study [3.0448872422956437]
An increasing share of image and video content is analyzed by machines rather than viewed by humans.
Conventional coding tools are challenging to specialize for machine tasks as they were originally designed for human perception.
Neural network based codecs can be jointly trained end-to-end with any convolutional neural network (CNN)-based task model.
arXiv Detail & Related papers (2020-11-10T20:10:43Z)