Scalable Image Coding for Humans and Machines Using Feature Fusion Network
- URL: http://arxiv.org/abs/2405.09152v5
- Date: Mon, 17 Jun 2024 01:55:20 GMT
- Title: Scalable Image Coding for Humans and Machines Using Feature Fusion Network
- Authors: Takahiro Shindo, Taiju Watanabe, Yui Tatsumi, Hiroshi Watanabe
- Abstract summary: We propose a learning-based scalable image coding method for humans and machines that is compatible with numerous image recognition models.
Our approach confirms that the feature fusion network efficiently combines image compression models while reducing the number of parameters.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As image recognition models become more prevalent, scalable coding methods for machines and humans become more important. Applications of image recognition models include traffic monitoring and farm management. In these use cases, the scalable coding method proves effective because the tasks require occasional image checking by humans. Existing image compression methods for humans and machines meet these requirements to some extent. However, these compression methods are effective only for specific image recognition models. We propose a learning-based scalable image coding method for humans and machines that is compatible with numerous image recognition models. We combine an image compression model for machines with a second compression model that provides additional information to facilitate image decoding for humans. The features in these compression models are fused using a feature fusion network to achieve efficient image compression. Our method's additional-information compression model is adjusted to reduce the number of parameters by enabling combinations of features of different sizes in the feature fusion network. Our approach confirms that the feature fusion network efficiently combines image compression models while reducing the number of parameters. Furthermore, we demonstrate the effectiveness of the proposed scalable coding method by evaluating the image compression performance in terms of decoded image quality and bitrate.
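The abstract does not specify the layers of the feature fusion network, so the following is only a rough illustration of what "combining features of different sizes" can look like in general. It uses a generic pattern (nearest-neighbour upsampling, channel concatenation, then a 1x1 convolution) that is an assumption and not the authors' architecture. The function names `upsample_nearest` and `fuse_features`, the tensor shapes, and the weight matrix are all hypothetical.

```python
import numpy as np


def upsample_nearest(x: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upsampling of a (C, H, W) feature map by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)


def fuse_features(f_machine: np.ndarray, f_extra: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Fuse two feature maps of different spatial sizes (illustrative only).

    f_machine: (C1, H, W) features from a hypothetical machine-oriented codec.
    f_extra:   (C2, H/k, W/k) smaller features carrying additional information.
    weight:    (C_out, C1 + C2) matrix acting as a 1x1 convolution.
    """
    h, w = f_machine.shape[1:]
    h_small, w_small = f_extra.shape[1:]
    # Assumption: the larger map is an exact integer multiple of the smaller one.
    assert h % h_small == 0 and w % w_small == 0 and h // h_small == w // w_small
    factor = h // h_small

    f_extra_up = upsample_nearest(f_extra, factor)              # (C2, H, W)
    stacked = np.concatenate([f_machine, f_extra_up], axis=0)   # (C1 + C2, H, W)
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    return np.einsum("oc,chw->ohw", weight, stacked)


if __name__ == "__main__":
    f_machine = np.ones((4, 8, 8))
    f_extra = np.ones((2, 4, 4))
    weight = np.ones((3, 6))
    fused = fuse_features(f_machine, f_extra, weight)
    print(fused.shape)
```

Keeping the additional-information features at a smaller spatial size means the model that produces them handles fewer values per image, which is one plausible reading of how allowing mixed feature sizes could help reduce parameters; the source gives no further detail.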
Related papers
- Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression [90.59962443790593]
In this paper, we present a variable-rate image compression model based on an invertible transform to overcome limitations.
Specifically, we design a lightweight multi-scale invertible neural network, which maps the input image into multi-scale latent representations.
Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared to existing variable-rate methods.
arXiv Detail & Related papers (2025-03-27T09:08:39Z)
- Guided Diffusion for the Extension of Machine Vision to Human Visual Perception [0.0]
We propose a method for extending machine vision to human visual perception using guided diffusion.
Guided diffusion acts as a bridge between machine vision and human perception, enabling transitions between them without any additional overhead.
arXiv Detail & Related papers (2025-03-23T03:04:26Z)
- MambaIC: State Space Models for High-Performance Learned Image Compression [53.991726013454695]
A high-performance image compression algorithm is crucial for real-time information transmission across numerous fields.
Inspired by the effectiveness of state space models (SSMs) in capturing long-range dependencies, we leverage SSMs to address computational inefficiency in existing methods.
We propose an enhanced image compression approach through refined context modeling, which we term MambaIC.
arXiv Detail & Related papers (2025-03-16T11:32:34Z)
- Compact Latent Representation for Image Compression (CLRIC) [16.428925911432344]
Current image compression models often require separate models for each quality level, making them resource-intensive in terms of both training and storage.
We propose an innovative approach that utilizes latent variables from pre-existing trained models for perceptual image compression.
Our method achieves comparable perceptual quality to state-of-the-art learned image compression models while being both model-agnostic and resolution-agnostic.
arXiv Detail & Related papers (2025-02-20T13:20:56Z)
- Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach [44.03561901593423]
This paper introduces a content-adaptive diffusion model for scalable image compression.
The proposed method encodes fine textures through a diffusion process, enhancing perceptual quality.
Experiments demonstrate the effectiveness of the proposed framework in both image reconstruction and downstream machine vision tasks.
arXiv Detail & Related papers (2024-10-08T15:48:34Z)
- Refining Coded Image in Human Vision Layer Using CNN-Based Post-Processing [0.0]
We propose a method to enhance the quality of decoded images for humans by integrating post-processing into a scalable coding scheme.
Experimental results show that the post-processing improves compression performance.
The effectiveness of the proposed method is validated through comparisons with traditional methods.
arXiv Detail & Related papers (2024-05-20T09:19:01Z)
- A Training-Free Defense Framework for Robust Learned Image Compression [48.41990144764295]
We study the robustness of learned image compression models against adversarial attacks.
We present a training-free defense technique based on simple image transform functions.
arXiv Detail & Related papers (2024-01-22T12:50:21Z)
- Transferable Learned Image Compression-Resistant Adversarial Perturbations [66.46470251521947]
Adversarial attacks can readily disrupt the image classification system, revealing the vulnerability of DNN-based recognition tasks.
We introduce a new pipeline that targets image classification models that utilize learned image compressors as pre-processing modules.
arXiv Detail & Related papers (2024-01-06T03:03:28Z)
- Universal Deep Image Compression via Content-Adaptive Optimization with Adapters [43.291753358414255]
Deep image compression performs better than conventional codecs, such as JPEG, on natural images.
Deep image compression is learning-based and encounters a problem: the compression performance deteriorates significantly for out-of-domain images.
This study aims to compress images belonging to arbitrary domains, such as natural images, line drawings, and comics.
arXiv Detail & Related papers (2022-11-02T07:01:30Z)
- Estimating the Resize Parameter in End-to-end Learned Image Compression [50.20567320015102]
We describe a search-free resizing framework that can further improve the rate-distortion tradeoff of recent learned image compression models.
Our results show that our new resizing parameter estimation framework can provide a Bjontegaard-Delta rate (BD-rate) improvement of about 10% against leading perceptual quality engines.
arXiv Detail & Related papers (2022-04-26T01:35:02Z)
- Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform [58.60004238261117]
We propose a versatile deep image compression network based on Spatial Feature Transform (SFT, arXiv:1804.02815).
Our model covers a wide range of compression rates using a single model, which is controlled by arbitrary pixel-wise quality maps.
The proposed framework allows us to perform task-aware image compressions for various tasks.
arXiv Detail & Related papers (2021-08-21T17:30:06Z)
- Quantization Guided JPEG Artifact Correction [69.04777875711646]
We develop a novel architecture for artifact correction using the JPEG file's quantization matrix.
This allows our single model to achieve state-of-the-art performance over models trained for specific quality settings.
arXiv Detail & Related papers (2020-04-17T00:10:08Z)
- Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency.
Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images.
Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
arXiv Detail & Related papers (2020-02-17T07:35:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.