DNN-Compressed Domain Visual Recognition with Feature Adaptation
- URL: http://arxiv.org/abs/2305.08000v2
- Date: Wed, 26 Jul 2023 09:43:15 GMT
- Title: DNN-Compressed Domain Visual Recognition with Feature Adaptation
- Authors: Yingpeng Deng and Lina J. Karam
- Abstract summary: Learning-based image compression has been shown to achieve performance competitive with state-of-the-art transform-based codecs.
This motivated the development of new learning-based visual compression standards such as JPEG-AI.
This paper is concerned with learning-based compression schemes whose compressed-domain representations can be utilized to perform visual processing and computer vision tasks directly in the compressed domain.
- Score: 19.79803434998116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning-based image compression has been shown to achieve
performance competitive with state-of-the-art transform-based codecs. This motivated the
development of new learning-based visual compression standards such as JPEG-AI.
Of particular interest to these emerging standards is the development of
learning-based image compression systems targeting both humans and machines.
This paper is concerned with learning-based compression schemes whose
compressed-domain representations can be utilized to perform visual processing
and computer vision tasks directly in the compressed domain. In our work, we
adopt a learning-based compressed-domain classification framework for
performing visual recognition using the compressed-domain latent representation
at varying bit-rates. We propose a novel feature adaptation module integrating
a lightweight attention model to adaptively emphasize and enhance the key
features within the extracted channel-wise information. Also, we design an
adaptation training strategy to utilize the pretrained pixel-domain weights.
For comparison, in addition to the performance results that are obtained using
our proposed latent-based compressed-domain method, we also present performance
results using compressed but fully decoded images in the pixel domain as well
as original uncompressed images. The obtained performance results show that our
proposed compressed-domain classification model clearly outperforms the
existing compressed-domain classification models, and that it also yields
similar accuracy at a much higher computational efficiency
compared to the pixel-domain models that are trained using fully decoded
images.
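As a rough illustration of what "adaptively emphasizing channel-wise information" in a compressed-domain latent can look like, the sketch below applies a squeeze-and-excitation style channel gate to a latent tensor. This is not the paper's feature adaptation module, whose design the abstract does not specify; the function name `channel_attention`, the reduction ratio `r`, the tensor shapes, and the random weights are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(latent, w1, b1, w2, b2):
    """Reweight the channels of a latent tensor (hypothetical sketch).

    latent: array of shape (C, H, W), e.g. a decoded-free compressed-domain latent.
    w1, b1: weights/bias of a bottleneck layer mapping C -> C // r.
    w2, b2: weights/bias of an expansion layer mapping C // r -> C.
    Returns the reweighted latent and the per-channel gates in (0, 1).
    """
    squeeze = latent.mean(axis=(1, 2))            # global average pool -> (C,)
    hidden = np.maximum(0.0, w1 @ squeeze + b1)   # ReLU bottleneck -> (C // r,)
    gates = sigmoid(w2 @ hidden + b2)             # per-channel gate -> (C,)
    return latent * gates[:, None, None], gates


# Toy usage with assumed sizes; real latents and trained weights would differ.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
latent = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.standard_normal((C, C // r)) * 0.1
b2 = np.zeros(C)

adapted, gates = channel_attention(latent, w1, b1, w2, b2)
print(adapted.shape, gates.round(3))
```

The gate adds only two small matrix-vector products per input, which is why this family of attention blocks is often described as lightweight; the reweighted latent would then be passed to the classification backbone.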
Related papers
- Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach [44.03561901593423]
This paper introduces a content-adaptive diffusion model for scalable image compression.
The proposed method encodes fine textures through a diffusion process, enhancing perceptual quality.
Experiments demonstrate the effectiveness of the proposed framework in both image reconstruction and downstream machine vision tasks.
arXiv Detail & Related papers (2024-10-08T15:48:34Z) - Transferable Learned Image Compression-Resistant Adversarial Perturbations [66.46470251521947]
Adversarial attacks can readily disrupt the image classification system, revealing the vulnerability of DNN-based recognition tasks.
We introduce a new pipeline that targets image classification models that utilize learned image compressors as pre-processing modules.
arXiv Detail & Related papers (2024-01-06T03:03:28Z) - Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z) - The Devil Is in the Details: Window-based Attention for Image Compression [58.1577742463617]
Most existing learned image compression models are based on Convolutional Neural Networks (CNNs).
In this paper, we study the effects of multiple kinds of attention mechanisms for local features learning, then introduce a more straightforward yet effective window-based local attention block.
The proposed window-based attention is flexible and can work as a plug-and-play component to enhance CNN and Transformer models.
arXiv Detail & Related papers (2022-03-16T07:55:49Z) - Learned Image Compression for Machine Perception [17.40776913809306]
We develop a framework that produces a compression format suitable for both human perception and machine perception.
We show that representations can be learned that simultaneously optimize for compression and performance on core vision tasks.
arXiv Detail & Related papers (2021-11-03T14:39:09Z) - Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform [58.60004238261117]
We propose a versatile deep image compression network based on Spatial Feature Transform (SFT, arXiv:1804.02815).
Our model covers a wide range of compression rates using a single model, which is controlled by arbitrary pixel-wise quality maps.
The proposed framework allows us to perform task-aware image compressions for various tasks.
arXiv Detail & Related papers (2021-08-21T17:30:06Z) - Learning-based Compression for Material and Texture Recognition [23.668803886355683]
This paper is concerned with learning-based compression schemes whose compressed-domain representations can be utilized to perform visual processing and computer vision tasks directly in the compressed domain.
We adopt the learning-based JPEG-AI framework for performing material and texture recognition using the compressed-domain latent representation at varying bit-rates.
It is also shown that the compressed-domain classification can yield a competitive performance in terms of Top-1 and Top-5 accuracy while using a smaller reduced-complexity classification model.
arXiv Detail & Related papers (2021-04-16T23:16:26Z) - Discernible Image Compression [124.08063151879173]
This paper aims to produce compressed images by pursuing both appearance and perceptual consistency.
Based on the encoder-decoder framework, we propose using a pre-trained CNN to extract features of the original and compressed images.
Experiments on benchmarks demonstrate that images compressed by using the proposed method can also be well recognized by subsequent visual recognition and detection models.
arXiv Detail & Related papers (2020-02-17T07:35:08Z) - Learning End-to-End Lossy Image Compression: A Benchmark [90.35363142246806]
We first conduct a comprehensive literature survey of learned image compression methods.
We describe milestones in cutting-edge learned image-compression methods, review a broad range of existing works, and provide insights into their historical development routes.
By introducing a coarse-to-fine hyperprior model for entropy estimation and signal reconstruction, we achieve improved rate-distortion performance.
arXiv Detail & Related papers (2020-02-10T13:13:43Z)