Learned Distributed Image Compression with Multi-Scale Patch Matching in
Feature Domain
- URL: http://arxiv.org/abs/2209.02514v1
- Date: Tue, 6 Sep 2022 14:06:46 GMT
- Title: Learned Distributed Image Compression with Multi-Scale Patch Matching in
Feature Domain
- Authors: Yujun Huang, Bin Chen, Shiyu Qin, Jiawei Li, Yaowei Wang, Tao Dai,
Shu-Tao Xia
- Abstract summary: We propose Multi-Scale Feature Domain Patch Matching (MSFDPM) to fully utilize side information at the decoder of the distributed image compression model.
MSFDPM consists of a side information feature extractor, a multi-scale feature domain patch matching module, and a multi-scale feature fusion network.
Our patch matching in a multi-scale feature domain further improves the compression rate by about 20% compared with patch matching in the image domain.
- Score: 62.88240343479615
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Beyond achieving higher compression efficiency over classical image
compression codecs, deep image compression is expected to be improved with
additional side information, e.g., another image from a different perspective
of the same scene. To better utilize the side information under the distributed
compression scenario, the existing method (Ayzik and Avidan 2020) only
implements patch matching in the image domain to solve the parallax problem
caused by the difference in viewpoints. However, patch matching in the image
domain is not robust to the variations in scale, shape, and illumination caused
by the different viewing angles, and cannot make full use of the rich texture
information of the side information image. To resolve this issue, we propose
Multi-Scale Feature Domain Patch Matching (MSFDPM) to fully utilize the side
information at the decoder of the distributed image compression model.
Specifically, MSFDPM consists of a side information feature extractor, a
multi-scale feature domain patch matching module, and a multi-scale feature
fusion network. Furthermore, we reuse inter-patch correlation from the shallow
layer to accelerate the patch matching of the deep layer. Finally, we find that
our patch matching in a multi-scale feature domain further improves the
compression rate by about 20% compared with the patch matching method in the
image domain
(Ayzik and Avidan 2020).
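As a rough illustration of the core idea, the sketch below matches non-overlapping patches of a decoder feature map against patches of a side-information feature map at one scale and folds the matched side-information patches back into an aligned feature map. This is a minimal sketch assuming a PyTorch implementation; function and variable names such as `patch_match_features` are illustrative, not the authors' code.

```python
# Minimal sketch (assumed PyTorch) of feature-domain patch matching at one scale.
# For every non-overlapping patch of the decoder features, the most similar
# side-information patch (cosine similarity) is gathered and folded back into
# an aligned feature map.
import torch
import torch.nn.functional as F

def patch_match_features(dec_feat, side_feat, patch=4):
    """dec_feat, side_feat: (1, C, H, W) with H, W divisible by `patch`."""
    _, _, H, W = dec_feat.shape
    # Unfold into flattened patches: (num_patches, C * patch * patch)
    q = F.unfold(dec_feat, kernel_size=patch, stride=patch).squeeze(0).t()
    k = F.unfold(side_feat, kernel_size=patch, stride=patch).squeeze(0).t()
    sim = F.normalize(q, dim=1) @ F.normalize(k, dim=1).t()
    idx = sim.argmax(dim=1)                    # best side-info patch per decoder patch
    aligned = k[idx].t().unsqueeze(0)          # (1, C * patch * patch, num_patches)
    aligned = F.fold(aligned, output_size=(H, W), kernel_size=patch, stride=patch)
    return aligned, idx                        # `idx` can be reused at other scales
```

Reusing the returned correspondence `idx` at deeper feature scales, rather than searching again, mirrors the acceleration the abstract describes for the deep layers.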
Related papers
- Progressive Learning with Visual Prompt Tuning for Variable-Rate Image
Compression [60.689646881479064]
We propose a progressive learning paradigm for transformer-based variable-rate image compression.
Inspired by visual prompt tuning, we use LPM to extract prompts for input images and hidden features at the encoder side and decoder side, respectively.
Our model outperforms all current variable-rate image compression methods in terms of rate-distortion performance and approaches the state-of-the-art fixed-rate image compression methods trained from scratch.
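For readers unfamiliar with visual prompt tuning, the snippet below shows the generic mechanism in a minimal form: a few learnable prompt tokens are prepended to the feature tokens before a transformer layer and dropped again afterwards. `PromptedEncoderLayer` is a hypothetical name used for illustration; it is not the paper's LPM module.

```python
# Generic visual-prompt-tuning-style conditioning: learnable prompt tokens are
# prepended to the token sequence before a transformer encoder layer.
import torch
import torch.nn as nn

class PromptedEncoderLayer(nn.Module):
    def __init__(self, dim=256, num_prompts=8, nhead=8):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, dim) * 0.02)
        self.layer = nn.TransformerEncoderLayer(d_model=dim, nhead=nhead,
                                                batch_first=True)

    def forward(self, tokens):                      # tokens: (B, N, dim)
        b = tokens.size(0)
        x = torch.cat([self.prompts.expand(b, -1, -1), tokens], dim=1)
        x = self.layer(x)
        return x[:, self.prompts.size(1):]          # drop the prompt tokens again

tokens = torch.randn(2, 196, 256)                    # e.g. 14x14 feature tokens
out = PromptedEncoderLayer()(tokens)                 # same shape as `tokens`
```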
arXiv Detail & Related papers (2023-11-23T08:29:32Z)
- You Can Mask More For Extremely Low-Bitrate Image Compression [80.7692466922499]
Learned image compression (LIC) methods have experienced significant progress during recent years.
However, existing LIC methods fail to explicitly explore the image structure and texture components crucial for image compression.
We present DA-Mask, which samples visible patches based on the structure and texture of the original images.
We propose a simple yet effective masked compression model (MCM), the first framework that unifies LIC and masked image modeling (MIM) end-to-end for extremely low-bitrate compression.
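As a toy illustration of content-adaptive masking, the sketch below scores each patch by a simple structure term (gradient energy) plus a texture term (local variance) and keeps only the highest-scoring patches visible. The scoring rule is an assumption made for illustration, not the exact DA-Mask procedure.

```python
# Toy sketch of content-adaptive patch masking driven by structure and texture.
import torch
import torch.nn.functional as F

def select_visible_patches(img, patch=16, keep_ratio=0.25):
    """img: (1, 1, H, W) grayscale, H and W divisible by `patch`.
    Returns flat indices of the patches kept visible."""
    gx = F.pad(img[..., :, 1:] - img[..., :, :-1], (0, 1))        # horizontal gradients
    gy = F.pad(img[..., 1:, :] - img[..., :-1, :], (0, 0, 0, 1))  # vertical gradients
    edge = F.avg_pool2d(gx.abs() + gy.abs(), patch)               # structure score per patch
    mean = F.avg_pool2d(img, patch)
    var = F.avg_pool2d(img ** 2, patch) - mean ** 2               # texture score per patch
    score = (edge + var).flatten()
    keep = max(1, int(keep_ratio * score.numel()))
    return score.topk(keep).indices                               # visible patch indices
```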
arXiv Detail & Related papers (2023-06-27T15:36:22Z)
- DBAT: Dynamic Backward Attention Transformer for Material Segmentation with Cross-Resolution Patches [8.812837829361923]
We propose the Dynamic Backward Attention Transformer (DBAT) to aggregate cross-resolution features.
Experiments show that our DBAT achieves an accuracy of 86.85%, which is the best performance among state-of-the-art real-time models.
We further align features with semantic labels through network dissection, showing that the proposed model extracts material-related features better than other methods.
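The generic sketch below illustrates one way to aggregate cross-resolution features: upsample all scales to the finest resolution and blend them with per-pixel softmax weights. It shows the broad idea of weighted multi-scale fusion only and is not DBAT's backward-attention design.

```python
# Generic weighted fusion of multi-resolution feature maps (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossResolutionFusion(nn.Module):
    def __init__(self, channels=64, num_scales=3):
        super().__init__()
        self.to_weights = nn.Conv2d(channels * num_scales, num_scales, 1)

    def forward(self, feats):                       # list of (B, C, Hi, Wi)
        size = feats[0].shape[-2:]                  # fuse at the finest resolution
        ups = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
               for f in feats]
        w = torch.softmax(self.to_weights(torch.cat(ups, dim=1)), dim=1)
        return sum(w[:, i:i + 1] * ups[i] for i in range(len(ups)))

feats = [torch.randn(1, 64, 56, 56), torch.randn(1, 64, 28, 28),
         torch.randn(1, 64, 14, 14)]
fused = CrossResolutionFusion()(feats)              # (1, 64, 56, 56)
```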
arXiv Detail & Related papers (2023-05-06T03:47:20Z)
- HIPA: Hierarchical Patch Transformer for Single Image Super Resolution [62.7081074931892]
This paper presents HIPA, a novel Transformer architecture that progressively recovers the high resolution image using a hierarchical patch partition.
We build a cascaded model that processes an input image in multiple stages, starting from tokens with small patch sizes and gradually merging them up to the full resolution.
Such a hierarchical patch mechanism not only explicitly enables feature aggregation at multiple resolutions but also adaptively learns patch-aware features for different image regions.
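A standard patch-merging block, sketched below, conveys the hierarchical mechanism: 2x2 neighbouring tokens are concatenated and projected into one coarser token, so later stages see larger effective patches. This is a generic illustration, not HIPA's exact architecture.

```python
# Generic 2x2 patch merging: four neighbouring tokens become one coarser token.
import torch
import torch.nn as nn

class PatchMerge(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(4 * dim, 2 * dim)

    def forward(self, x):                           # x: (B, H, W, dim), H and W even
        tl, tr = x[:, 0::2, 0::2], x[:, 0::2, 1::2]
        bl, br = x[:, 1::2, 0::2], x[:, 1::2, 1::2]
        return self.proj(torch.cat([tl, tr, bl, br], dim=-1))   # (B, H/2, W/2, 2*dim)

x = torch.randn(1, 32, 32, 64)
print(PatchMerge(64)(x).shape)                      # torch.Size([1, 16, 16, 128])
```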
arXiv Detail & Related papers (2022-03-19T05:09:34Z)
- The Devil Is in the Details: Window-based Attention for Image Compression [58.1577742463617]
Most existing learned image compression models are based on Convolutional Neural Networks (CNNs).
In this paper, we study the effects of multiple kinds of attention mechanisms for local features learning, then introduce a more straightforward yet effective window-based local attention block.
The proposed window-based attention is very flexible which could work as a plug-and-play component to enhance CNN and Transformer models.
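A minimal window-based local attention block looks roughly like the sketch below: the feature map is split into non-overlapping windows and self-attention runs inside each window independently. This is a generic plug-and-play illustration, not the paper's exact block.

```python
# Minimal windowed self-attention over a CNN-style feature map.
import torch
import torch.nn as nn

class WindowAttention(nn.Module):
    def __init__(self, dim=96, window=8, heads=4):
        super().__init__()
        self.window = window
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                            # x: (B, C, H, W), H, W divisible by window
        B, C, H, W = x.shape
        w = self.window
        # (B, C, H, W) -> (B * num_windows, w*w, C)
        t = x.reshape(B, C, H // w, w, W // w, w).permute(0, 2, 4, 3, 5, 1)
        t = t.reshape(-1, w * w, C)
        t, _ = self.attn(t, t, t)                    # attention within each window only
        # back to (B, C, H, W)
        t = t.reshape(B, H // w, W // w, w, w, C).permute(0, 5, 1, 3, 2, 4)
        return t.reshape(B, C, H, W)

x = torch.randn(1, 96, 32, 32)
print(WindowAttention()(x).shape)                    # torch.Size([1, 96, 32, 32])
```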
arXiv Detail & Related papers (2022-03-16T07:55:49Z)
- Patch-Based Stochastic Attention for Image Editing [4.8201607588546]
We propose an efficient attention layer based on the algorithm PatchMatch, which is used for determining approximate nearest neighbors.
We demonstrate the usefulness of PSAL on several image editing tasks, such as image inpainting, guided image colorization, and single-image super-resolution.
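A much-simplified sketch of the underlying idea is given below: each query patch is compared against a random subset of candidate patches rather than all of them, which keeps memory and compute low. The real PSAL layer builds on PatchMatch-style propagation and differentiable attention, both omitted here.

```python
# Simplified stochastic patch matching: compare each query patch against a
# random subset of key patches and keep the best match.
import torch
import torch.nn.functional as F

def stochastic_patch_match(query_img, key_img, patch=8, num_samples=64):
    """Both images: (1, C, H, W), H and W divisible by `patch`.
    Returns, for every query patch, the index of its best sampled key patch."""
    q = F.unfold(query_img, patch, stride=patch).squeeze(0).t()   # (Nq, C*p*p)
    k = F.unfold(key_img, patch, stride=patch).squeeze(0).t()     # (Nk, C*p*p)
    cand = torch.randint(0, k.size(0), (q.size(0), num_samples))  # random candidates
    d = ((q.unsqueeze(1) - k[cand]) ** 2).sum(dim=-1)             # (Nq, num_samples)
    best = d.argmin(dim=1)
    return cand.gather(1, best.unsqueeze(1)).squeeze(1)           # (Nq,) key indices

q_img, k_img = torch.randn(1, 3, 128, 128), torch.randn(1, 3, 128, 128)
idx = stochastic_patch_match(q_img, k_img)                        # one match per query patch
```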
arXiv Detail & Related papers (2022-02-07T13:42:00Z)
- Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform [58.60004238261117]
We propose a versatile deep image compression network based on Spatial Feature Transform (SFT; arXiv:1804.02815).
Our model covers a wide range of compression rates using a single model, which is controlled by arbitrary pixel-wise quality maps.
The proposed framework allows us to perform task-aware image compressions for various tasks.
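The conditioning mechanism can be sketched as an SFT-style layer: a small convolutional net maps the pixel-wise quality map to per-position scale and shift parameters that modulate the feature map. The sketch below, with assumed channel sizes, illustrates only this mechanism, not the full compression network.

```python
# SFT-style modulation of a feature map by a pixel-wise quality map.
import torch
import torch.nn as nn

class SFTLayer(nn.Module):
    def __init__(self, feat_ch=192, cond_ch=1, hidden=64):
        super().__init__()
        self.cond = nn.Sequential(
            nn.Conv2d(cond_ch, hidden, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, 2 * feat_ch, 3, padding=1))

    def forward(self, feat, quality_map):             # feat: (B, C, H, W); map: (B, 1, H, W)
        gamma, beta = self.cond(quality_map).chunk(2, dim=1)
        return feat * (1 + gamma) + beta               # spatially-adaptive affine transform

feat = torch.randn(1, 192, 16, 16)
qmap = torch.rand(1, 1, 16, 16)                        # arbitrary pixel-wise quality map
out = SFTLayer()(feat, qmap)                           # same shape as `feat`
```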
arXiv Detail & Related papers (2021-08-21T17:30:06Z)
- SimPatch: A Nearest Neighbor Similarity Match between Image Patches [0.0]
We try to use large patches instead of relatively small patches so that each patch contains more information.
We use different feature extraction mechanisms to extract the features of each individual image patch, which together form a feature matrix.
For a query patch in a given image, the nearest patches are computed using two different nearest neighbor algorithms.
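A minimal sketch of such a pipeline, using raw pixels as the patch features (the paper uses richer feature extractors) and a single nearest-neighbor index, is shown below.

```python
# Build a feature matrix of image patches and query the nearest patches.
import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.neighbors import NearestNeighbors

img = np.random.rand(128, 128)                         # placeholder grayscale image
patches = extract_patches_2d(img, (16, 16), max_patches=2000, random_state=0)
features = patches.reshape(len(patches), -1)           # feature matrix, one row per patch

nn_index = NearestNeighbors(n_neighbors=5).fit(features)
query = features[0:1]                                  # pick one patch as the query
dist, idx = nn_index.kneighbors(query)                 # indices of the 5 nearest patches
print(idx)
```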
arXiv Detail & Related papers (2020-08-07T10:51:10Z)