Related papers: Exploring Distortion Prior with Latent Diffusion Models for Remote Sensing Image Compression

URL: http://arxiv.org/abs/2406.03961v2
Date: Mon, 07 Oct 2024 08:23:50 GMT
Title: Exploring Distortion Prior with Latent Diffusion Models for Remote Sensing Image Compression
Authors: Junhui Li, Jutao Li, Xingsong Hou, Huake Wang,
Abstract summary: We propose a latent diffusion model-based remote sensing image compression (LDM-RSIC) method. In the first stage, a self-encoder learns a prior from the high-quality input image. In the second stage, the prior is generated through an LDM conditioned on the decoded image of an existing learning-based image compression algorithm.
Score: 9.742764207747697
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep learning-based image compression algorithms typically focus on designing encoding and decoding networks and improving the accuracy of entropy model estimation to enhance the rate-distortion (RD) performance. However, few algorithms leverage the compression distortion prior from existing compression algorithms to improve RD performance. In this paper, we propose a latent diffusion model-based remote sensing image compression (LDM-RSIC) method, which aims to enhance the final decoding quality of RS images by utilizing the generated distortion prior from an LDM. Our approach consists of two stages. In the first stage, a self-encoder learns a prior from the high-quality input image. In the second stage, the prior is generated through an LDM, conditioned on the decoded image of an existing learning-based image compression algorithm, to be used as auxiliary information for generating the texture-rich enhanced image. To better utilize the prior, a channel attention and gate-based dynamic feature attention module (DFAM) is embedded into a Transformer-based multi-scale enhancement network (MEN) for image enhancement. Extensive experiments demonstrate that the proposed LDM-RSIC significantly outperforms existing state-of-the-art traditional and learning-based image compression algorithms in terms of both subjective perception and objective metrics. Additionally, we use the LDM-based scheme to improve the traditional image compression algorithm JPEG2000 and obtain 32.00% bit savings on the DOTA testing set. The code will be available at https://github.com/mlkk518/LDM-RSIC.
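
Illustrative sketch (not from the paper): the abstract says a prior is used as auxiliary information through a channel attention and gate-based module. The NumPy snippet below shows one generic way a prior vector could modulate a feature map with channel attention and a gate. The function name `gated_channel_attention`, the weight shapes, and the blending rule are all assumptions made for illustration; this is not the authors' DFAM.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_channel_attention(features, prior, w_attn, w_gate):
    """Hypothetical prior-conditioned channel attention with a gate.

    features: (C, H, W) feature map of the decoded image
    prior:    (D,) prior vector (assumed to come from the LDM stage)
    w_attn, w_gate: (C, C + D) weight matrices (assumed to be learned)
    """
    # Global average pooling summarizes each channel as one scalar.
    pooled = features.mean(axis=(1, 2))                # (C,)
    # Concatenate the channel summary with the prior vector.
    context = np.concatenate([pooled, prior])          # (C + D,)
    # Per-channel attention weights and gate values, both in (0, 1).
    attn = sigmoid(w_attn @ context)                   # (C,)
    gate = sigmoid(w_gate @ context)                   # (C,)
    # Reweight channels, then blend with the input through the gate.
    modulated = features * attn[:, None, None]
    return gate[:, None, None] * modulated + (1.0 - gate)[:, None, None] * features


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((4, 8, 8))
    prior_vec = rng.standard_normal(3)
    w_a = rng.standard_normal((4, 7)) * 0.1
    w_g = rng.standard_normal((4, 7)) * 0.1
    print(gated_channel_attention(feats, prior_vec, w_a, w_g).shape)
```

In this sketch the gate decides, per channel, how much of the attention-reweighted feature replaces the original feature; a real module would learn `w_attn` and `w_gate` jointly with the enhancement network.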

Related papers

Higher fidelity perceptual image and video compression with a latent conditioned residual denoising diffusion model [55.2480439325792]
We propose a hybrid compression scheme optimized for perceptual quality, extending the approach of the CDC model with a decoder network. We achieve up to +2dB PSNR fidelity improvements while maintaining comparable LPIPS and FID perceptual scores when compared with CDC.
arXiv Detail & Related papers (2025-05-19T14:13:14Z)
Ultra Lowrate Image Compression with Semantic Residual Coding and Compression-aware Diffusion [28.61304513668606]
ResULIC is a residual-guided ultra lowrate image compression system. It incorporates residual signals into both semantic retrieval and the diffusion-based generation process. It achieves superior objective and subjective performance compared to state-of-the-art diffusion-based methods.
arXiv Detail & Related papers (2025-05-13T06:51:23Z)
Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression [90.59962443790593]
In this paper, we present a variable-rate image compression model based on invertible transform to overcome limitations. Specifically, we design a lightweight multi-scale invertible neural network, which maps the input image into multi-scale latent representations. Experimental results demonstrate that the proposed method achieves state-of-the-art performance compared to existing variable-rate methods.
arXiv Detail & Related papers (2025-03-27T09:08:39Z)
Exploiting Inter-Image Similarity Prior for Low-Bitrate Remote Sensing Image Compression [10.427300958330816]
We propose a codebook-based RS image compression (Code-RSIC) method with a generated discrete codebook. Code-RSIC significantly outperforms state-of-the-art traditional and learning-based image compression algorithms in terms of perception quality.
arXiv Detail & Related papers (2024-07-17T03:33:16Z)
Enhancing Perception Quality in Remote Sensing Image Compression via Invertible Neural Network [10.427300958330816]
Decoding remote sensing images to achieve high perceptual quality, particularly at low bitrates, remains a significant challenge. We propose the invertible neural network-based remote sensing image compression (INN-RSIC) method. Our INN-RSIC significantly outperforms the existing state-of-the-art traditional and deep learning-based image compression methods in terms of perception quality.
arXiv Detail & Related papers (2024-05-17T03:52:37Z)
Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer [35.500720262253054]
This paper introduces a novel Unified Image Generation-Compression (UIGC) paradigm, merging the processes of generation and compression. A key feature of the UIGC framework is the adoption of vector-quantized (VQ) image models for tokenization. Experiments demonstrate the superiority of the proposed UIGC framework over existing codecs in perceptual quality and human perception.
arXiv Detail & Related papers (2024-03-06T14:27:02Z)
You Can Mask More For Extremely Low-Bitrate Image Compression [80.7692466922499]
Learned image compression (LIC) methods have experienced significant progress in recent years. LIC methods fail to explicitly explore the image structure and texture components crucial for image compression. We present DA-Mask that samples visible patches based on the structure and texture of original images. We propose a simple yet effective masked compression model (MCM), the first framework that unifies LIC and masked image modeling (MIM) end-to-end for extremely low-bitrate compression.
arXiv Detail & Related papers (2023-06-27T15:36:22Z)
Neural Data-Dependent Transform for Learned Image Compression [72.86505042102155]
We build a neural data-dependent transform and introduce a continuous online mode decision mechanism to jointly optimize the coding efficiency for each individual image. The experimental results show the effectiveness of the proposed neural-syntax design and the continuous online mode decision mechanism.
arXiv Detail & Related papers (2022-03-09T14:56:48Z)
Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural network that can be used to solve various tasks. We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation. We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion MNIST.
arXiv Detail & Related papers (2022-02-09T18:48:02Z)
Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder [73.48927855855219]
We propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends. Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics.
arXiv Detail & Related papers (2022-01-27T20:20:03Z)
Implicit Neural Representations for Image Compression [103.78615661013623]
Implicit Neural Representations (INRs) have gained attention as a novel and effective representation for various data types. We propose the first comprehensive compression pipeline based on INRs including quantization, quantization-aware retraining and entropy coding. We find that our approach to source compression with INRs vastly outperforms similar prior work.
arXiv Detail & Related papers (2021-12-08T13:02:53Z)
Modeling Lost Information in Lossy Image Compression [72.69327382643549]
Lossy image compression is one of the most commonly used operators for digital images. We propose a novel invertible framework called Invertible Lossy Compression (ILC) to largely mitigate the information loss problem.
arXiv Detail & Related papers (2020-06-22T04:04:56Z)
Generalized Octave Convolutions for Learned Multi-Frequency Image Compression [20.504561050200365]
We propose the first learned multi-frequency image compression and entropy coding approach. It is based on the recently developed octave convolutions to factorize the latents into high and low frequency (resolution) components. We show that the proposed generalized octave convolution can improve the performance of other auto-encoder-based computer vision tasks.
arXiv Detail & Related papers (2020-02-24T01:35:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.