Block Modulating Video Compression: An Ultra Low Complexity Image Compression Encoder for Resource Limited Platforms
- URL: http://arxiv.org/abs/2205.03677v2
- Date: Sat, 23 Nov 2024 05:06:05 GMT
- Title: Block Modulating Video Compression: An Ultra Low Complexity Image Compression Encoder for Resource Limited Platforms
- Authors: Siming Zheng, Yujia Xue, Waleed Tahir, Zhengjue Wang, Hao Zhang, Ziyi Meng, Gang Qu, Siwei Ma, Xin Yuan,
- Abstract summary: An ultra low-cost image Modulating Video Compression (BMVC) is proposed to be implemented on mobile platforms with low consumption of power and computation resources.
Two types of BMVC decoders, implemented by deep neural networks, are presented.
- Score: 35.76050232152349
- License:
- Abstract: We consider the image and video compression on resource limited platforms. An ultra low-cost image encoder, named Block Modulating Video Compression (BMVC) with an encoding complexity ${\cal O}(1)$ is proposed to be implemented on mobile platforms with low consumption of power and computation resources. We also develop two types of BMVC decoders, implemented by deep neural networks. The first BMVC decoder is based on the Plug-and-Play (PnP) algorithm, which is flexible to different compression ratios. And the second decoder is a memory efficient end-to-end convolutional neural network, which aims for real-time decoding. Extensive results on the high definition images and videos demonstrate the superior performance of the proposed codec and the robustness against bit quantization.
Related papers
- When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding [112.44822009714461]
Cross-Modality Video Coding (CMVC) is a pioneering approach to explore multimodality representation and video generative models in video coding.
During decoding, previously encoded components and video generation models are leveraged to create multiple encoding-decoding modes.
Experiments indicate that TT2V achieves effective semantic reconstruction, while IT2V exhibits competitive perceptual consistency.
arXiv Detail & Related papers (2024-08-15T11:36:18Z) - Accelerating Learned Video Compression via Low-Resolution Representation Learning [18.399027308582596]
We introduce an efficiency-optimized framework for learned video compression that focuses on low-resolution representation learning.
Our method achieves performance levels on par with the low-decay P configuration of the H.266 reference software VTM.
arXiv Detail & Related papers (2024-07-23T12:02:57Z) - MISC: Ultra-low Bitrate Image Semantic Compression Driven by Large Multimodal Model [78.4051835615796]
This paper proposes a method called Multimodal Image Semantic Compression.
It consists of an LMM encoder for extracting the semantic information of the image, a map encoder to locate the region corresponding to the semantic, an image encoder generates an extremely compressed bitstream, and a decoder reconstructs the image based on the above information.
It can achieve optimal consistency and perception results while saving perceptual 50%, which has strong potential applications in the next generation of storage and communication.
arXiv Detail & Related papers (2024-02-26T17:11:11Z) - End-to-End Learnable Multi-Scale Feature Compression for VCM [8.037759667748768]
We propose a novel multi-scale feature compression method that enables the end-to-end optimization on the extracted features and the design of lightweight encoders.
Our model outperforms previous approaches by at least 52% BD-rate reduction and has $times5$ to $times27$ times less encoding time for object detection.
arXiv Detail & Related papers (2023-06-29T04:05:13Z) - Hierarchical B-frame Video Coding Using Two-Layer CANF without Motion
Coding [17.998825368770635]
We propose a novel B-frame coding architecture based on two-layer Augmented Normalization Flows (CANF)
Our proposed idea of video compression without motion coding offers a new direction for learned video coding.
The rate-distortion performance of our scheme is slightly lower than that of the state-of-the-art learned B-frame coding scheme, B-CANF, but outperforms other learned B-frame coding schemes.
arXiv Detail & Related papers (2023-04-05T18:36:28Z) - Low-complexity Deep Video Compression with A Distributed Coding
Architecture [4.5885672744218]
Prevalent predictive coding-based video compression methods rely on a heavy encoder to reduce temporal redundancy.
Traditional distributed coding methods suffer from a substantial performance gap to predictive coding ones.
We propose the first end-to-end distributed deep video compression framework to improve rate-distortion performance.
arXiv Detail & Related papers (2023-03-21T05:34:04Z) - Deep Lossy Plus Residual Coding for Lossless and Near-lossless Image
Compression [85.93207826513192]
We propose a unified and powerful deep lossy plus residual (DLPR) coding framework for both lossless and near-lossless image compression.
We solve the joint lossy and residual compression problem in the approach of VAEs.
In the near-lossless mode, we quantize the original residuals to satisfy a given $ell_infty$ error bound.
arXiv Detail & Related papers (2022-09-11T12:11:56Z) - Conditional Entropy Coding for Efficient Video Compression [82.35389813794372]
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
We first show that a simple architecture modeling the entropy between the image latent codes is as competitive as other neural video compression works and video codecs.
We then propose a novel internal learning extension on top of this architecture that brings an additional 10% savings without trading off decoding speed.
arXiv Detail & Related papers (2020-08-20T20:01:59Z) - Learning for Video Compression with Hierarchical Quality and Recurrent
Enhancement [164.7489982837475]
We propose a Hierarchical Learned Video Compression (HLVC) method with three hierarchical quality layers and a recurrent enhancement network.
In our HLVC approach, the hierarchical quality benefits the coding efficiency, since the high quality information facilitates the compression and enhancement of low quality frames at encoder and decoder sides.
arXiv Detail & Related papers (2020-03-04T09:31:37Z) - A Unified End-to-End Framework for Efficient Deep Image Compression [35.156677716140635]
We propose a unified framework called Efficient Deep Image Compression (EDIC) based on three new technologies.
Specifically, we design an auto-encoder style network for learning based image compression.
Our EDIC method can also be readily incorporated with the Deep Video Compression (DVC) framework to further improve the video compression performance.
arXiv Detail & Related papers (2020-02-09T14:21:08Z) - Video Coding for Machines: A Paradigm of Collaborative Compression and
Intelligent Analytics [127.65410486227007]
Video coding, which targets to compress and reconstruct the whole frame, and feature compression, which only preserves and transmits the most critical information, stand at two ends of the scale.
Recent endeavors in imminent trends of video compression, e.g. deep learning based coding tools and end-to-end image/video coding, and MPEG-7 compact feature descriptor standards, promote the sustainable and fast development in their own directions.
In this paper, thanks to booming AI technology, e.g. prediction and generation models, we carry out exploration in the new area, Video Coding for Machines (VCM), arising from the emerging MPEG
arXiv Detail & Related papers (2020-01-10T17:24:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.