Video compression with low complexity CNN-based spatial resolution adaptation
- URL: http://arxiv.org/abs/2007.14726v1
- Date: Wed, 29 Jul 2020 10:20:36 GMT
- Title: Video compression with low complexity CNN-based spatial resolution adaptation
- Authors: Di Ma, Fan Zhang and David R. Bull
- Abstract summary: Spatial resolution adaptation can be integrated within video compression to improve overall coding performance.
A novel framework is proposed which supports the flexible allocation of complexity between the encoder and decoder.
- Score: 15.431248645312309
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: It has recently been demonstrated that spatial resolution adaptation can be
integrated within video compression to improve overall coding performance by
spatially down-sampling before encoding and super-resolving at the decoder.
Significant improvements have been reported when convolutional neural networks
(CNNs) were used to perform the resolution up-sampling. However, this approach
suffers from high complexity at the decoder due to the employment of CNN-based
super-resolution. In this paper, a novel framework is proposed which supports
the flexible allocation of complexity between the encoder and decoder. This
approach employs a CNN model for video down-sampling at the encoder and uses a
Lanczos3 filter to reconstruct full resolution at the decoder. The proposed
method was integrated into the HEVC HM 16.20 software and evaluated on JVET UHD
test sequences using the All Intra configuration. The experimental results
demonstrate the potential of the proposed approach, with significant bitrate
savings (more than 10%) over the original HEVC HM, coupled with reduced
computational complexity at both encoder (29%) and decoder (10%).
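To make the complexity split concrete, below is a minimal sketch (not the authors' implementation) of the idea described in the abstract: a hypothetical stride-2 down-sampling CNN on the encoder side and Lanczos3 up-sampling on the decoder side. It assumes PyTorch for the CNN and Pillow for the filter (Pillow's LANCZOS resampling uses a 3-lobed kernel, i.e. Lanczos3); the class and function names are illustrative, and the paper's actual network, training procedure and HEVC HM 16.20 integration are not reproduced.

```python
# Minimal sketch of CNN-based down-sampling (encoder side) followed by
# Lanczos3 up-sampling (decoder side). Hypothetical placeholder network,
# not the model proposed in the paper.
import numpy as np
import torch
import torch.nn as nn
from PIL import Image


class SimpleDownsamplerCNN(nn.Module):
    """Hypothetical 2x down-sampling CNN (stride-2 convolution + refinement)."""

    def __init__(self, channels: int = 3, features: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, features, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(features, channels, kernel_size=3, stride=1, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)


def encoder_side_downsample(frame: np.ndarray, model: nn.Module) -> np.ndarray:
    """Down-sample an HxWx3 uint8 frame to half resolution before encoding."""
    x = torch.from_numpy(frame).permute(2, 0, 1).float().unsqueeze(0) / 255.0
    with torch.no_grad():
        y = model(x).clamp(0.0, 1.0)
    return (y.squeeze(0).permute(1, 2, 0).numpy() * 255.0).astype(np.uint8)


def decoder_side_upsample(frame: np.ndarray, size: tuple[int, int]) -> np.ndarray:
    """Reconstruct full resolution with a Lanczos3 filter (Pillow's LANCZOS)."""
    return np.asarray(Image.fromarray(frame).resize(size, Image.LANCZOS))


if __name__ == "__main__":
    full = (np.random.rand(240, 416, 3) * 255).astype(np.uint8)  # dummy frame
    model = SimpleDownsamplerCNN().eval()
    low = encoder_side_downsample(full, model)    # would be fed to the encoder
    rec = decoder_side_upsample(low, (416, 240))  # applied after decoding
    print(low.shape, rec.shape)                   # (120, 208, 3) (240, 416, 3)
```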
Related papers
- Standard compliant video coding using low complexity, switchable neural wrappers [8.149130379436759]
We propose a new framework featuring standard compatibility, high performance, and low decoding complexity.
We employ a set of jointly optimized neural pre- and post-processors, wrapping a standard video codec, to encode videos at different resolutions.
We design a low complexity neural post-processor architecture that can handle different upsampling ratios.
arXiv Detail & Related papers (2024-07-10T06:36:45Z)
- Video Compression with Arbitrary Rescaling Network [8.489428003916622]
We propose a rate-guided arbitrary rescaling network (RARN) for video resizing before encoding.
The lightweight RARN structure can process FHD (1080p) content at real-time speed (91 FPS) and obtain a considerable rate reduction.
arXiv Detail & Related papers (2023-06-07T07:15:18Z)
- Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework via a heterogeneous deformable compensation strategy (HDCVC) to tackle the problem of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC achieves superior performance over recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z)
- Efficient VVC Intra Prediction Based on Deep Feature Fusion and Probability Estimation [57.66773945887832]
We propose to optimize the complexity of Versatile Video Coding (VVC) intra-frame prediction with a two-stage framework of deep feature fusion and probability estimation.
Experimental results on a standard database demonstrate the superiority of the proposed method, especially for High Definition (HD) and Ultra-HD (UHD) video sequences.
arXiv Detail & Related papers (2022-05-07T08:01:32Z)
- Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder [73.48927855855219]
We propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends.
Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics.
arXiv Detail & Related papers (2022-01-27T20:20:03Z)
- Small Lesion Segmentation in Brain MRIs with Subpixel Embedding [105.1223735549524]
We present a method to segment MRI scans of the human brain into ischemic stroke lesion and normal tissues.
We propose a neural network architecture in the form of a standard encoder-decoder where predictions are guided by a spatial expansion embedding network.
arXiv Detail & Related papers (2021-09-18T00:21:17Z)
- Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation [98.05643473345474]
We propose a novel decoder, termed dynamic neural representational decoder (NRD).
As each location on the encoder's output corresponds to a local patch of the semantic labels, in this work, we represent these local patches of labels with compact neural networks.
This neural representation enables our decoder to leverage the smoothness prior in the semantic label space, and thus makes our decoder more efficient.
arXiv Detail & Related papers (2021-07-30T04:50:56Z)
- Perceptually-inspired super-resolution of compressed videos [18.72040343193715]
Spatial resolution adaptation is a technique which has often been employed in video compression to enhance coding efficiency.
Recent work has employed advanced super-resolution methods based on convolutional neural networks (CNNs) to further improve reconstruction quality.
In this paper, a perceptually-inspired super-resolution approach (M-SRGAN) is proposed for spatial upsampling of compressed video using a modified CNN model.
arXiv Detail & Related papers (2021-06-15T13:50:24Z)
- Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers [149.78470371525754]
We treat semantic segmentation as a sequence-to-sequence prediction task. Specifically, we deploy a pure transformer to encode an image as a sequence of patches.
With the global context modeled in every layer of the transformer, this encoder can be combined with a simple decoder to provide a powerful segmentation model, termed SEgmentation TRansformer (SETR).
SETR achieves new state of the art on ADE20K (50.28% mIoU), Pascal Context (55.83% mIoU) and competitive results on Cityscapes.
arXiv Detail & Related papers (2020-12-31T18:55:57Z)
- Decomposition, Compression, and Synthesis (DCS)-based Video Coding: A Neural Exploration via Resolution-Adaptive Learning [30.54722074562783]
We decompose the input video into respective spatial texture frames (STF) at its native spatial resolution.
Then, we compress them together using any popular video coder.
Finally, we synthesize decoded STFs and TMFs for high-quality video reconstruction at the same resolution as its native input.
arXiv Detail & Related papers (2020-12-01T17:23:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.