Sandwiched Video Compression: Efficiently Extending the Reach of
Standard Codecs with Neural Wrappers
- URL: http://arxiv.org/abs/2303.11473v2
- Date: Wed, 5 Jul 2023 20:41:25 GMT
- Title: Sandwiched Video Compression: Efficiently Extending the Reach of
Standard Codecs with Neural Wrappers
- Authors: Berivan Isik, Onur G. Guleryuz, Danhang Tang, Jonathan Taylor, Philip
A. Chou
- Abstract summary: We propose a video compression system that wraps neural networks around a standard video codec.
Networks are trained jointly to optimize a rate-distortion loss function.
We observe 30% improvements in rate at the same quality over HEVC.
- Score: 11.968545394054816
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose sandwiched video compression -- a video compression system that
wraps neural networks around a standard video codec. The sandwich framework
consists of a neural pre- and post-processor with a standard video codec
between them. The networks are trained jointly to optimize a rate-distortion
loss function with the goal of significantly improving over the standard codec
in various compression scenarios. End-to-end training in this setting requires
a differentiable proxy for the standard video codec, which incorporates
temporal processing with motion compensation, inter/intra mode decisions, and
in-loop filtering. We propose differentiable approximations to key video codec
components and demonstrate that, in addition to providing meaningful
compression improvements over the standard codec, the neural codes of the
sandwich lead to significantly better rate-distortion performance in two
important scenarios. When transporting high-resolution video via low-resolution
HEVC, the sandwich system obtains 6.5 dB improvements over standard HEVC. More
importantly, using the well-known perceptual similarity metric, LPIPS, we
observe 30% improvements in rate at the same quality over HEVC. Last but not
least, we show that pre- and post-processors formed by very modestly
parameterized, lightweight networks can closely approximate these results.
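The joint training objective described in the abstract can be sketched as a rate-distortion Lagrangian in which the non-differentiable codec is replaced by a differentiable proxy. In the minimal sketch below, the proxy models only quantization (as additive uniform noise, a common surrogate); the paper's proxy also covers motion compensation, mode decisions, and in-loop filtering. The Laplacian rate estimate, the lambda value, and the identity pre/post-processors are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def codec_proxy(bottleneck, step=1.0):
    """Differentiable stand-in for the standard codec: quantization is
    approximated by additive uniform noise. (Illustrative only; the paper's
    proxy additionally models temporal processing and in-loop filtering.)"""
    noise = rng.uniform(-step / 2, step / 2, size=bottleneck.shape)
    return bottleneck + noise

def rate_proxy(bottleneck):
    """Crude bits-per-sample estimate: differential entropy of an assumed
    Laplacian code distribution, log2(2*e*b). Not the paper's rate model."""
    b = np.mean(np.abs(bottleneck)) + 1e-9
    return np.log2(2 * np.e * b)

def rd_loss(original, reconstructed, bottleneck, lam=0.01):
    """Rate-distortion Lagrangian: D + lambda * R."""
    distortion = np.mean((original - reconstructed) ** 2)
    return distortion + lam * rate_proxy(bottleneck)

# Toy forward pass through the sandwich, with identity pre/post-processors.
frame = rng.standard_normal((16, 16))
code = frame                  # pre-processor output (identity here)
decoded = codec_proxy(code)   # "standard codec" in the middle of the sandwich
recon = decoded               # post-processor output (identity here)
loss = rd_loss(frame, recon, code)
print(float(loss))
```

In the actual system, the pre- and post-processors would be neural networks and the loss would be backpropagated through the proxy to train both jointly; with the LPIPS results cited above, the distortion term would be a perceptual metric rather than MSE.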
Related papers
- Accelerating Learned Video Compression via Low-Resolution Representation Learning [18.399027308582596]
We introduce an efficiency-optimized framework for learned video compression that focuses on low-resolution representation learning.
Our method achieves performance levels on par with the low-delay P configuration of the H.266 reference software VTM.
arXiv Detail & Related papers (2024-07-23T12:02:57Z)
- Standard compliant video coding using low complexity, switchable neural wrappers [8.149130379436759]
We propose a new framework featuring standard compatibility, high performance, and low decoding complexity.
We employ a set of jointly optimized neural pre- and post-processors, wrapping a standard video codec, to encode videos at different resolutions.
We design a low complexity neural post-processor architecture that can handle different upsampling ratios.
arXiv Detail & Related papers (2024-07-10T06:36:45Z)
- Prediction and Reference Quality Adaptation for Learned Video Compression [54.58691829087094]
We propose a confidence-based prediction quality adaptation (PQA) module to provide explicit discrimination for the spatial and channel-wise prediction quality difference.
We also propose a reference quality adaptation (RQA) module and an associated repeat-long training strategy to provide dynamic spatially variant filters for diverse reference qualities.
arXiv Detail & Related papers (2024-06-20T09:03:26Z)
- Compression-Realized Deep Structural Network for Video Quality Enhancement [78.13020206633524]
This paper focuses on the task of quality enhancement for compressed videos.
A new paradigm is urgently needed for a more "conscious" process of quality enhancement.
We propose the Compression-Realized Deep Structural Network (CRDS), introducing three inductive biases aligned with the three primary processes in the classic compression domain.
arXiv Detail & Related papers (2024-05-10T09:18:17Z)
- Video Compression with Arbitrary Rescaling Network [8.489428003916622]
We propose a rate-guided arbitrary rescaling network (RARN) for video resizing before encoding.
The lightweight RARN structure can process FHD (1080p) content at real-time speed (91 FPS) and obtain a considerable rate reduction.
arXiv Detail & Related papers (2023-06-07T07:15:18Z)
- Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework via heterogeneous deformable compensation strategy (HDCVC) to tackle the problems of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC outperforms recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z)
- Perceptual Coding for Compressed Video Understanding: A New Framework and Benchmark [57.23523738351178]
We propose the first coding framework for compressed video understanding, where another learnable perceptual bitstream is introduced and simultaneously transported with the video bitstream.
Our framework enjoys the best of both worlds: (1) highly efficient content coding from industrial video codecs and (2) flexible perceptual coding from neural networks (NNs).
arXiv Detail & Related papers (2022-02-06T16:29:15Z)
- Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement [74.1052624663082]
We develop a deep learning architecture capable of restoring detail to compressed videos.
We show that this improves restoration accuracy compared to prior compression correction methods.
We condition our model on quantization data, which is readily available in the bitstream.
arXiv Detail & Related papers (2022-01-31T18:56:04Z)
- Learning to Compress Videos without Computing Motion [39.46212197928986]
We propose a new deep learning video compression architecture that does not require motion estimation.
Our framework exploits the regularities inherent to video motion, which we capture by using displaced frame differences as video representations.
Our experiments show that our compression model, which we call the MOtionless VIdeo Codec (MOVI-Codec), learns how to efficiently compress videos without computing motion.
arXiv Detail & Related papers (2020-09-29T15:49:25Z)
- Variable Rate Video Compression using a Hybrid Recurrent Convolutional Learning Framework [1.9290392443571382]
This paper presents PredEncoder, a hybrid video compression framework based on the concept of predictive auto-encoding.
The paper proposes a variable-rate block encoding scheme that leads to remarkably high quality-to-bit-rate ratios.
arXiv Detail & Related papers (2020-04-08T20:49:25Z)
- Content Adaptive and Error Propagation Aware Deep Video Compression [110.31693187153084]
We propose a content adaptive and error propagation aware video compression system.
Our method employs a joint training strategy by considering the compression performance of multiple consecutive frames instead of a single frame.
Instead of using the hand-crafted coding modes in the traditional compression systems, we design an online encoder updating scheme in our system.
arXiv Detail & Related papers (2020-03-25T09:04:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.