Parallelized Rate-Distortion Optimized Quantization Using Deep Learning
- URL: http://arxiv.org/abs/2012.06380v1
- Date: Fri, 11 Dec 2020 14:28:30 GMT
- Title: Parallelized Rate-Distortion Optimized Quantization Using Deep Learning
- Authors: Dana Kianfar, Auke Wiggers, Amir Said, Reza Pourreza, Taco Cohen
- Abstract summary: Rate-Distortion Optimized Quantization (RDOQ) has played an important role in the coding performance of recent video compression standards such as H.264/AVC, H.265/HEVC, VP9 and AV1, but it is typically too expensive to run on real-time hardware encoders.
This work addresses this limitation using a neural network-based approach, which learns to trade off rate and distortion during offline supervised training.
- Score: 9.886383889250064
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rate-Distortion Optimized Quantization (RDOQ) has played an important role in
the coding performance of recent video compression standards such as H.264/AVC,
H.265/HEVC, VP9 and AV1. This scheme yields significant reductions in bit-rate
at the expense of relatively small increases in distortion. Typically, RDOQ
algorithms are prohibitively expensive to implement on real-time hardware
encoders due to their sequential nature and their need to frequently obtain
entropy coding costs. This work addresses this limitation using a neural
network-based approach, which learns to trade off rate and distortion during
offline supervised training. As these networks are based solely on standard
arithmetic operations that can be executed on existing neural network hardware,
no additional area-on-chip needs to be reserved for dedicated RDOQ circuitry.
We train two classes of neural networks, a fully-convolutional network and an
auto-regressive network, and evaluate each as a post-quantization step designed
to refine cheap quantization schemes such as scalar quantization (SQ). Both
network architectures are designed to have a low computational overhead. After
training they are integrated into the HM 16.20 implementation of HEVC, and
their video coding performance is evaluated on a subset of the H.266/VVC SDR
common test sequences. Comparisons are made to RDOQ and SQ implementations in
HM 16.20. Our method achieves 1.64% BD-rate savings on the luma channel compared to
the HM SQ anchor, and on average reaches 45% of the performance of the
iterative HM RDOQ algorithm.
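The abstract's two technical ingredients, the rate-distortion cost that RDOQ minimizes and a learned post-quantization refinement of scalar quantization, can be illustrated with a short sketch. The code below is not the authors' implementation; the network architecture, the log-based rate proxy, and the {keep, decrement} action set are assumptions made purely for illustration:

```python
# Minimal sketch (not the paper's code): hard-decision scalar quantization,
# the rate-distortion cost J = D + lambda * R, and a tiny fully-convolutional
# network that refines SQ levels as a post-quantization step.
import torch
import torch.nn as nn

def scalar_quantize(coeffs: torch.Tensor, qstep: float) -> torch.Tensor:
    # Hard-decision SQ: round each transform coefficient to the nearest level.
    return torch.round(coeffs / qstep)

def rd_cost(coeffs, levels, qstep, lam):
    # J = D + lambda * R: squared-error distortion plus a crude rate proxy.
    # A real encoder would query the entropy coder here, which is the
    # expensive, sequential step that makes classical RDOQ hard to parallelize.
    distortion = ((coeffs - levels * qstep) ** 2).sum()
    rate_proxy = torch.log2(levels.abs() + 1.0).sum()
    return distortion + lam * rate_proxy

class LevelRefiner(nn.Module):
    # Fully-convolutional refiner: per coefficient, either keep the SQ level
    # or lower its magnitude by one (a hypothetical action set).
    def __init__(self, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 2, 1),  # logits over {keep, decrement}
        )

    def forward(self, levels: torch.Tensor) -> torch.Tensor:
        decrement = self.net(levels).argmax(dim=1, keepdim=True).float()
        return levels - decrement * levels.sign()

coeffs = torch.randn(1, 1, 8, 8) * 10.0   # a stand-in 8x8 transform block
levels = scalar_quantize(coeffs, qstep=2.0)
refined = LevelRefiner()(levels)          # untrained, for shape/flow only
print(rd_cost(coeffs, levels, 2.0, lam=0.5).item(),
      rd_cost(coeffs, refined, 2.0, lam=0.5).item())
```

The expensive step a real encoder faces, querying CABAC entropy-coding costs coefficient by coefficient, is what the learned, parallelizable network amortizes during offline training; the rate proxy above merely stands in for it.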
Related papers
- TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream built on a hybrid convolutional-transformer design.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z)
- Channelformer: Attention based Neural Solution for Wireless Channel Estimation and Effective Online Training [1.0499453838486013]
We propose an encoder-decoder neural architecture (called Channelformer) to achieve improved channel estimation.
We employ multi-head attention in the encoder and a residual convolutional neural architecture as the decoder.
We also propose an effective online training method based on the fifth-generation (5G) new radio (NR) configuration for modern communication systems.
arXiv Detail & Related papers (2023-02-08T23:18:23Z)
- CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution [55.50793823060282]
We propose a novel Content-Aware Dynamic Quantization (CADyQ) method for image super-resolution (SR) networks.
CADyQ allocates optimal bits to local regions and layers adaptively based on the local contents of an input image.
The pipeline has been tested on various SR networks and evaluated on several standard benchmarks.
arXiv Detail & Related papers (2022-07-21T07:50:50Z)
- Video Coding for Machines with Feature-Based Rate-Distortion Optimization [7.804710977378487]
With the steady improvement of neural networks, an increasing share of multimedia data is consumed by machines rather than observed by humans.
We propose a standard-compliant feature-based RDO (FRDO) that is designed to increase the coding performance.
We compare the proposed FRDO and its hybrid version HFRDO with different distortion measures in the feature space against the conventional RDO.
arXiv Detail & Related papers (2022-03-11T12:49:50Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- Learning Frequency-aware Dynamic Network for Efficient Super-Resolution [56.98668484450857]
This paper explores a novel frequency-aware dynamic network that divides the input into multiple parts according to its coefficients in the discrete cosine transform (DCT) domain.
In practice, the high-frequency parts are processed with expensive operations while the low-frequency parts are assigned cheap operations to relieve the computational burden (see the sketch after this list).
Experiments conducted on benchmark SISR models and datasets show that the frequency-aware dynamic network can be employed for various SISR neural architectures.
arXiv Detail & Related papers (2021-03-15T12:54:26Z)
- ALF: Autoencoder-based Low-rank Filter-sharing for Efficient Convolutional Neural Networks [63.91384986073851]
We propose the autoencoder-based low-rank filter-sharing technique (ALF).
ALF shows a reduction of 70% in network parameters, 61% in operations and 41% in execution time, with minimal loss in accuracy.
arXiv Detail & Related papers (2020-07-27T09:01:22Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU and find that the performance decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding floating-point networks (FPNs), but have only 1/4 the memory cost and run 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
- Deep HyperNetwork-Based MIMO Detection [10.433286163090179]
Conventional algorithms are either too complex to be practical or suffer from poor performance.
Recent approaches tried to address those challenges by implementing the detector as a deep neural network.
In this work, we address both issues by training an additional neural network (NN), referred to as the hypernetwork, which takes the channel matrix as input and generates the weights of the NN-based detector.
arXiv Detail & Related papers (2020-02-07T13:03:22Z)
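The frequency-aware dynamic network entry above describes routing image content by its DCT-domain frequency. A minimal Python sketch of that routing idea follows; the energy measure, the 0.3 threshold, and the placeholder branches are assumptions for illustration, not the cited paper's actual design:

```python
# Minimal sketch of DCT-domain routing: smooth (low-frequency) blocks take a
# cheap path, textured (high-frequency) blocks take an expensive path.
import numpy as np
from scipy.fft import dctn

def high_freq_ratio(block: np.ndarray, keep: int = 2) -> float:
    # Fraction of DCT energy outside the top-left (low-frequency) corner.
    c = dctn(block, type=2, norm="ortho")
    total = float(np.sum(c ** 2)) + 1e-12
    low = float(np.sum(c[:keep, :keep] ** 2))
    return 1.0 - low / total

def cheap_branch(block: np.ndarray) -> np.ndarray:
    return block  # identity stands in for a lightweight operator

def expensive_branch(block: np.ndarray) -> np.ndarray:
    return block  # identity stands in for a heavy SR sub-network

def route(block: np.ndarray, threshold: float = 0.3) -> np.ndarray:
    # Dispatch on frequency content: textured blocks get the expensive path.
    if high_freq_ratio(block) > threshold:
        return expensive_branch(block)
    return cheap_branch(block)

smooth = np.outer(np.linspace(0.0, 1.0, 8), np.linspace(0.0, 1.0, 8))
textured = np.random.default_rng(0).standard_normal((8, 8))
print(high_freq_ratio(smooth), high_freq_ratio(textured))  # low vs. high
```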
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.