Attention-Based Neural Networks for Chroma Intra Prediction in Video Coding
- URL: http://arxiv.org/abs/2102.04993v1
- Date: Tue, 9 Feb 2021 18:01:22 GMT
- Title: Attention-Based Neural Networks for Chroma Intra Prediction in Video Coding
- Authors: Marc Górriz, Saverio Blasi, Alan F. Smeaton, Noel E. O'Connor, Marta Mrak
- Abstract summary: This work focuses on reducing the complexity of attention-based architectures for chroma intra-prediction.
A novel size-agnostic multi-model approach is proposed to reduce the complexity of the inference process.
A collection of simplifications is presented to further reduce the complexity overhead of the proposed prediction architecture.
- Score: 13.638411611516172
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Neural networks can be successfully used to improve several modules of
advanced video coding schemes. In particular, compression of colour components
was shown to greatly benefit from usage of machine learning models, thanks to
the design of appropriate attention-based architectures that allow the
prediction to exploit specific samples in the reference region. However, such
architectures tend to be complex and computationally intensive, and may be
difficult to deploy in a practical video coding pipeline. This work focuses on
reducing the complexity of such methodologies, to design a set of simplified
and cost-effective attention-based architectures for chroma intra-prediction. A
novel size-agnostic multi-model approach is proposed to reduce the complexity
of the inference process. The resulting simplified architecture is still
capable of outperforming state-of-the-art methods. Moreover, a collection of
simplifications is presented in this paper to further reduce the complexity
overhead of the proposed prediction architecture. Thanks to these
simplifications, a reduction in the number of parameters of around 90% is
achieved with respect to the original attention-based methodologies.
Simplifications include a framework for reducing the overhead of the
convolutional operations, a simplified cross-component processing model
integrated into the original architecture, and a methodology to perform
integer-precision approximations with the aim of obtaining fast and hardware-aware
implementations. The proposed schemes are integrated into the Versatile Video
Coding (VVC) prediction pipeline, retaining the compression efficiency of
state-of-the-art chroma intra-prediction methods based on neural networks,
while offering different directions for significantly reducing coding
complexity.
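To make the kind of module described above concrete, the following is a minimal sketch of attention-based chroma intra prediction in PyTorch: each chroma position of the current block attends over the boundary reference samples, and the attended reference features are mapped to a Cb/Cr prediction. All module names, layer sizes and tensor shapes are illustrative assumptions; this is not the authors' simplified architecture, multi-model scheme, or integer-precision implementation.

# Minimal sketch (illustrative assumptions, not the authors' architecture) of
# attention-based chroma intra prediction: each chroma position attends over
# the L-shaped boundary reference samples of the block.
import torch
import torch.nn as nn

class ChromaAttentionSketch(nn.Module):
    def __init__(self, d_model: int = 16):
        super().__init__()
        # Reference branch: embeds each boundary sample (assumed Y, Cb, Cr triplet).
        self.ref_branch = nn.Linear(3, d_model)
        # Luma branch: embeds each co-located reconstructed luma sample of the block.
        self.luma_branch = nn.Linear(1, d_model)
        # Prediction head: attended reference features -> (Cb, Cr).
        self.head = nn.Linear(d_model, 2)

    def forward(self, ref_samples: torch.Tensor, rec_luma: torch.Tensor) -> torch.Tensor:
        # ref_samples: (B, R, 3) boundary reference samples; R may vary with block size
        # rec_luma:    (B, H, W) reconstructed luma block, downsampled to the chroma grid
        b, h, w = rec_luma.shape
        keys = self.ref_branch(ref_samples)                          # (B, R, d)
        queries = self.luma_branch(rec_luma.reshape(b, h * w, 1))    # (B, HW, d)
        # Attention weights: how strongly each chroma position uses each reference sample.
        attn = torch.softmax(
            queries @ keys.transpose(1, 2) / keys.shape[-1] ** 0.5, dim=-1
        )                                                            # (B, HW, R)
        pred = self.head(attn @ keys)                                # (B, HW, 2)
        return pred.reshape(b, h, w, 2).permute(0, 3, 1, 2)          # (B, 2, H, W)

# Toy usage: an 8x8 chroma block with 33 boundary reference samples.
model = ChromaAttentionSketch()
pred_cbcr = model(torch.rand(1, 33, 3), torch.rand(1, 8, 8))
print(pred_cbcr.shape)  # torch.Size([1, 2, 8, 8])

This sketch is size-agnostic only in the sense that neither the number of reference samples nor the block dimensions are fixed at construction time; the paper's multi-model design, simplified cross-component processing and integer-only approximations are not reflected here.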
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Mechanistic Design and Scaling of Hybrid Architectures [114.3129802943915]
We identify and test new hybrid architectures constructed from a variety of computational primitives.
We experimentally validate the resulting architectures via an extensive compute-optimal and a new state-optimal scaling law analysis.
We find MAD synthetics to correlate with compute-optimal perplexity, enabling accurate evaluation of new architectures.
arXiv Detail & Related papers (2024-03-26T16:33:12Z)
- Neural Architecture Codesign for Fast Bragg Peak Analysis [1.7081438846690533]
We develop an automated pipeline to streamline neural architecture codesign for fast, real-time Bragg peak analysis in microscopy.
Our method employs neural architecture search and AutoML to enhance these models while accounting for hardware costs, leading to the discovery of more hardware-efficient neural architectures.
arXiv Detail & Related papers (2023-12-10T19:42:18Z)
- Deep Equilibrium Assisted Block Sparse Coding of Inter-dependent Signals: Application to Hyperspectral Imaging [71.57324258813675]
A dataset of inter-dependent signals is defined as a matrix whose columns demonstrate strong dependencies.
A neural network is employed to act as structure prior and reveal the underlying signal interdependencies.
Deep unrolling and deep equilibrium-based algorithms are developed, forming highly interpretable and concise deep-learning-based architectures.
arXiv Detail & Related papers (2022-03-29T21:00:39Z)
- SIRe-Networks: Skip Connections over Interlaced Multi-Task Learning and Residual Connections for Structure Preserving Object Classification [28.02302915971059]
In this paper, we introduce an interlaced multi-task learning strategy, named SIRe, to mitigate the vanishing gradient problem in the object classification task.
The presented methodology directly improves a convolutional neural network (CNN) by enforcing preservation of the input image structure through auto-encoders.
To validate the presented methodology, a simple CNN and various implementations of famous networks are extended via the SIRe strategy and extensively tested on the CIFAR100 dataset.
arXiv Detail & Related papers (2021-10-06T13:54:49Z)
- Improved CNN-based Learning of Interpolation Filters for Low-Complexity Inter Prediction in Video Coding [5.46121027847413]
This paper introduces a novel explainable neural network-based inter-prediction scheme.
A novel training framework enables each network branch to resemble a specific fractional shift.
When implemented in the context of the Versatile Video Coding (VVC) test model, 0.77%, 1.27% and 2.25% BD-rate savings can be achieved.
arXiv Detail & Related papers (2021-06-16T16:48:01Z)
- Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression [56.83861738731913]
Deep Neural Network (DNN) models are essential for practical applications, especially on resource-limited devices.
Previous unstructured or structured weight pruning methods rarely deliver real inference acceleration.
We propose a generalized weight unification framework at a hardware-compatible micro-structured level to achieve a high degree of compression and acceleration.
arXiv Detail & Related papers (2021-06-15T17:22:59Z)
- MetaSDF: Meta-learning Signed Distance Functions [85.81290552559817]
Generalizing across shapes with neural implicit representations amounts to learning priors over the respective function space.
We formalize learning of a shape space as a meta-learning problem and leverage gradient-based meta-learning algorithms to solve this task.
arXiv Detail & Related papers (2020-06-17T05:14:53Z)
- Analytic Simplification of Neural Network based Intra-Prediction Modes for Video Compression [10.08097582267397]
This paper presents two ways to derive simplified intra-prediction from learnt models.
It shows that these streamlined techniques can lead to efficient compression solutions.
arXiv Detail & Related papers (2020-04-23T10:25:54Z)
- HCM: Hardware-Aware Complexity Metric for Neural Network Architectures [6.556553154231475]
This paper introduces a hardware-aware complexity metric that aims to assist system designers of neural network architectures.
We demonstrate how the proposed metric can help evaluate different design alternatives of neural network models on resource-restricted devices.
arXiv Detail & Related papers (2020-04-19T16:42:51Z)
- Structured Sparsification with Joint Optimization of Group Convolution and Channel Shuffle [117.95823660228537]
We propose a novel structured sparsification method for efficient network compression.
The proposed method automatically induces structured sparsity on the convolutional weights.
We also address the problem of inter-group communication with a learnable channel shuffle mechanism.
arXiv Detail & Related papers (2020-02-19T12:03:10Z)