Chroma Intra Prediction with attention-based CNN architectures
- URL: http://arxiv.org/abs/2006.15349v1
- Date: Sat, 27 Jun 2020 12:11:17 GMT
- Title: Chroma Intra Prediction with attention-based CNN architectures
- Authors: Marc Górriz, Saverio Blasi, Alan F. Smeaton, Noel E. O'Connor, Marta Mrak
- Abstract summary: This paper proposes a new neural network architecture for cross-component intra-prediction.
The network uses a novel attention module to model spatial relations between reference and predicted samples.
- Score: 15.50693711359313
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Neural networks can be used in video coding to improve chroma
intra-prediction. In particular, usage of fully-connected networks has enabled
better cross-component prediction with respect to traditional linear models.
Nonetheless, state-of-the-art architectures tend to disregard the location of
individual reference samples in the prediction process. This paper proposes a
new neural network architecture for cross-component intra-prediction. The
network uses a novel attention module to model spatial relations between
reference and predicted samples. The proposed approach is integrated into the
Versatile Video Coding (VVC) prediction pipeline. Experimental results
demonstrate compression gains over the latest VVC anchor compared with
state-of-the-art chroma intra-prediction methods based on neural networks.
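
As a rough illustration of the idea in the abstract (weighting each reference sample differently for each predicted position), the following is a minimal hand-crafted sketch, not the paper's network. The function names, the luma-similarity scoring rule, and the `temperature` parameter are all assumptions made for illustration; the proposed architecture learns its attention weights with a CNN rather than using a fixed rule.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_chroma_predict(block_luma, ref_luma, ref_chroma, temperature=0.1):
    """Predict one chroma value per block position (illustrative only).

    block_luma: reconstructed luma values at the positions to predict.
    ref_luma, ref_chroma: luma and chroma values of the already-decoded
        boundary (reference) samples, in the same order.
    Assumed scoring rule: a reference whose luma is close to the luma at
    the predicted position receives a larger attention weight.
    """
    predictions = []
    for y in block_luma:
        # One score per reference sample, specific to this position.
        scores = [-abs(y - r) / temperature for r in ref_luma]
        weights = softmax(scores)
        # Prediction is an attention-weighted average of reference chroma.
        predictions.append(sum(w * c for w, c in zip(weights, ref_chroma)))
    return predictions


if __name__ == "__main__":
    # Toy example: two reference samples, two positions to predict.
    print(attention_chroma_predict([0.2, 0.8], [0.2, 0.8], [0.3, 0.7]))
```

Because the weights for each position sum to one, every prediction in this sketch is a convex combination of the reference chroma values, so it cannot extrapolate beyond them. A learned model is not restricted in this way.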
Related papers
- GINN-KAN: Interpretability pipelining with applications in Physics Informed Neural Networks [5.2969467015867915]
We introduce the concept of interpretability pipelining, which combines multiple interpretability techniques to outperform each individual technique.
We evaluate two recent models selected for their potential to incorporate interpretability into standard neural network architectures.
We introduce a novel interpretable neural network GINN-KAN that synthesizes the advantages of both models.
arXiv Detail & Related papers (2024-08-27T04:57:53Z)
- GNN-LoFI: a Novel Graph Neural Network through Localized Feature-based Histogram Intersection [51.608147732998994]
Graph neural networks are increasingly becoming the framework of choice for graph-based machine learning.
We propose a new graph neural network architecture that substitutes classical message passing with an analysis of the local distribution of node features.
arXiv Detail & Related papers (2024-01-17T13:04:23Z)
- Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.
Our approach is capable of encoding neural networks in a model zoo of mixed architecture.
We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
arXiv Detail & Related papers (2023-05-26T04:34:28Z)
- Predictive Coding Based Multiscale Network with Encoder-Decoder LSTM for Video Prediction [1.2537993038844142]
We present a multi-scale predictive coding model for future video frame prediction.
Our model employs a multi-scale (coarse-to-fine) approach in which the higher-level neurons generate coarser, lower-resolution predictions.
We propose several improvements to the training strategy to mitigate the accumulation of prediction errors in long-term prediction.
arXiv Detail & Related papers (2022-12-22T12:15:37Z)
- Pyramidal Predictive Network: A Model for Visual-frame Prediction Based on Predictive Coding Theory [1.4610038284393165]
We propose a novel neural network model for the task of visual-frame prediction.
The model is composed of a series of recurrent and convolutional units forming the top-down and bottom-up streams.
It learns to predict future frames in a visual sequence, with ConvLSTMs on each layer of the network making local predictions from the top down.
arXiv Detail & Related papers (2022-08-15T06:28:34Z)
- LHNN: Lattice Hypergraph Neural Network for VLSI Congestion Prediction [70.31656245793302]
The lattice hypergraph (LH-graph) is a novel graph formulation for circuits.
LHNN consistently achieves improvements of more than 35% in F1 score compared with U-nets and Pix2Pix.
arXiv Detail & Related papers (2022-03-24T03:31:18Z)
- Learning Cross-Scale Prediction for Efficient Neural Video Compression [30.051859347293856]
We present the first neural video codec that can compete with the latest coding standard H.266/VVC in terms of sRGB PSNR on the UVG dataset for the low-latency mode.
We propose a novel cross-scale prediction module that achieves more effective motion compensation.
arXiv Detail & Related papers (2021-12-26T03:12:17Z)
- CCasGNN: Collaborative Cascade Prediction Based on Graph Neural Networks [0.49269463638915806]
Cascade prediction aims at modeling information diffusion in the network.
Recent efforts have been devoted to combining network structure and sequence features using graph neural networks and recurrent neural networks.
We propose a novel method, CCasGNN, that considers the individual profile, structural features, and sequence information.
arXiv Detail & Related papers (2021-12-07T11:37:36Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Probabilistic Graph Attention Network with Conditional Kernels for Pixel-Wise Prediction [158.88345945211185]
We present a novel approach that advances the state of the art in pixel-level prediction in a fundamental aspect, namely the learning and fusion of structured multi-scale features.
We propose a probabilistic graph attention network structure based on a novel Attention-Gated Conditional Random Fields (AG-CRFs) model for learning and fusing multi-scale representations in a principled manner.
arXiv Detail & Related papers (2021-01-08T04:14:29Z)
- A Deep-Unfolded Reference-Based RPCA Network For Video Foreground-Background Separation [86.35434065681925]
This paper proposes a new deep-unfolding-based network design for the problem of Robust Principal Component Analysis (RPCA).
Unlike existing designs, our approach focuses on modeling the temporal correlation between the sparse representations of consecutive video frames.
Experimentation using the moving MNIST dataset shows that the proposed network outperforms a recently proposed state-of-the-art RPCA network in the task of video foreground-background separation.
arXiv Detail & Related papers (2020-10-02T11:40:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.