Combining Progressive Rethinking and Collaborative Learning: A Deep
Framework for In-Loop Filtering
- URL: http://arxiv.org/abs/2001.05651v3
- Date: Wed, 31 Mar 2021 09:33:27 GMT
- Title: Combining Progressive Rethinking and Collaborative Learning: A Deep
Framework for In-Loop Filtering
- Authors: Dezhao Wang, Sifeng Xia, Wenhan Yang, and Jiaying Liu
- Abstract summary: We design a deep network with both progressive rethinking and collaborative learning mechanisms to improve the quality of reconstructed intra-frames and inter-frames.
Our PRN with intra-frame side information provides a 9.0% BD-rate reduction on average compared to the HEVC baseline under the All-intra (AI) configuration.
- Score: 67.22506488158707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we aim to address issues of (1) joint spatial-temporal
modeling and (2) side information injection for deep-learning based in-loop
filter. For (1), we design a deep network with both progressive rethinking and
collaborative learning mechanisms to improve the quality of the reconstructed
intra-frames and inter-frames, respectively. For intra coding, a Progressive
Rethinking Network (PRN) is designed to simulate the human decision mechanism
for effective spatial modeling. Our designed block introduces an additional
inter-block connection that bypasses the high-dimensional informative feature
before the bottleneck module across blocks, so the network can review its
complete past memorized experience and rethink progressively. For inter coding, the current
reconstructed frame interacts with reference frames (peak quality frame and the
nearest adjacent frame) collaboratively at the feature level. For (2), we
extract both intra-frame and inter-frame side information for better context
modeling. A coarse-to-fine partition map based on HEVC partition trees is built
as the intra-frame side information. Furthermore, the warped features of the
reference frames are offered as the inter-frame side information. Our PRN with
intra-frame side information provides 9.0% BD-rate reduction on average
compared to the HEVC baseline under the All-intra (AI) configuration. Under the
Low-Delay B (LDB), Low-Delay P (LDP) and Random Access (RA) configurations, our
PRN with inter-frame side information provides 9.0%, 10.6% and 8.0% BD-rate
reductions on average, respectively. Our project webpage is
https://dezhao-wang.github.io/PRN-v2/.
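To make the progressive rethinking mechanism concrete, below is a minimal PyTorch sketch of a residual block with an extra inter-block "memory" connection that forwards the high-dimensional feature produced before the bottleneck to the next block. The `RethinkingBlock` name, channel widths, and layer counts are illustrative assumptions, not the authors' exact architecture.

```python
# Hedged sketch: a residual block that also passes its pre-bottleneck feature
# to the next block, in the spirit of the progressive rethinking mechanism.
import torch
import torch.nn as nn


class RethinkingBlock(nn.Module):
    """Residual block that additionally forwards a high-dimensional 'memory' feature."""

    def __init__(self, channels=64, memory_channels=128):
        super().__init__()
        # Feature extraction producing a high-dimensional feature map.
        self.body = nn.Sequential(
            nn.Conv2d(channels + memory_channels, memory_channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(memory_channels, memory_channels, 3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Bottleneck compressing the high-dimensional feature back to `channels`.
        self.bottleneck = nn.Conv2d(memory_channels, channels, kernel_size=1)

    def forward(self, x, memory):
        # `memory` is the pre-bottleneck feature bypassed from the previous block
        # (the inter-block connection); it is concatenated before the block body.
        feat = self.body(torch.cat([x, memory], dim=1))
        out = x + self.bottleneck(feat)  # residual connection inside the block
        return out, feat                 # `feat` becomes the next block's memory


if __name__ == "__main__":
    blocks = nn.ModuleList([RethinkingBlock() for _ in range(4)])
    x = torch.randn(1, 64, 32, 32)
    memory = torch.zeros(1, 128, 32, 32)  # empty memory for the first block
    for blk in blocks:
        x, memory = blk(x, memory)
    print(x.shape)  # torch.Size([1, 64, 32, 32])
```

Chaining several such blocks lets later blocks see the uncompressed features of earlier ones rather than only their bottlenecked outputs, which is the "review past experience and rethink" idea described above.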
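The inter-frame side information is described as warped features of the reference frames. The sketch below shows one common way to realize such warping, a backward grid sample driven by a per-pixel motion field, and concatenates the result with the current frame's features; the flow estimator and the fusion network are omitted, and the function name is hypothetical.

```python
# Hedged sketch: warp reference-frame features toward the current frame with a
# motion field and inject them as extra channels (inter-frame side information).
import torch
import torch.nn.functional as F


def warp_features(ref_feat, flow):
    """Backward-warp reference features (B, C, H, W) with a flow (B, 2, H, W)."""
    b, _, h, w = ref_feat.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=ref_feat.dtype),
        torch.arange(w, dtype=ref_feat.dtype),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0).to(ref_feat.device)  # (1, 2, H, W)
    coords = grid + flow
    # Normalize sampling coordinates to [-1, 1] as expected by grid_sample.
    coords_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    sample_grid = torch.stack((coords_x, coords_y), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(ref_feat, sample_grid, align_corners=True)


cur_feat = torch.randn(1, 64, 32, 32)   # features of the current reconstructed frame
ref_feat = torch.randn(1, 64, 32, 32)   # features of a reference frame
flow = torch.zeros(1, 2, 32, 32)        # motion field from current frame to reference
fused = torch.cat([cur_feat, warp_features(ref_feat, flow)], dim=1)  # side-info injection
print(fused.shape)  # torch.Size([1, 128, 32, 32])
```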
Related papers
- Unite-Divide-Unite: Joint Boosting Trunk and Structure for High-accuracy
Dichotomous Image Segmentation [48.995367430746086]
High-accuracy Dichotomous Image Segmentation (DIS) aims to pinpoint category-agnostic foreground objects from natural scenes.
We introduce a novel Unite-Divide-Unite Network (UDUN) that restructures and bipartitely arranges complementary features to boost the effectiveness of trunk and structure identification.
Using 1024*1024 input, our model enables real-time inference at 65.3 fps with ResNet-18.
arXiv Detail & Related papers (2023-07-26T09:04:35Z)
- Learning Target-aware Representation for Visual Tracking via Informative Interactions [49.552877881662475]
We introduce a novel backbone architecture to improve the target-perception ability of feature representations for tracking.
The proposed GIM module and InBN mechanism are general and applicable to different backbone types including CNN and Transformer.
arXiv Detail & Related papers (2022-01-07T16:22:27Z)
- Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring [36.94523101375519]
We propose a deep Recurrent Neural Network with Multi-scale Bi-directional Propagation (RNN-MBP) to propagate and gather information from unaligned neighboring frames for better video deblurring.
To better evaluate the proposed algorithm and existing state-of-the-art methods on real-world blurry scenes, we also create a Real-World Blurry Video dataset.
The proposed algorithm performs favorably against the state-of-the-art methods on three typical benchmarks.
arXiv Detail & Related papers (2021-12-09T11:02:56Z)
- Asymmetric Bilateral Motion Estimation for Video Frame Interpolation [50.44508853885882]
We propose a novel video frame interpolation algorithm based on asymmetric bilateral motion estimation (ABME).
First, we predict symmetric bilateral motion fields to interpolate an anchor frame.
Second, we estimate asymmetric bilateral motion fields from the anchor frame to the input frames.
Third, we use the asymmetric fields to warp the input frames backward and reconstruct the intermediate frame.
arXiv Detail & Related papers (2021-08-15T21:11:35Z)
- Group-based Bi-Directional Recurrent Wavelet Neural Networks for Video Super-Resolution [4.9136996406481135]
Video super-resolution (VSR) aims to estimate a high-resolution (HR) frame from low-resolution (LR) frames.
The key challenge for VSR lies in effectively exploiting the spatial correlation within a frame and the temporal dependency between consecutive frames.
arXiv Detail & Related papers (2021-06-14T06:36:13Z)
- PDWN: Pyramid Deformable Warping Network for Video Interpolation [11.62213584807003]
We propose a light but effective model, called Pyramid Deformable Warping Network (PDWN).
PDWN uses a pyramid structure to generate DConv offsets of the unknown middle frame with respect to the known frames through coarse-to-fine successive refinements.
Our method achieves better or on-par accuracy compared to state-of-the-art models on multiple datasets.
arXiv Detail & Related papers (2021-04-04T02:08:57Z)
- Multi-Stage Progressive Image Restoration [167.6852235432918]
We propose a novel synergistic design that can optimally balance the competing goals of preserving spatial details and aggregating contextual information.
Our main proposal is a multi-stage architecture that progressively learns restoration functions for the degraded inputs.
The resulting tightly interlinked multi-stage architecture, named MPRNet, delivers strong performance gains on ten datasets.
arXiv Detail & Related papers (2021-02-04T18:57:07Z)
- ASAP-Net: Attention and Structure Aware Point Cloud Sequence Segmentation [49.15948235059343]
We further improve spatio-temporal point cloud feature learning with a flexible module called ASAP.
Our ASAP module contains an attentive temporal embedding layer to fuse the relatively informative local features across frames in a recurrent fashion.
We show the generalization ability of the proposed ASAP module with different computation backbone networks for point cloud sequence segmentation.
arXiv Detail & Related papers (2020-08-12T07:37:16Z)