Distortion-Aware Loop Filtering of Intra 360° Video Coding with
Equirectangular Projection
- URL: http://arxiv.org/abs/2202.09802v1
- Date: Sun, 20 Feb 2022 12:00:18 GMT
- Title: Distortion-Aware Loop Filtering of Intra 360° Video Coding with
Equirectangular Projection
- Authors: Pingping Zhang, Xu Wang, Linwei Zhu, Yun Zhang, Shiqi Wang, Sam Kwong
- Abstract summary: We propose a distortion-aware loop filtering model to improve the performance of intra coding for 360° videos projected in the equirectangular projection (ERP) format.
Our proposed module analyzes content characteristics based on a coding unit (CU) partition mask and processes them through partial convolution to activate the specified area.
- Score: 81.63407194858854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a distortion-aware loop filtering model to improve
the performance of intra coding for 360° videos projected in the
equirectangular projection (ERP) format. To make the model aware of distortion,
our proposed module analyzes content characteristics based on a coding unit
(CU) partition mask and processes them through partial convolution to activate
the specified area. The feature recalibration module, which leverages cascaded
residual channel-wise attention blocks (RCABs) to adjust the inter-channel and
intra-channel features automatically, is capable of adapting to different
quality levels. The perceptual geometry optimization, which combines weighted
mean squared error (WMSE) with the perceptual loss, guarantees high-quality
reconstruction of both the local field of view (FoV) and the global image.
Extensive experimental results show that our proposed scheme achieves
significant bitrate savings compared with the anchor (HM + 360Lib), with
average bit rate reductions of 8.9%, 9.0%, 7.1% and 7.4% in terms of PSNR,
WPSNR, and PSNR of two viewports for the luminance component of 360° videos,
respectively.
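The abstract does not spell out how the WMSE term is weighted. A minimal sketch, assuming the latitude-dependent cosine weights commonly used for WS-PSNR on ERP frames (rows near the poles are oversampled, so they count less), might look like the following. The function names `erp_row_weights`, `weighted_mse` and `ws_psnr` are hypothetical and are not taken from the paper or from 360Lib.

```python
import math


def erp_row_weights(height):
    """Assumed per-row weights cos(latitude) for an ERP frame with `height` rows."""
    return [math.cos((i + 0.5 - height / 2) * math.pi / height)
            for i in range(height)]


def weighted_mse(ref, rec):
    """Weighted MSE between two equally sized 2-D luma frames (lists of rows)."""
    weights = erp_row_weights(len(ref))
    num = 0.0  # sum of weight * squared error
    den = 0.0  # sum of weights, used for normalization
    for w, ref_row, rec_row in zip(weights, ref, rec):
        for a, b in zip(ref_row, rec_row):
            num += w * (a - b) ** 2
            den += w
    return num / den


def ws_psnr(ref, rec, max_value=255.0):
    """Weighted PSNR in dB derived from the weighted MSE above."""
    wmse = weighted_mse(ref, rec)
    if wmse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / wmse)


if __name__ == "__main__":
    # Toy 4x4 frames: the same error placed near a pole vs. near the equator.
    ref = [[100] * 4 for _ in range(4)]
    pole = [row[:] for row in ref]
    pole[0] = [110] * 4
    equator = [row[:] for row in ref]
    equator[1] = [110] * 4
    print(ws_psnr(ref, pole), ws_psnr(ref, equator))
```

Under these assumed weights, an error of the same magnitude is penalized less near the poles than near the equator, which is the sense in which such a loss is aware of the ERP distortion.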
Related papers
- Simple Baselines for Projection-based Full-reference and No-reference
Point Cloud Quality Assessment [60.2709006613171]
We propose simple baselines for projection-based point cloud quality assessment (PCQA).
We use multi-projections obtained via a common cube-like projection process from the point clouds for both full-reference (FR) and no-reference (NR) PCQA tasks.
Taking part in the ICIP 2023 PCVQA Challenge, we succeeded in achieving the top spot in four out of the five competition tracks.
arXiv Detail & Related papers (2023-10-26T04:42:57Z)
- ConsistentNeRF: Enhancing Neural Radiance Fields with 3D Consistency for
Sparse View Synthesis [99.06490355990354]
We propose ConsistentNeRF, a method that leverages depth information to regularize both multi-view and single-view 3D consistency among pixels.
Our approach can considerably enhance model performance in sparse view conditions, achieving improvements of up to 94% in PSNR, in SSIM, and 31% in LPIPS.
arXiv Detail & Related papers (2023-05-18T15:18:01Z)
- FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization [73.41395947275473]
We propose a novel frequency-aware architecture, in which the domain-specific features are filtered out in the transformed frequency domain.
Experiments on three benchmarks demonstrate significant performance, outperforming the state-of-the-art methods by a margin of 3%, 4% and 9%, respectively.
arXiv Detail & Related papers (2022-03-24T07:26:29Z)
- Video Coding for Machines with Feature-Based Rate-Distortion
Optimization [7.804710977378487]
With the steady improvement of neural networks, more and more multimedia data is not observed by humans anymore.
We propose a standard-compliant feature-based RDO (FRDO) that is designed to increase the coding performance.
We compare the proposed FRDO and its hybrid version HFRDO with different distortion measures in the feature space against the conventional RDO.
arXiv Detail & Related papers (2022-03-11T12:49:50Z)
- Recursive Self-Improvement for Camera Image and Signal Processing
Pipeline [6.318974730864278]
Current camera image and signal processing pipelines (ISPs) tend to apply a single filter uniformly to the entire image.
This is despite the fact that most acquired camera images have spatially heterogeneous artifacts.
We present a deep reinforcement learning model that works in learned latent subspaces.
arXiv Detail & Related papers (2021-11-15T02:23:40Z)
- A Global Appearance and Local Coding Distortion based Fusion Framework
for CNN based Filtering in Video Coding [15.778380865885842]
In-loop filtering is used in video coding to process the reconstructed frame in order to remove blocking artifacts.
In this paper, we address the filtering problem from two aspects: global appearance restoration for disrupted texture, and restoration of local coding distortion caused by the fixed coding pipeline.
A three-stream global appearance and local coding distortion based fusion network is developed with a high-level global feature stream, a high-level local feature stream and a low-level local feature stream.
arXiv Detail & Related papers (2021-06-24T03:08:44Z)
- DeepCompress: Efficient Point Cloud Geometry Compression [1.808877001896346]
We propose a more efficient deep learning-based encoder architecture for point cloud compression.
We show that incorporating the learned activation function from Efficient Neural Image Compression (CENIC) yields dramatic gains in efficiency and performance.
Our proposed modifications outperform the baseline approaches by a small margin in terms of Bjontegard delta rate and PSNR values.
arXiv Detail & Related papers (2021-06-02T23:18:11Z)
- Robust 360-8PA: Redesigning The Normalized 8-point Algorithm for 360-FoV
Images [53.11097060367591]
We present a novel strategy for estimating an essential matrix from 360-FoV images in spherical projection.
We show that our normalization can increase the camera pose accuracy by about 20% without significant time overhead.
arXiv Detail & Related papers (2021-04-22T07:23:11Z)
- Reduced Reference Perceptual Quality Model and Application to Rate
Control for 3D Point Cloud Compression [61.110938359555895]
In rate-distortion optimization, the encoder settings are determined by maximizing a reconstruction quality measure subject to a constraint on the bit rate.
We propose a linear perceptual quality model whose variables are the V-PCC geometry and color quantization parameters.
Subjective quality tests with 400 compressed 3D point clouds show that the proposed model correlates well with the mean opinion score.
We show that for the same target bit rate, rate-distortion optimization based on the proposed model offers higher perceptual quality than rate-distortion optimization based on exhaustive search with a point-to-point objective quality metric.
arXiv Detail & Related papers (2020-11-25T12:42:02Z)
- ODE-CNN: Omnidirectional Depth Extension Networks [43.40308168978984]
We propose a low-cost 3D sensing system that combines an omnidirectional camera with a calibrated projective depth camera.
To accurately recover the missing depths, we design an omnidirectional depth extension convolutional neural network.
ODE-CNN significantly outperforms other state-of-the-art (SoTA) methods, with a relative 33% reduction in depth error.
arXiv Detail & Related papers (2020-07-03T03:14:09Z)