Efficient Motion Modelling with Variable-sized blocks from Hierarchical
Cuboidal Partitioning
- URL: http://arxiv.org/abs/2208.13137v1
- Date: Sun, 28 Aug 2022 04:13:58 GMT
- Authors: Priyabrata Karmakar, Manzur Murshed, Manoranjan Paul, David Taubman
- Abstract summary: Motion modelling with block-based architecture has been widely used in video coding where a frame is divided into fixed-sized blocks that are motion compensated independently.
We have investigated the potential of cuboids in motion modelling against the fixed-sized blocks used in scalable video coding.
- Score: 24.100530697346155
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motion modelling with block-based architecture has been widely used in video
coding where a frame is divided into fixed-sized blocks that are motion
compensated independently. This often leads to coding inefficiency as
fixed-sized blocks hardly align with the object boundaries. Although
hierarchical block-partitioning has been introduced to address this, the
increased number of motion vectors limits the benefit. Recently, approximate
segmentation of images with cuboidal partitioning has gained popularity. Not
only are the variable-sized rectangular segments (cuboids) readily amenable to
block-based image/video coding techniques, but they are also capable of
aligning well with the object boundaries. This is because cuboidal partitioning
is based on a homogeneity constraint, minimising the sum of squared errors
(SSE). In this paper, we have investigated the potential of cuboids in motion
modelling against the fixed-sized blocks used in scalable video coding.
Specifically, we have constructed a motion-compensated current frame using the
cuboidal partitioning information of the anchor frame in a group of pictures
(GOP). The predicted current frame has then been used as the base layer while
encoding the current frame as an enhancement layer using the scalable HEVC
encoder. Experimental results confirm 6.71%-10.90% bitrate savings on 4K video
sequences.
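
The abstract describes cuboidal partitioning only as a hierarchical split into variable-sized rectangles driven by an SSE-minimising homogeneity constraint. As a rough illustration of that idea, here is a minimal greedy sketch in Python. It is not the authors' implementation: the function names (`sse`, `best_split`, `cuboidal_partition`), the greedy choice of which cuboid to split next, and the stopping rule are all assumptions made for this example.

```python
import numpy as np


def sse(block):
    """Sum of squared errors of a block about its own mean."""
    return float(((block - block.mean()) ** 2).sum())


def best_split(img, y0, y1, x0, x1):
    """Best axis-aligned cut of img[y0:y1, x0:x1].

    Returns (cost, axis, position), where cost is the summed SSE of the
    two halves, or None if the region is a single pixel.
    """
    best = None
    for y in range(y0 + 1, y1):  # horizontal cuts
        cost = sse(img[y0:y, x0:x1]) + sse(img[y:y1, x0:x1])
        if best is None or cost < best[0]:
            best = (cost, 0, y)
    for x in range(x0 + 1, x1):  # vertical cuts
        cost = sse(img[y0:y1, x0:x]) + sse(img[y0:y1, x:x1])
        if best is None or cost < best[0]:
            best = (cost, 1, x)
    return best


def cuboidal_partition(img, n_cuboids):
    """Greedily split a 2-D image into at most n_cuboids rectangles.

    Each step splits the cuboid whose best cut reduces total SSE the most
    (an assumed policy). Cuboids are (y0, y1, x0, x1), end-exclusive.
    """
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    cuboids = [(0, h, 0, w)]
    while len(cuboids) < n_cuboids:
        best_gain, best_idx, best_cut = 0.0, None, None
        for i, (y0, y1, x0, x1) in enumerate(cuboids):
            cut = best_split(img, y0, y1, x0, x1)
            if cut is None:
                continue
            gain = sse(img[y0:y1, x0:x1]) - cut[0]
            if gain > best_gain:
                best_gain, best_idx, best_cut = gain, i, cut
        if best_idx is None:  # every cuboid is already homogeneous
            break
        y0, y1, x0, x1 = cuboids[best_idx]
        _, axis, pos = best_cut
        if axis == 0:
            halves = [(y0, pos, x0, x1), (pos, y1, x0, x1)]
        else:
            halves = [(y0, y1, x0, pos), (y0, y1, pos, x1)]
        cuboids[best_idx:best_idx + 1] = halves
    return cuboids
```

On a 4x6 image whose last two columns differ from the rest, the first cut falls on the column boundary, which is the sense in which such cuboids can follow object edges where a fixed block grid would not. The exhaustive search is written for clarity rather than speed; a practical version would use integral images to evaluate each candidate cut in constant time.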
Related papers
- Object Segmentation-Assisted Inter Prediction for Versatile Video Coding [53.91821712591901]
We propose an object segmentation-assisted inter prediction method (SAIP), where objects in the reference frames are segmented by some advanced technologies.
With a proper indication, the object segmentation mask is translated from the reference frame to the current frame as the arbitrary-shaped partition of different regions.
We show that the proposed method achieves up to 1.98%, 1.14%, 0.79%, and on average 0.82%, 0.49%, 0.37% BD-rate reduction for common test sequences.
(2024-03-18T11:48:20Z)
- IBVC: Interpolation-driven B-frame Video Compression [68.18440522300536]
B-frame video compression aims to adopt bi-directional motion estimation and motion compensation (MEMC) coding for middle frame reconstruction.
Previous learned approaches often directly extend neural P-frame codecs to B-frame relying on bi-directional optical-flow estimation.
We propose a simple yet effective structure called Interpolation-B-frame Video Compression (IBVC) to address these issues.
(2023-09-25T02:45:51Z)
- H-VFI: Hierarchical Frame Interpolation for Videos with Large Motions [63.23985601478339]
We propose a simple yet effective solution, H-VFI, to deal with large motions in video frame interpolation.
H-VFI contributes a hierarchical video transformer to learn a deformable kernel in a coarse-to-fine strategy.
The advantage of such a progressive approximation is that the large-motion frame interpolation problem can be decomposed into several relatively simpler sub-tasks.
(2022-11-21T09:49:23Z)
- Neighbor Correspondence Matching for Flow-based Video Frame Synthesis [90.14161060260012]
We introduce a neighbor correspondence matching (NCM) algorithm for flow-based frame synthesis.
NCM is performed in a current-frame-agnostic fashion to establish multi-scale correspondences in the spatial-temporal neighborhoods of each pixel.
The coarse-scale module is designed to leverage neighbor correspondences to capture large motion, while the fine-scale module is more efficient to speed up the estimation process.
(2022-07-14T09:17:00Z)
- Contributions to interframe coding [0.0]
We propose a new approach to reduce the number of vectors, using different block sizes as a function of the local characteristics of the image.
A second algorithm is proposed for an inter/intraframe coder.
(2022-03-31T10:36:25Z)
- Exploring Motion Ambiguity and Alignment for High-Quality Video Frame Interpolation [46.02120172459727]
We propose to relax the requirement of reconstructing an intermediate frame as close to the ground-truth (GT) as possible.
We develop a texture consistency loss (TCL) upon the assumption that the interpolated content should maintain structures similar to their counterparts in the given frames.
(2022-03-19T10:37:06Z)
- Self-Supervised Learning of Perceptually Optimized Block Motion Estimates for Video Compression [50.48504867843605]
We propose a search-free block motion estimation framework using a multi-stage convolutional neural network.
We deploy the multi-scale structural similarity (MS-SSIM) loss function to optimize the perceptual quality of the motion compensated predicted frames.
(2021-10-05T03:38:43Z)
- Efficient Video Object Segmentation with Compressed Video [36.192735485675286]
We propose an efficient framework for semi-supervised video object segmentation by exploiting the temporal redundancy of the video.
Our method performs inference on selected keyframes and makes predictions for other frames via propagation based on motion vectors and residuals from the compressed video bitstream.
With STM with top-k filtering as our base model, we achieved highly competitive results on DAVIS16 and YouTube-VOS with substantial speedups of up to 4.9X with little loss in accuracy.
(2021-07-26T12:57:04Z)
- SegBlocks: Block-Based Dynamic Resolution Networks for Real-Time Segmentation [47.338987325018614]
SegBlocks dynamically adjusts the processing resolution of image regions based on their complexity.
A lightweight policy network, selecting the complex regions, is trained using reinforcement learning.
Our method reduces the number of floating-point operations of SwiftNet-RN18 by 60% and increases the inference speed by 50%.
(2020-11-24T11:05:07Z)
- Blurry Video Frame Interpolation [57.77512131536132]
We propose a blurry video frame interpolation method to reduce motion blur and up-convert frame rate simultaneously.
Specifically, we develop a pyramid module to cyclically synthesize clear intermediate frames.
Our method performs favorably against state-of-the-art methods.
(2020-02-27T17:00:26Z)