Real-Time High-Resolution Background Matting
- URL: http://arxiv.org/abs/2012.07810v1
- Date: Mon, 14 Dec 2020 18:43:32 GMT
- Title: Real-Time High-Resolution Background Matting
- Authors: Shanchuan Lin, Andrey Ryabtsev, Soumyadip Sengupta, Brian Curless,
Steve Seitz, and Ira Kemelmacher-Shlizerman
- Abstract summary: We introduce a real-time, high-resolution background replacement technique which operates at 30 fps in 4K resolution, and 60 fps for HD on a modern GPU.
Our approach yields higher quality results compared to the previous state-of-the-art in background matting, while simultaneously yielding a dramatic boost in both speed and resolution.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a real-time, high-resolution background replacement technique
which operates at 30 fps at 4K resolution, and 60 fps for HD, on a modern GPU. Our
technique is based on background matting, where an additional frame of the
background is captured and used in recovering the alpha matte and the
foreground layer. The main challenge is to compute a high-quality alpha matte,
preserving strand-level hair details, while processing high-resolution images
in real time. To achieve this goal, we employ two neural networks: a base
network computes a low-resolution result, which is refined by a second network
operating at high resolution on selective patches. We introduce two large-scale
video and image matting datasets: VideoMatte240K and PhotoMatte13K/85. Our
approach yields higher-quality results than the previous state of the art
in background matting, while simultaneously delivering a dramatic boost in
both speed and resolution.
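The two-network pipeline described in the abstract (a coarse base pass, then high-resolution refinement of selected patches) can be sketched roughly as follows. The background-difference heuristic, the alpha-uncertainty error measure, and the smoothing step are illustrative placeholders standing in for the paper's learned networks; this is not the authors' implementation.

```python
import numpy as np

def base_pass(image_lr, background_lr):
    """Stand-in for the low-resolution base network: a naive
    background-difference heuristic, purely for illustration."""
    diff = np.abs(image_lr.astype(np.float32) - background_lr.astype(np.float32))
    alpha_coarse = np.clip(diff.mean(axis=-1) / 255.0 * 4.0, 0.0, 1.0)
    # Error map: uncertainty is highest where alpha is neither 0 nor 1,
    # i.e. along soft boundaries such as hair strands.
    error_map = alpha_coarse * (1.0 - alpha_coarse)
    return alpha_coarse, error_map

def select_patches(error_map, patch=8, k=4):
    """Pick the k patch locations with the largest mean error."""
    h, w = error_map.shape
    scores = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            scores.append((error_map[y:y + patch, x:x + patch].mean(), y, x))
    scores.sort(reverse=True)
    return [(y, x) for _, y, x in scores[:k]]

def refine(alpha, patches, patch=8):
    """Stand-in for the high-resolution refinement network: smooth
    each selected patch (a real system would run a CNN here)."""
    out = alpha.copy()
    for y, x in patches:
        region = out[y:y + patch, x:x + patch]
        out[y:y + patch, x:x + patch] = (region + region.mean()) / 2.0
    return out

# Toy 32x32 frame: a bright square "subject" over a dark background,
# with one semi-transparent fringe row to give the error map something to find.
bg = np.zeros((32, 32, 3), dtype=np.uint8)
img = bg.copy()
img[8:24, 8:24] = 200
img[7, 8:24] = 40
alpha, err = base_pass(img, bg)
refined = refine(alpha, select_patches(err))
print(refined.shape)  # (32, 32)
```

Only the uncertain patches are revisited at full resolution, which is what lets the real system keep 4K throughput: most of the frame is settled by the cheap low-resolution pass.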
Related papers
- Elevating Flow-Guided Video Inpainting with Reference Generation [50.03502211226332]
Video inpainting (VI) is a challenging task that requires effective propagation of observable content across frames while simultaneously generating new content not present in the original video.
We propose a robust and practical VI framework that leverages a large generative model for reference generation in combination with an advanced pixel propagation algorithm.
Our method not only significantly enhances frame-level quality for object removal but also synthesizes new content in the missing areas based on user-provided text prompts.
arXiv Detail & Related papers (2024-12-12T06:13:00Z)
- VISION-XL: High Definition Video Inverse Problem Solver using Latent Image Diffusion Models [58.464465016269614]
In this paper, we propose a framework for solving high-definition video inverse problems using latent image diffusion models.
Our approach leverages latent-space diffusion models to achieve enhanced video quality and resolution.
Unlike previous methods, our approach supports multiple aspect ratios and delivers HD-resolution reconstructions in under 2.5 minutes on a single GPU.
arXiv Detail & Related papers (2024-11-29T08:10:49Z)
- Hierarchical Patch Diffusion Models for High-Resolution Video Generation [50.42746357450949]
We develop deep context fusion, which propagates context information from low-scale to high-scale patches in a hierarchical manner.
We also propose adaptive computation, which allocates more network capacity and computation towards coarse image details.
The resulting model sets a new state-of-the-art FVD score of 66.32 and Inception Score of 87.68 in class-conditional video generation.
arXiv Detail & Related papers (2024-06-12T01:12:53Z)
- DART: Depth-Enhanced Accurate and Real-Time Background Matting [11.78381754863757]
Matting with a static background, often referred to as "Background Matting" (BGM), has garnered significant attention within the computer vision community.
We leverage the rich depth information provided by the RGB-Depth (RGB-D) cameras to enhance background matting performance in real-time.
arXiv Detail & Related papers (2024-02-24T14:10:17Z)
- Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography [54.36608424943729]
We show that in a "long-burst", forty-two 12-megapixel RAW frames captured in a two-second sequence, there is enough parallax information from natural hand tremor alone to recover high-quality scene depth.
We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
arXiv Detail & Related papers (2022-12-22T18:54:34Z)
- SwiftSRGAN -- Rethinking Super-Resolution for Efficient and Real-time Inference [0.0]
We present an architecture that is faster and smaller in terms of its memory footprint.
Real-time super-resolution enables streaming high-resolution media content even under poor bandwidth conditions.
arXiv Detail & Related papers (2021-11-29T04:20:15Z)
- Robust High-Resolution Video Matting with Temporal Guidance [14.9739044990367]
We introduce a robust, real-time, high-resolution human video matting method that achieves new state-of-the-art performance.
Our method is much lighter than previous approaches and can process 4K at 76 FPS and HD at 104 FPS on an Nvidia GTX 1080Ti GPU.
arXiv Detail & Related papers (2021-08-25T23:48:15Z)
- Single image deep defocus estimation and its applications [82.93345261434943]
We train a deep neural network to classify image patches into one of 20 levels of blurriness.
The trained model is used to determine the patch blurriness which is then refined by applying an iterative weighted guided filter.
The result is a defocus map that carries the information of the degree of blurriness for each pixel.
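The patch-wise pipeline this summary describes can be sketched with a hand-crafted sharpness proxy in place of the paper's trained CNN classifier; the Laplacian-variance score and the quantisation scheme below are illustrative assumptions, not the authors' method, and the iterative guided-filter refinement is omitted.

```python
import numpy as np

def laplacian_var(patch):
    # Variance of a 4-neighbour Laplacian: a classic hand-crafted
    # sharpness proxy standing in for the paper's trained classifier.
    p = patch.astype(np.float32)
    lap = (-4.0 * p[1:-1, 1:-1] + p[:-2, 1:-1] + p[2:, 1:-1]
           + p[1:-1, :-2] + p[1:-1, 2:])
    return lap.var()

def defocus_map(image, patch=8, levels=20):
    # Quantise per-patch sharpness into `levels` blurriness bins
    # (level 0 = sharpest, levels - 1 = most blurred).
    h, w = image.shape
    rows, cols = h // patch, w // patch
    scores = np.zeros((rows, cols), dtype=np.float32)
    for i in range(rows):
        for j in range(cols):
            scores[i, j] = laplacian_var(
                image[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch])
    ranks = scores.max() - scores  # invert so sharp patches map to level 0
    if ranks.max() > 0:
        return np.minimum((ranks / ranks.max() * levels).astype(np.int64),
                          levels - 1)
    return np.zeros((rows, cols), dtype=np.int64)

# Toy image: noisy (sharp) top half, flat (blur-like) bottom half.
rng = np.random.default_rng(0)
img = np.zeros((32, 32))
img[:16] = rng.integers(0, 255, size=(16, 32))
dmap = defocus_map(img)
print(dmap.shape)  # (4, 4)
```

In this toy setup the noisy top half lands in the sharp bins and the flat bottom half in the most-blurred bin, mirroring how a per-patch classification is assembled into a defocus map.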
arXiv Detail & Related papers (2021-07-30T06:18:16Z)
- Robust Consistent Video Depth Estimation [65.53308117778361]
We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.
Our algorithm combines two complementary techniques: (1) flexible deformation-splines for low-frequency large-scale alignment and (2) geometry-aware depth filtering for high-frequency alignment of fine depth details.
In contrast to prior approaches, our method does not require camera poses as input and achieves robust reconstruction for challenging hand-held cell phone captures containing a significant amount of noise, shake, motion blur, and rolling shutter deformations.
arXiv Detail & Related papers (2020-12-10T18:59:48Z)
- High-Resolution Deep Image Matting [39.72708676319803]
HDMatt is the first deep-learning-based image matting approach designed for high-resolution inputs.
Our proposed method sets new state-of-the-art performance on Adobe Image Matting and AlphaMatting benchmarks.
arXiv Detail & Related papers (2020-09-14T17:53:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.