Efficient Portrait Matte Creation With Layer Diffusion and Connectivity Priors
- URL: http://arxiv.org/abs/2501.16147v1
- Date: Mon, 27 Jan 2025 15:41:19 GMT
- Title: Efficient Portrait Matte Creation With Layer Diffusion and Connectivity Priors
- Authors: Zhiyuan Lu, Hao Lu, Hua Huang
- Abstract summary: This work shows that one can leverage text prompts and the recent Layer Diffusion model to generate high-quality portrait foregrounds and extract latent portrait mattes.
A large-scale portrait matting dataset is created, termed LD-Portrait-20K, with $20,051$ portrait foregrounds and high-quality alpha mattes.
The dataset also contributes to state-of-the-art video portrait matting, implemented by simple video segmentation and a trimap-based image matting model trained on this dataset.
- Score: 16.916645195696137
- Abstract: Learning effective deep portrait matting models requires training data of both high quality and large quantity. Neither quality nor quantity is easily met for portrait matting, however. Since the most accurate ground-truth portrait mattes are acquired in front of a green screen, it is almost impossible to harvest a large-scale portrait matting dataset in reality. This work shows that one can leverage text prompts and the recent Layer Diffusion model to generate high-quality portrait foregrounds and extract latent portrait mattes. However, these portrait mattes cannot be used directly due to significant generation artifacts. Inspired by a connectivity prior observed in portrait images, namely that the border of a portrait foreground always appears connected, a connectivity-aware approach is introduced to refine the portrait mattes. Building on this, a large-scale portrait matting dataset is created, termed LD-Portrait-20K, with $20,051$ portrait foregrounds and high-quality alpha mattes. Extensive experiments demonstrate the value of the LD-Portrait-20K dataset, with models trained on it significantly outperforming those trained on other datasets. In addition, comparisons with a chroma keying algorithm and an ablation study on dataset capacity further confirm the effectiveness of the proposed matte creation approach. The dataset also contributes to state-of-the-art video portrait matting, implemented by simple video segmentation and a trimap-based image matting model trained on this dataset.
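To make the connectivity prior concrete, here is a minimal sketch of how such a refinement could look. The function name `refine_matte_connectivity` and the threshold parameter are hypothetical, and the logic (binarize the matte, keep the largest connected foreground component, zero out stray blobs) is a common baseline rather than the paper's exact algorithm:

```python
import numpy as np
from scipy import ndimage

def refine_matte_connectivity(alpha: np.ndarray, thresh: float = 0.1) -> np.ndarray:
    """Suppress disconnected generation artifacts in a soft alpha matte.

    Sketch of a connectivity prior: the portrait foreground is assumed to
    form one connected region, so every other blob of foreground pixels is
    treated as an artifact and zeroed out. Illustrative baseline only.
    """
    fg = alpha > thresh                # binarize the soft matte
    labels, n = ndimage.label(fg)      # label connected foreground components
    if n <= 1:
        return alpha                   # already a single component
    # keep the largest component, assumed to be the portrait subject
    sizes = ndimage.sum(fg, labels, index=range(1, n + 1))
    keep = 1 + int(np.argmax(sizes))
    return alpha * (labels == keep)    # zero out disconnected blobs
```

In practice the threshold and connectivity structure would need tuning; the point is only that a single-connected-foreground assumption gives a cheap test for generation artifacts.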
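Similarly, the video pipeline described above (segmentation followed by trimap-based matting) presupposes a way to turn a binary segmentation mask into a trimap. A standard heuristic, assumed here rather than taken from the paper, is morphological erosion and dilation; `trimap_from_mask` and the `band` width are illustrative names:

```python
import numpy as np
from scipy import ndimage

def trimap_from_mask(mask: np.ndarray, band: int = 10) -> np.ndarray:
    """Build a trimap (0 = background, 128 = unknown, 255 = foreground)
    from a binary segmentation mask via erosion/dilation.

    An assumed heuristic, not the paper's recipe: eroded pixels are
    confident foreground, and the dilated-minus-eroded ring is unknown.
    """
    mask = mask.astype(bool)
    fg = ndimage.binary_erosion(mask, iterations=band)    # sure foreground
    unknown = ndimage.binary_dilation(mask, iterations=band) & ~fg
    trimap = np.zeros(mask.shape, dtype=np.uint8)         # sure background
    trimap[unknown] = 128
    trimap[fg] = 255
    return trimap
```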
Related papers
- Adversarially-Guided Portrait Matting [0.0]
We present a method for generating alpha mattes using a limited data source.
We pretrain a novel transformer-based model (StyleMatte) on portrait datasets.
We utilize this model to provide image-mask pairs for the StyleGAN3-based network (StyleMatteGAN).
arXiv Detail & Related papers (2023-05-04T16:45:04Z)
- EasyPortrait -- Face Parsing and Portrait Segmentation Dataset [79.16635054977068]
Video conferencing apps have become more functional by incorporating computer vision-based features such as real-time background removal and face beautification.
We create a new dataset, EasyPortrait, for these tasks simultaneously.
It contains 40,000 primarily indoor photos replicating video meeting scenarios, with 13,705 unique users and fine-grained segmentation masks separated into 9 classes.
arXiv Detail & Related papers (2023-04-26T12:51:34Z)
- Self-supervised Matting-specific Portrait Enhancement and Generation [40.444011984347505]
We use StyleGAN to explore the latent space of GAN models.
We optimize multi-scale latent vectors in the latent spaces under four tailored losses.
We show that the proposed method can refine real portrait images for arbitrary matting models.
arXiv Detail & Related papers (2022-08-13T09:00:02Z)
- CtlGAN: Few-shot Artistic Portraits Generation with Contrastive Transfer Learning [77.27821665339492]
CtlGAN is a new few-shot artistic portraits generation model with a novel contrastive transfer learning strategy.
We adapt a pretrained StyleGAN in the source domain to a target artistic domain with no more than 10 artistic faces.
We propose a new encoder that embeds real faces into Z+ space, together with a dual-path training strategy to better cope with the adapted decoder.
arXiv Detail & Related papers (2022-03-16T13:28:17Z)
- Alpha Matte Generation from Single Input for Portrait Matting [79.62140902232628]
The goal is to predict an alpha matte that identifies the effect of each pixel on the foreground subject.
Traditional approaches and most existing works utilize an additional input, e.g., a trimap or background image, to predict the alpha matte.
We introduce an additional-input-free approach to portrait matting using Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2021-06-06T18:53:42Z)
- Privacy-Preserving Portrait Matting [73.98225485513905]
We present P3M-10k, the first large-scale anonymized benchmark for Privacy-Preserving Portrait Matting.
P3M-10k consists of 10,000 high-resolution face-blurred portrait images along with high-quality alpha mattes.
We propose P3M-Net, which leverages the power of a unified framework for both semantic perception and detail matting.
arXiv Detail & Related papers (2021-04-29T09:20:19Z)
- Smart Scribbles for Image Matting [90.18035889903909]
We propose an interactive framework, referred to as smart scribbles, that guides users to draw a few scribbles on the input images.
It infers the most informative regions of an image for drawing scribbles to indicate different categories.
It then spreads these scribbles to the rest of the image via our well-designed two-phase propagation.
arXiv Detail & Related papers (2021-03-31T13:30:49Z)
- Bridging Composite and Real: Towards End-to-end Deep Image Matting [88.79857806542006]
We study the roles of semantics and details for image matting.
We propose a novel Glance and Focus Matting network (GFM), which employs a shared encoder and two separate decoders.
Comprehensive empirical studies have demonstrated that GFM outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-10-30T10:57:13Z)