Boosting Semantic Human Matting with Coarse Annotations
- URL: http://arxiv.org/abs/2004.04955v1
- Date: Fri, 10 Apr 2020 09:11:02 GMT
- Title: Boosting Semantic Human Matting with Coarse Annotations
- Authors: Jinlin Liu, Yuan Yao, Wendi Hou, Miaomiao Cui, Xuansong Xie, Changshui
Zhang, Xian-sheng Hua
- Abstract summary: A coarse annotated human dataset is much easier to acquire and collect from public datasets.
A matting refinement network takes in the unified mask and the input image to predict the final alpha matte.
- Score: 66.8725980604434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic human matting aims to estimate the per-pixel opacity of the
foreground human regions. It is quite challenging and usually requires user
interactive trimaps and plenty of high quality annotated data. Annotating such
kind of data is labor intensive and requires great skills beyond normal users,
especially considering the very detailed hair parts of humans. In contrast,
coarse annotated human datasets are much easier to acquire and collect from
public datasets. In this paper, we propose to use coarse annotated data coupled
with fine annotated data to boost end-to-end semantic human matting without
trimaps as extra input. Specifically, we train a mask prediction network to
estimate the coarse semantic mask using the hybrid data, and then propose a
quality unification network to unify the quality of the previous coarse mask
outputs. A matting refinement network takes in the unified mask and the input
image to predict the final alpha matte. The collected coarse annotated data
significantly enriches our dataset, allowing us to generate high quality alpha
mattes for real images. Experimental results show that the proposed method
performs comparably to state-of-the-art methods. Moreover, the proposed method
can be used to refine coarse annotated public datasets, as well as the outputs
of semantic segmentation methods, which greatly reduces the cost of annotating
high quality human data.
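The three-stage pipeline described in the abstract can be sketched as a simple data flow: a mask prediction network produces a coarse semantic mask, a quality unification network normalizes its quality, and a matting refinement network predicts the final alpha matte. The sketch below illustrates only the shapes and stage order; the placeholder function bodies (thresholding, box blur, clipping) are illustrative stand-ins and are not the paper's trained CNNs. The final compositing equation I = αF + (1 − α)B is the standard alpha matting relation.

```python
import numpy as np

def mask_prediction_network(image):
    """Stage 1: estimate a coarse semantic mask from the RGB image.
    Placeholder: thresholded brightness stands in for a trained CNN."""
    gray = image.mean(axis=-1)
    return (gray > 0.5).astype(np.float32)

def quality_unification_network(coarse_mask):
    """Stage 2: unify the quality of coarse masks.
    Placeholder: a 3x3 box blur stands in for the learned unification."""
    padded = np.pad(coarse_mask, 1, mode="edge")
    out = np.zeros_like(coarse_mask)
    h, w = coarse_mask.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + 3, j:j + 3].mean()
    return out

def matting_refinement_network(image, unified_mask):
    """Stage 3: predict the final alpha matte from image + unified mask.
    Placeholder: clipping to [0, 1] stands in for the refinement CNN."""
    return np.clip(unified_mask, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3)).astype(np.float32)

coarse = mask_prediction_network(image)          # (8, 8) binary mask
unified = quality_unification_network(coarse)    # (8, 8) smoothed mask
alpha = matting_refinement_network(image, unified)  # (8, 8) alpha in [0, 1]

# Standard alpha compositing: I = alpha * F + (1 - alpha) * B, per pixel.
background = np.zeros_like(image)
composite = alpha[..., None] * image + (1 - alpha[..., None]) * background
```

In the actual method each stage is a trained network and the first stage is trained on the hybrid of coarse and fine annotations; the sketch only fixes the interfaces between the stages.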
Related papers
- Boosting General Trimap-free Matting in the Real-World Image [0.0]
We propose a network called Multi-Feature fusion-based Coarse-to-fine Network (MFC-Net).
Our method is significantly effective on both synthetic and real-world images, and its performance on real-world datasets is far better than that of existing trimap-free methods.
arXiv Detail & Related papers (2024-05-28T07:37:44Z)
- SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation [69.42764583465508]
We explore the potential of generative image diffusion to address the scarcity of annotated data in earth observation tasks.
To the best of our knowledge, we are the first to generate both images and corresponding masks for satellite segmentation.
arXiv Detail & Related papers (2024-03-25T10:30:22Z)
- RADiff: Controllable Diffusion Models for Radio Astronomical Maps Generation [6.128112213696457]
RADiff is a generative approach based on conditional diffusion models trained over an annotated radio dataset.
We show that it is possible to generate fully-synthetic image-annotation pairs to automatically augment any annotated dataset.
arXiv Detail & Related papers (2023-07-05T16:04:44Z)
- DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models [68.21154597227165]
We show that it is possible to automatically obtain accurate semantic masks of synthetic images generated by the Off-the-shelf Stable Diffusion model.
Our approach, called DiffuMask, exploits the potential of the cross-attention map between text and image.
arXiv Detail & Related papers (2023-03-21T08:43:15Z)
- Mask-Guided Image Person Removal with Data Synthesis [11.207512995742999]
We propose a novel idea to tackle these problems from the perspective of data synthesis.
Concerning the lack of dedicated dataset for image person removal, two dataset production methods are proposed to automatically generate images, masks and ground truths respectively.
A learning framework similar to local image degradation is proposed so that the masks can be used to guide the feature extraction process and more texture information can be gathered for final prediction.
arXiv Detail & Related papers (2022-09-29T15:58:17Z)
- Alpha Matte Generation from Single Input for Portrait Matting [79.62140902232628]
The goal is to predict an alpha matte that identifies the effect of each pixel on the foreground subject.
Traditional approaches and most of the existing works utilized an additional input, e.g., trimap, background image, to predict alpha matte.
We introduce an additional-input-free approach to perform portrait matting using Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2021-06-06T18:53:42Z)
- Human De-occlusion: Invisible Perception and Recovery for Humans [26.404444296924243]
We tackle the problem of human de-occlusion which reasons about occluded segmentation masks and invisible appearance content of humans.
In particular, a two-stage framework is proposed to estimate the invisible portions and recover the content inside.
Our method outperforms the state-of-the-art techniques in both tasks of mask completion and content recovery.
arXiv Detail & Related papers (2021-03-22T05:54:58Z)
- Mask-based Data Augmentation for Semi-supervised Semantic Segmentation [3.946367634483361]
We propose a new approach for data augmentation, termed ComplexMix, which incorporates aspects of CutMix and ClassMix with improved performance.
The proposed approach has the ability to control the complexity of the augmented data while attempting to be semantically-correct.
Experimental results show that our method yields improvement over state-of-the-art methods on standard datasets for semantic image segmentation.
arXiv Detail & Related papers (2021-01-25T15:09:34Z)
- Adversarial Semantic Data Augmentation for Human Pose Estimation [96.75411357541438]
We propose Semantic Data Augmentation (SDA), a method that augments images by pasting segmented body parts with various semantic granularity.
We also propose Adversarial Semantic Data Augmentation (ASDA), which exploits a generative network to dynamically predict tailored pasting configurations.
State-of-the-art results are achieved on challenging benchmarks.
arXiv Detail & Related papers (2020-08-03T07:56:04Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To reduce the resulting data volume, we propose to apply a dataset distillation strategy that compresses the created dataset into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.