SGM-Net: Semantic Guided Matting Net
- URL: http://arxiv.org/abs/2208.07496v1
- Date: Tue, 16 Aug 2022 01:58:25 GMT
- Title: SGM-Net: Semantic Guided Matting Net
- Authors: Qing Song, Wenfeng Sun, Donghan Yang, Mengjie Hu, Chun Liu
- Abstract summary: We propose a module that generates a foreground probability map and add it to MODNet to obtain Semantic Guided Matting Net (SGM-Net).
With only a single image as input, the network can perform the human matting task.
- Score: 5.126872642595207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human matting refers to extracting human subjects from natural
images with high quality, preserving fine details such as hair, glasses, and
hats. This technology plays an essential role in image synthesis and visual
effects in the film industry. When a green screen is not available, existing
human matting methods either require additional inputs (such as a trimap or a
background image) or rely on models with high computational cost and complex
network structures, both of which make human matting difficult to apply in
practice. To alleviate these problems, most existing methods (such as MODNet)
use multiple branches to pave the way for matting through segmentation, but
these methods do not make full use of the image features and only use the
network's prediction results as guidance information. We therefore propose a
module that generates a foreground probability map and add it to MODNet to
obtain Semantic Guided Matting Net (SGM-Net). With only a single image as
input, our method can perform the human matting task. We verify our method on
the P3M-10k dataset. Compared with the benchmark, our method achieves
significant improvements on various evaluation metrics.
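To make the guidance idea concrete, below is a minimal sketch of how a foreground probability map predicted from backbone features could be fed into a matting decoder as an extra input channel. All module names, channel sizes, and the wiring are illustrative assumptions, not the authors' actual SGM-Net architecture.

```python
# Hypothetical sketch in the spirit of SGM-Net: a small head turns backbone
# features into a per-pixel foreground probability map, which is then
# concatenated into the matting decoder's input as explicit guidance.
# Module names and channel sizes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ForegroundProbHead(nn.Module):
    """Predicts a foreground probability map from backbone features."""
    def __init__(self, in_channels: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 1),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.conv(feats))  # probabilities in [0, 1]

class GuidedMattingDecoder(nn.Module):
    """Toy matting decoder conditioned on the guidance map."""
    def __init__(self, in_channels: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_channels + 1, 32, 3, padding=1),  # +1 guidance channel
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 1),
        )

    def forward(self, feats, fg_prob):
        # Resize guidance to the feature resolution, then concatenate it as
        # an extra channel so the decoder sees it directly.
        fg_prob = F.interpolate(fg_prob, size=feats.shape[-2:],
                                mode="bilinear", align_corners=False)
        alpha = self.conv(torch.cat([feats, fg_prob], dim=1))
        return torch.sigmoid(alpha)

feats = torch.randn(1, 256, 64, 64)                 # stand-in backbone features
fg_prob = ForegroundProbHead(256)(feats)            # semantic guidance map
alpha = GuidedMattingDecoder(256)(feats, fg_prob)   # predicted alpha matte
print(alpha.shape)  # torch.Size([1, 1, 64, 64])
```

Conditioning the decoder on an explicit probability map, rather than only on a segmentation branch's final prediction, is the kind of feature-level guidance the abstract argues for.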
Related papers
- Fine-Grained Multi-View Hand Reconstruction Using Inverse Rendering [11.228453237603834]
We present a novel fine-grained multi-view hand mesh reconstruction method that leverages inverse rendering to restore hand poses and intricate details.
We also introduce a novel Hand Albedo and Mesh (HAM) optimization module to refine both the hand mesh and textures.
Our proposed approach outperforms the state-of-the-art methods on both reconstruction accuracy and rendering quality.
arXiv Detail & Related papers (2024-07-08T07:28:24Z)
- HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos [52.23323966700072]
We present a framework for acquiring human avatars that are attached with high-resolution physically-based material textures and mesh from monocular video.
Our method introduces a novel information fusion strategy to combine the information from the monocular video and synthesize virtual multi-view images.
Experiments show that our approach outperforms previous representations in terms of fidelity, and the explicit triangular-mesh output supports deployment in common renderers.
arXiv Detail & Related papers (2024-05-18T11:49:09Z)
- Towards Label-Efficient Human Matting: A Simple Baseline for Weakly Semi-Supervised Trimap-Free Human Matting [50.99997483069828]
We introduce a new learning paradigm, weakly semi-supervised human matting (WSSHM).
WSSHM uses a small amount of expensive matte labels and a large amount of budget-friendly segmentation labels to reduce annotation cost and resolve the domain generalization problem.
Our training method is also readily applicable to real-time models, achieving competitive accuracy with very fast inference speed.
arXiv Detail & Related papers (2024-04-01T04:53:06Z)
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments [72.6405488990753]
Self-supervised learning can mitigate the data-hungry nature of Vision Transformer networks.
We propose a single-stage and standalone method, MOCA, which unifies both desired properties.
We achieve new state-of-the-art results on low-shot settings and strong experimental results in various evaluation protocols.
arXiv Detail & Related papers (2023-07-18T15:46:20Z)
- Robust Human Matting via Semantic Guidance [35.374012964806745]
We develop a fast yet accurate human matting framework, named Semantic Guided Human Matting (SGHM).
It builds on a semantic human segmentation network and introduces a lightweight matting module with only marginal computational cost.
Our experiments show that trained with merely 200 matting images, our method can generalize well to real-world datasets.
arXiv Detail & Related papers (2022-10-11T07:25:33Z)
- PP-Matting: High-Accuracy Natural Image Matting [11.68134059283327]
PP-Matting is a trimap-free architecture that can achieve high-accuracy natural image matting.
Our method applies a high-resolution detail branch (HRDB) that extracts fine-grained details of the foreground.
Also, we propose a semantic context branch (SCB) that adopts a semantic segmentation subtask.
arXiv Detail & Related papers (2022-04-20T12:54:06Z)
- Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction [18.14237514372724]
We propose a Virtual Multi-modality Foreground Matting (VMFM) method to learn the foreground of human-object interactions.
VMFM requires no additional inputs such as a trimap or a known background.
We reformulate foreground matting as a self-supervised multi-modality problem.
arXiv Detail & Related papers (2021-10-07T09:03:01Z)
- Hand Image Understanding via Deep Multi-Task Learning [34.515382305252814]
We propose a novel Hand Image Understanding (HIU) framework to extract comprehensive information of the hand object from a single RGB image.
Our method significantly outperforms the state-of-the-art approaches on various widely-used datasets.
arXiv Detail & Related papers (2021-07-24T16:28:06Z)
- Deep Automatic Natural Image Matting [82.56853587380168]
Automatic image matting (AIM) refers to estimating the soft foreground from an arbitrary natural image without any auxiliary input like trimap.
We propose a novel end-to-end matting network, which can predict a generalized trimap for any image of the above types as a unified semantic representation.
Our network trained on available composite matting datasets outperforms existing methods both objectively and subjectively.
arXiv Detail & Related papers (2021-07-15T10:29:01Z)
- Improved Image Matting via Real-time User Clicks and Uncertainty Estimation [87.84632514927098]
This paper proposes an improved deep image matting framework that is trimap-free and needs only a few user clicks to eliminate ambiguity.
We introduce a new uncertainty estimation module that predicts which parts need polishing, together with a subsequent local refinement module.
Results show that our method performs better than existing trimap-free methods and comparably to state-of-the-art trimap-based methods with minimal user effort.
arXiv Detail & Related papers (2020-12-15T14:32:36Z)
- Bridging Composite and Real: Towards End-to-end Deep Image Matting [88.79857806542006]
We study the roles of semantics and details for image matting.
We propose a novel Glance and Focus Matting network (GFM), which employs a shared encoder and two separate decoders (a toy sketch of this two-decoder design appears after this list).
Comprehensive empirical studies have demonstrated that GFM outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-10-30T10:57:13Z)
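As a companion to the GFM entry above, here is a minimal, hypothetical sketch of a shared encoder with two decoders: a "glance" decoder that predicts coarse background/transition/foreground semantics, and a "focus" decoder that predicts fine alpha values used only inside the transition region. Layer shapes and the fusion rule are illustrative assumptions, not the paper's exact architecture.

```python
# Hypothetical shared-encoder / two-decoder matting sketch in the spirit of
# the GFM entry above. The fusion trusts the glance decoder where it is
# confident (foreground/background) and the focus decoder in the transition
# band. All layer shapes are illustrative assumptions.
import torch
import torch.nn as nn

class TwoDecoderMatting(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        # Glance decoder: 3 classes (background / transition / foreground).
        self.glance = nn.Conv2d(16, 3, 1)
        # Focus decoder: fine alpha values for the transition region.
        self.focus = nn.Conv2d(16, 1, 1)

    def forward(self, image):
        feats = self.encoder(image)
        seg = self.glance(feats).softmax(dim=1)   # B x 3 x H x W
        detail = self.focus(feats).sigmoid()      # B x 1 x H x W
        bg, transition, fg = seg[:, 0:1], seg[:, 1:2], seg[:, 2:3]
        # Hard foreground contributes 1, background 0, and the uncertain
        # transition band takes its alpha from the detail decoder.
        alpha = fg + transition * detail
        return alpha.clamp(0, 1)

model = TwoDecoderMatting()
alpha = model(torch.randn(1, 3, 64, 64))
print(alpha.shape)  # torch.Size([1, 1, 64, 64])
```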
This list is automatically generated from the titles and abstracts of the papers in this site.