Human De-occlusion: Invisible Perception and Recovery for Humans
- URL: http://arxiv.org/abs/2103.11597v1
- Date: Mon, 22 Mar 2021 05:54:58 GMT
- Title: Human De-occlusion: Invisible Perception and Recovery for Humans
- Authors: Qiang Zhou, Shiyin Wang, Yitong Wang, Zilong Huang, Xinggang Wang
- Abstract summary: We tackle the problem of human de-occlusion, which reasons about occluded segmentation masks and the invisible appearance content of humans.
In particular, a two-stage framework is proposed to estimate the invisible portions and recover the content inside.
Our method outperforms state-of-the-art techniques in both mask completion and content recovery.
- Score: 26.404444296924243
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this paper, we tackle the problem of human de-occlusion, which
reasons about occluded segmentation masks and the invisible appearance content
of humans. In particular, a two-stage framework is proposed to estimate the
invisible portions and recover the content inside. For the mask completion
stage, a stacked network structure is devised to refine inaccurate masks from a
general instance segmentation model and predict integrated masks simultaneously.
Additionally, guidance from human parsing and typical pose masks is leveraged
to provide prior information. For the content recovery stage, a novel
parsing-guided attention module is applied to isolate body parts and capture
context information across multiple scales. Besides, an Amodal Human Perception
dataset (AHP) is collected to support the task of human de-occlusion. AHP
provides annotations from real-world scenes and contains comparatively more
humans than other amodal perception datasets. Based on this dataset,
experiments demonstrate that our method outperforms state-of-the-art techniques
in both mask completion and content recovery. Our AHP dataset is available at
https://sydney0zq.github.io/ahp/.
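The relationship between the visible mask, the completed (amodal) mask, and the invisible portion can be illustrated with a small sketch. This is not the paper's implementation: the function names `invisible_mask` and `iou`, the toy 3x4 masks, and the use of intersection-over-union as the comparison score are all assumptions made for illustration. The sketch only shows that the invisible region is the part of the full mask not covered by the visible mask, and how a predicted mask might be compared with an annotated one.

```python
# Illustrative sketch only; masks are lists of rows holding 0/1 values.
# All names and values here are hypothetical, not taken from the paper.

def invisible_mask(amodal, visible):
    """Return the pixels that are in the full (amodal) mask but not visible."""
    return [[int(a == 1 and v == 0) for a, v in zip(a_row, v_row)]
            for a_row, v_row in zip(amodal, visible)]

def iou(pred, target):
    """Intersection over union of two binary masks of equal shape."""
    pairs = [(p, t) for p_row, t_row in zip(pred, target)
             for p, t in zip(p_row, t_row)]
    inter = sum(1 for p, t in pairs if p == 1 and t == 1)
    union = sum(1 for p, t in pairs if p == 1 or t == 1)
    # Two empty masks are treated as a perfect match.
    return inter / union if union else 1.0

# Toy example: a person whose lower body is hidden by an occluder.
visible = [[0, 1, 1, 0],
           [0, 1, 0, 0],
           [0, 0, 0, 0]]
amodal_gt = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 1, 1, 0]]
# A hypothetical prediction that misses one occluded pixel.
amodal_pred = [[0, 1, 1, 0],
               [0, 1, 1, 0],
               [0, 1, 0, 0]]

invisible_gt = invisible_mask(amodal_gt, visible)
invisible_pred = invisible_mask(amodal_pred, visible)

print("IoU on full masks:", iou(amodal_pred, amodal_gt))
print("IoU on invisible region only:", iou(invisible_pred, invisible_gt))
```

Scoring the invisible region separately is a design choice in this sketch: a prediction can score well on the full mask simply by copying the visible part, so the invisible-only score isolates how well the hidden portion was estimated.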
Related papers
- Referring Human Pose and Mask Estimation in the Wild [57.12038065541915]
We introduce Referring Human Pose and Mask Estimation (R-HPM) in the wild.
This task holds significant potential for human-centric applications such as assistive robotics and sports analysis.
We propose the first end-to-end promptable approach named UniPHD for R-HPM.
arXiv Detail & Related papers (2024-10-27T16:44:15Z)
- ProFD: Prompt-Guided Feature Disentangling for Occluded Person Re-Identification [34.38227097059117]
We propose a Prompt-guided Feature Disentangling method (ProFD) to generate well-aligned part features.
ProFD first designs part-specific prompts and utilizes noisy segmentation masks to preliminarily align visual and textual embeddings.
We employ a self-distillation strategy, retaining pre-trained knowledge of CLIP to mitigate over-fitting.
arXiv Detail & Related papers (2024-09-30T08:31:14Z)
- Pluralistic Salient Object Detection [108.74650817891984]
We introduce pluralistic salient object detection (PSOD), a novel task aimed at generating multiple plausible salient segmentation results for a given input image.
We present two new SOD datasets "DUTS-MM" and "DUS-MQ", along with newly designed evaluation metrics.
arXiv Detail & Related papers (2024-09-04T01:38:37Z)
- Object-level Scene Deocclusion [92.39886029550286]
We present a new self-supervised PArallel visible-to-COmplete diffusion framework, named PACO, for object-level scene deocclusion.
To train PACO, we create a large-scale dataset with 500k samples to enable self-supervised learning.
Experiments on COCOA and various real-world scenes demonstrate the superior capability of PACO for scene deocclusion, surpassing the state of the art by a large margin.
arXiv Detail & Related papers (2024-06-11T20:34:10Z)
- HAISTA-NET: Human Assisted Instance Segmentation Through Attention [3.073046540587735]
We propose a novel approach to enable more precise predictions and generate higher-quality segmentation masks.
Our human-assisted segmentation model, HAISTA-NET, augments the existing Strong Mask R-CNN network to incorporate human-specified partial boundaries.
We show that HAISTA-NET outperforms state-of-the-art methods such as Mask R-CNN, Strong Mask R-CNN, and Mask2Former.
arXiv Detail & Related papers (2023-05-04T18:39:14Z)
- Dynamic Prototype Mask for Occluded Person Re-Identification [88.7782299372656]
Existing methods mainly address this issue by employing body clues provided by an extra network to distinguish the visible part.
We propose a novel Dynamic Prototype Mask (DPM) based on two pieces of self-evident prior knowledge.
Under this condition, the occluded representation could be well aligned in a selected subspace spontaneously.
arXiv Detail & Related papers (2022-07-19T03:31:13Z)
- FaceOcc: A Diverse, High-quality Face Occlusion Dataset for Human Face Extraction [3.8502825594372703]
Occlusions often occur in face images in the wild, troubling face-related tasks such as landmark detection, 3D reconstruction, and face recognition.
This paper proposes a novel face segmentation dataset with manually labeled face occlusions from the CelebA-HQ and the internet.
We trained a straightforward face segmentation model but obtained SOTA performance, convincingly demonstrating the effectiveness of the proposed dataset.
arXiv Detail & Related papers (2022-01-20T19:44:18Z)
- Boosting Semantic Human Matting with Coarse Annotations [66.8725980604434]
A coarsely annotated human dataset is much easier to acquire and collect from public datasets.
A matting refinement network takes in the unified mask and the input image to predict the final alpha matte.
arXiv Detail & Related papers (2020-04-10T09:11:02Z)
- Self-Supervised Scene De-occlusion [186.89979151728636]
This paper investigates the problem of scene de-occlusion, which aims to recover the underlying occlusion ordering and complete the invisible parts of occluded objects.
We make the first attempt to address the problem through a novel and unified framework that recovers hidden scene structures without ordering or amodal annotations as supervision.
Based on PCNet-M and PCNet-C, we devise a novel inference scheme to accomplish scene de-occlusion, via progressive ordering recovery, amodal completion and content completion.
arXiv Detail & Related papers (2020-04-06T16:31:11Z)