End-to-end One-shot Human Parsing
- URL: http://arxiv.org/abs/2105.01241v3
- Date: Sun, 24 Sep 2023 05:03:21 GMT
- Title: End-to-end One-shot Human Parsing
- Authors: Haoyu He, Bohan Zhuang, Jing Zhang, Jianfei Cai, Dacheng Tao
- Abstract summary: One-shot human parsing (OSHP) task requires parsing humans into an open set of classes defined by any test example.
An End-to-end One-shot human Parsing Network (EOP-Net) is proposed.
EOP-Net outperforms representative one-shot segmentation models by large margins.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous human parsing methods are limited to parsing humans into pre-defined
classes, which is inflexible for practical fashion applications that often have
new fashion item classes. In this paper, we define a novel one-shot human
parsing (OSHP) task that requires parsing humans into an open set of classes
defined by any test example. During training, only base classes are exposed,
which only overlap with part of the test-time classes. To address three main
challenges in OSHP, i.e., small sizes, testing bias, and similar parts, we
devise an End-to-end One-shot human Parsing Network (EOP-Net). Firstly, an
end-to-end human parsing framework is proposed to parse the query image into
both coarse-grained and fine-grained human classes, which embeds rich semantic
information that is shared across different granularities to identify the
small-sized human classes. Then, we gradually smooth the training-time static
prototypes to obtain robust class representations. Moreover, we employ a dynamic
objective that encourages the network to enhance the features' representational
capability in the early training phase while improving their transferability in
the late training phase. Therefore, our method can quickly
adapt to the novel classes and mitigate the testing bias issue. In addition, we
add a contrastive loss at the prototype level to enforce inter-class distances,
thereby discriminating the similar parts. For comprehensive evaluations on the
new task, we tailor three existing popular human parsing benchmarks to the OSHP
task. Experiments demonstrate that EOP-Net outperforms representative one-shot
segmentation models by large margins and serves as a strong baseline for
further research. The source code is available at
https://github.com/Charleshhy/One-shot-Human-Parsing.
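The "gradual smoothing of training-time static prototypes" described above reads like an exponential-moving-average (EMA) update of per-class prototypes computed by masked average pooling over features. A minimal NumPy sketch of that idea, assuming an EMA form and a momentum of 0.9 (both illustrative choices, not confirmed details of EOP-Net):

```python
import numpy as np

def masked_average_pool(features, mask):
    """Average the feature vectors (N, D) selected by a binary class mask (N,)."""
    mask = mask.astype(bool)
    if not mask.any():
        return np.zeros(features.shape[1])
    return features[mask].mean(axis=0)

def ema_update(prototype, batch_prototype, momentum=0.9):
    """Gradually smooth a static class prototype toward the current batch estimate."""
    return momentum * prototype + (1.0 - momentum) * batch_prototype

# Illustrative use: refresh a class prototype from one batch of pixel features.
features = np.array([[1.0, 1.0], [3.0, 3.0], [5.0, 5.0]])
mask = np.array([1, 1, 0])                      # pixels belonging to the class
batch_proto = masked_average_pool(features, mask)
prototype = ema_update(np.zeros(2), batch_proto)
```

Higher momentum makes the prototype change more slowly across batches, which is one plausible reading of "gradually smooth ... to get robust class representations."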
Related papers
- Harmonizing Base and Novel Classes: A Class-Contrastive Approach for Generalized Few-Shot Segmentation [78.74340676536441]
We propose a class contrastive loss and a class relationship loss to regulate prototype updates and encourage a large distance between prototypes.
Our proposed approach achieves new state-of-the-art performance for the generalized few-shot segmentation task on PASCAL VOC and MS COCO datasets.
arXiv Detail & Related papers (2023-03-24T00:30:25Z)
- Dual Prototypical Contrastive Learning for Few-shot Semantic Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task.
The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in the prototype feature space.
We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
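The inter-class/intra-class distance idea above can be illustrated with a toy prototype-level loss: pull each feature toward its own class prototype while pushing distinct prototypes apart. The hinge margin and exact formulation below are assumptions for illustration, not the paper's actual loss:

```python
import numpy as np

def prototype_contrastive_loss(features, labels, prototypes, margin=1.0):
    """Toy loss combining an intra-class pull term and an inter-class push term.

    features  : (N, D) feature vectors
    labels    : (N,)  class index of each feature
    prototypes: (C, D) one prototype per class
    """
    # Intra-class: mean squared distance from each feature to its class prototype.
    intra = np.mean(np.sum((features - prototypes[labels]) ** 2, axis=1))
    # Inter-class: hinge penalty when two prototypes are closer than the margin.
    inter = 0.0
    n = len(prototypes)
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(prototypes[i] - prototypes[j])
            inter += max(0.0, margin - d) ** 2
    return intra + inter
```

When every feature sits exactly on its prototype and all prototypes are farther apart than the margin, the loss is zero; either clustering features tighter or spreading prototypes farther reduces it, mirroring the inter-/intra-class distance objective described above.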
arXiv Detail & Related papers (2021-11-09T08:14:50Z)
- Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing [131.97475877877608]
A new bottom-up regime is proposed to learn category-level human semantic segmentation and multi-person pose estimation in a joint and end-to-end manner.
It is a compact, efficient and powerful framework that exploits structural information over different human granularities.
Experiments on three instance-aware human datasets show that our model outperforms other bottom-up alternatives with much more efficient inference.
arXiv Detail & Related papers (2021-03-08T06:55:00Z)
- Nondiscriminatory Treatment: a straightforward framework for multi-human parsing [14.254424142949741]
Multi-human parsing aims to segment every body part of every human instance.
We present an end-to-end and box-free pipeline from a new and more human-intuitive perspective.
Experiments show that our network performs superiorly against state-of-the-art methods.
arXiv Detail & Related papers (2021-01-26T16:31:21Z)
- Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
Prototype-centered Attentive Learning (PAL) model composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z)
- Adaptive Prototypical Networks with Label Words and Joint Representation Learning for Few-Shot Relation Classification [17.237331828747006]
This work focuses on few-shot relation classification (FSRC).
We propose an adaptive mixture mechanism to add label words to the representation of the class prototype.
Experiments have been conducted on FewRel under different few-shot (FS) settings.
arXiv Detail & Related papers (2021-01-10T11:25:42Z)
- Progressive One-shot Human Parsing [75.18661230253558]
We propose a new problem named one-shot human parsing (OSHP).
OSHP requires parsing humans into an open set of reference classes defined by any single reference example.
In this paper, we devise a novel Progressive One-shot Parsing network (POPNet) to address two critical challenges.
arXiv Detail & Related papers (2020-12-22T03:06:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.