Decoupled Multi-task Learning with Cyclical Self-Regulation for Face
Parsing
- URL: http://arxiv.org/abs/2203.14448v1
- Date: Mon, 28 Mar 2022 02:12:30 GMT
- Title: Decoupled Multi-task Learning with Cyclical Self-Regulation for Face
Parsing
- Authors: Qingping Zheng, Jiankang Deng, Zheng Zhu, Ying Li, Stefanos Zafeiriou
- Abstract summary: We propose a novel Decoupled Multi-task Learning with Cyclical Self-Regulation (DML-CSR) for face parsing.
Specifically, DML-CSR designs a multi-task model which comprises face parsing, binary edge, and category edge detection.
Our method achieves new state-of-the-art performance on the Helen, CelebAMask-HQ, and LaPa datasets.
- Score: 71.19528222206088
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper probes intrinsic factors behind typical failure cases (e.g.
spatial inconsistency and boundary confusion) produced by the existing
state-of-the-art method in face parsing. To tackle these problems, we propose a
novel Decoupled Multi-task Learning with Cyclical Self-Regulation (DML-CSR) for
face parsing. Specifically, DML-CSR designs a multi-task model which comprises
face parsing, binary edge, and category edge detection. These tasks share only
low-level encoder weights, with no high-level interactions between each other,
so the auxiliary modules can be decoupled from the whole network at the
inference stage. To address spatial inconsistency, we develop a dynamic dual graph
convolutional network to capture global contextual information without using
any extra pooling operation. To handle boundary confusion in both single and
multiple face scenarios, we exploit binary and category edge detection to
jointly obtain generic geometric structure and fine-grained semantic clues of
human faces. In addition, to prevent noisy labels from degrading model
generalization during training, cyclical self-regulation is proposed: several
model instances are self-ensembled to obtain a new model, and the resulting
model is then used to self-distill subsequent models over alternating
iterations. Experiments show that our method achieves new state-of-the-art
performance on the Helen, CelebAMask-HQ, and LaPa datasets. The source code is
available at
https://github.com/deepinsight/insightface/tree/master/parsing/dml_csr.
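For a concrete picture of the decoupled design, the following is a minimal PyTorch sketch, not the authors' implementation (see the linked repository for that). The encoder depth, channel widths, class count, and module names (SharedEncoder, Head, DecoupledMultiTaskNet) are assumptions for illustration; the point is only that the three heads share low-level features and the two edge heads are dropped at inference.

```python
# Illustrative sketch of a decoupled multi-task layout; all sizes and names are assumed.
import torch
import torch.nn as nn


class SharedEncoder(nn.Module):
    """Low-level feature extractor shared by all three tasks."""
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, stride=2, padding=1),
            nn.BatchNorm2d(feat_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1),
            nn.BatchNorm2d(feat_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.stem(x)


class Head(nn.Module):
    """Task-specific head; no cross-talk with the other heads."""
    def __init__(self, feat_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1),
            nn.BatchNorm2d(feat_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, out_ch, 1),
        )

    def forward(self, x):
        return self.block(x)


class DecoupledMultiTaskNet(nn.Module):
    def __init__(self, num_classes=11):
        super().__init__()
        self.encoder = SharedEncoder()
        self.parsing_head = Head(64, num_classes)        # main task
        self.binary_edge_head = Head(64, 1)              # auxiliary task
        self.category_edge_head = Head(64, num_classes)  # auxiliary task

    def forward(self, x, inference=False):
        feat = self.encoder(x)
        parsing = self.parsing_head(feat)
        if inference:
            # Auxiliary edge heads are decoupled (skipped) at inference time.
            return parsing
        return parsing, self.binary_edge_head(feat), self.category_edge_head(feat)


x = torch.randn(2, 3, 128, 128)
model = DecoupledMultiTaskNet()
parsing, bin_edge, cat_edge = model(x)      # training: all three outputs
parsing_only = model(x, inference=True)     # inference: parsing head only
```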
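The cyclical self-regulation step can likewise be sketched as a self-ensembling plus self-distillation loop. Again this is an illustrative assumption, not the paper's exact formulation: the parameter-averaging ensemble, the temperature-scaled KL distillation loss, and the helper names (average_state_dicts, distill_step) are hypothetical stand-ins for whatever the released code actually does.

```python
# Illustrative sketch of cyclical self-regulation: average several snapshots
# into a teacher (self-ensembling), then let the teacher soften the noisy
# labels for the next student (self-distillation). Loss form and helper names
# are assumptions.
import copy
import torch
import torch.nn.functional as F


def average_state_dicts(state_dicts):
    """Self-ensemble: average the parameters of several saved model snapshots."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        stacked = torch.stack([sd[key].float() for sd in state_dicts], dim=0)
        avg[key] = stacked.mean(dim=0).to(state_dicts[0][key].dtype)
    return avg


def distill_step(student, teacher, images, labels, optimizer, alpha=0.5, T=2.0):
    """One self-distillation step on parsing logits of shape (N, C, H, W).

    Assumes models with the interface sketched above (inference=True returns
    only the parsing logits); labels are (N, H, W) class indices.
    """
    student.train()
    with torch.no_grad():
        teacher_logits = teacher(images, inference=True)
    student_logits = student(images, inference=True)
    ce = F.cross_entropy(student_logits, labels)   # fit the (noisy) hard labels
    kd = F.kl_div(                                 # follow the ensembled teacher
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    loss = (1 - alpha) * ce + alpha * kd
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# One cycle, assuming `snapshots` holds state_dicts saved while training the
# previous student and `make_model()` builds a fresh DecoupledMultiTaskNet:
#   teacher = make_model(); teacher.load_state_dict(average_state_dicts(snapshots))
#   teacher.eval()
#   for images, labels in loader:
#       distill_step(student, teacher, images, labels, optimizer)
```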
Related papers
- Truncated Consistency Models [57.50243901368328]
Training consistency models requires learning to map all intermediate points along PF ODE trajectories to their corresponding endpoints.
We empirically find that this training paradigm limits the one-step generation performance of consistency models.
We propose a new parameterization of the consistency function and a two-stage training procedure that prevents the truncated-time training from collapsing to a trivial solution.
arXiv Detail & Related papers (2024-10-18T22:38:08Z)
- Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting [49.87694319431288]
Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources.
We propose a Comprehensive Generative Replay (CGR) framework that restores appearance and semantic knowledge by synthesizing image-mask pairs.
Experiments on incremental tasks (cardiac, fundus and prostate segmentation) show its clear advantage for alleviating concurrent appearance and semantic forgetting.
arXiv Detail & Related papers (2024-06-28T10:05:58Z)
- Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation [69.45176408639483]
We reform the conv layer by resorting to the scale-space theory.
We build a novel style named SCale AttentioN Conv Neural Network (SCAN-CNN).
As a single-shot scheme, SCAN-CNN is more efficient at inference than multi-shot fusion.
arXiv Detail & Related papers (2022-09-19T06:35:04Z)
- Semi-Supervised Manifold Learning with Complexity Decoupled Chart Autoencoders [45.29194877564103]
This work introduces a chart autoencoder with an asymmetric encoding-decoding process that can incorporate additional semi-supervised information such as class labels.
We discuss the approximation power of such networks and derive a bound that essentially depends on the intrinsic dimension of the data manifold rather than the dimension of the ambient space.
arXiv Detail & Related papers (2022-08-22T19:58:03Z)
- Dynamic Prototype Mask for Occluded Person Re-Identification [88.7782299372656]
Existing methods mainly address this issue by employing body clues provided by an extra network to distinguish the visible part.
We propose a novel Dynamic Prototype Mask (DPM) based on two pieces of self-evident prior knowledge.
Under this condition, the occluded representation could be well aligned in a selected subspace spontaneously.
arXiv Detail & Related papers (2022-07-19T03:31:13Z)
- Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network.
PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks across four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.