Deep Multi-task Multi-label CNN for Effective Facial Attribute
Classification
- URL: http://arxiv.org/abs/2002.03683v1
- Date: Mon, 10 Feb 2020 12:34:16 GMT
- Title: Deep Multi-task Multi-label CNN for Effective Facial Attribute
Classification
- Authors: Longbiao Mao, Yan Yan, Jing-Hao Xue, and Hanzi Wang
- Abstract summary: We propose a novel deep multi-task multi-label CNN, termed DMM-CNN, for effective Facial Attribute Classification (FAC).
Specifically, DMM-CNN jointly optimizes two closely-related tasks (i.e., facial landmark detection and FAC) to improve the performance of FAC by taking advantage of multi-task learning.
Two different network architectures are designed to extract features for the two groups of attributes, respectively, and a novel dynamic weighting scheme is proposed to automatically assign a loss weight to each facial attribute during training.
- Score: 53.58763562421771
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial Attribute Classification (FAC) has attracted increasing attention in
computer vision and pattern recognition. However, state-of-the-art FAC methods
perform face detection/alignment and FAC independently. The inherent
dependencies between these tasks are not fully exploited. In addition, most
methods predict all facial attributes using the same CNN architecture,
which ignores the different learning complexities of facial attributes. To
address the above problems, we propose a novel deep multi-task multi-label CNN,
termed DMM-CNN, for effective FAC. Specifically, DMM-CNN jointly optimizes two
closely-related tasks (i.e., facial landmark detection and FAC) to improve the
performance of FAC by taking advantage of multi-task learning. To deal with the
diverse learning complexities of facial attributes, we divide the attributes
into two groups: objective attributes and subjective attributes. Two different
network architectures are designed to extract features for the two groups of
attributes, respectively, and a novel dynamic weighting scheme is proposed to
automatically assign a loss weight to each facial attribute during training.
Furthermore, an adaptive thresholding strategy is developed to effectively
alleviate the problem of class imbalance for multi-label learning. Experimental
results on the challenging CelebA and LFWA datasets show the superiority of the
proposed DMM-CNN method compared with several state-of-the-art FAC methods.
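The abstract names two training-time ideas: a dynamic weighting scheme that gives each attribute its own loss weight, and an adaptive thresholding strategy for class imbalance in multi-label prediction. The sketch below illustrates both ideas in a simple, hypothetical form (these are not the authors' exact formulas; the function names and the difficulty-proportional weighting are assumptions for illustration):

```python
import numpy as np

def dynamic_attribute_weights(running_losses):
    """Assign each attribute a loss weight proportional to its current
    running loss, so harder attributes receive more weight. This is a
    hypothetical stand-in for the paper's dynamic weighting scheme."""
    losses = np.asarray(running_losses, dtype=float)
    return losses / losses.sum()

def adaptive_thresholds(scores, train_pos_rates):
    """Choose a per-attribute decision threshold so that the predicted
    positive rate on a batch of scores matches the attribute's positive
    rate in the training set -- one simple way to counter class imbalance
    in multi-label learning (again, an illustrative strategy only)."""
    scores = np.asarray(scores, dtype=float)
    return np.array([np.quantile(scores[:, j], 1.0 - r)
                     for j, r in enumerate(train_pos_rates)])
```

With such thresholds, a rare attribute (say, 5% positives) is predicted positive whenever its score falls in the top 5% of the batch, instead of being suppressed by a fixed 0.5 cutoff.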
Related papers
- SwinFace: A Multi-task Transformer for Face Recognition, Expression
Recognition, Age Estimation and Attribute Estimation [60.94239810407917]
This paper presents a multi-purpose algorithm for simultaneous face recognition, facial expression recognition, age estimation, and face attribute estimation based on a single Swin Transformer.
To address the conflicts among multiple tasks, a Multi-Level Channel Attention (MLCA) module is integrated into each task-specific analysis.
Experiments show that the proposed model has a better understanding of the face and achieves excellent performance for all tasks.
arXiv Detail & Related papers (2023-08-22T15:38:39Z)
- PARFormer: Transformer-based Multi-Task Network for Pedestrian Attribute Recognition [23.814762073093153]
We propose a pure transformer-based multi-task PAR network named PARFormer, which includes four modules.
In the feature extraction module, we build a strong baseline for feature extraction, which achieves competitive results on several PAR benchmarks.
In the viewpoint perception module, we explore the impact of viewpoints on pedestrian attributes, and propose a multi-view contrastive loss.
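A multi-view contrastive loss of the kind mentioned above can be sketched in the common InfoNCE style: features of the same pedestrian seen from two viewpoints are treated as a positive pair, and all other pairs in the batch as negatives. This is an illustrative sketch under those assumptions, not PARFormer's actual loss:

```python
import numpy as np

def multiview_contrastive_loss(feat_a, feat_b, temperature=0.1):
    """InfoNCE-style contrastive loss: row i of feat_a and row i of
    feat_b are features of the same pedestrian from two viewpoints
    (positives); all other batch pairs act as negatives.
    Hypothetical illustration, not the PARFormer loss."""
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature               # scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # pull matching views together
```

Minimizing this loss pushes same-identity features from different viewpoints toward each other relative to the rest of the batch.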
In the attribute recognition module, we alleviate the negative-positive imbalance problem to generate the attribute predictions.
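One standard remedy for the negative-positive imbalance mentioned here is a binary cross-entropy whose positive and negative terms are reweighted by inverse label frequency. The following is an illustrative sketch of that idea, not PARFormer's exact loss:

```python
import numpy as np

def balanced_bce(y_true, y_prob, eps=1e-7):
    """Multi-label binary cross-entropy with per-attribute reweighting:
    the positive and negative terms of each attribute are scaled by the
    inverse of their frequency in the batch, so rare positives are not
    swamped by abundant negatives. Illustrative only."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    pos_rate = np.clip(y_true.mean(axis=0), eps, 1.0 - eps)
    w_pos = 1.0 / pos_rate           # up-weight rare positives
    w_neg = 1.0 / (1.0 - pos_rate)   # up-weight rare negatives
    loss = -(w_pos * y_true * np.log(p)
             + w_neg * (1.0 - y_true) * np.log(1.0 - p))
    return loss.mean()
```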
arXiv Detail & Related papers (2023-04-14T16:27:56Z)
- TransFA: Transformer-based Representation for Face Attribute Evaluation [87.09529826340304]
We propose a novel transformer-based representation method for face attribute evaluation (TransFA).
The proposed TransFA achieves superior performances compared with state-of-the-art methods.
arXiv Detail & Related papers (2022-07-12T10:58:06Z)
- Universal Representations: A Unified Look at Multiple Task and Domain Learning [37.27708297562079]
We propose a unified look at jointly learning multiple vision tasks and visual domains through universal representations.
We show that universal representations achieve state-of-the-art performances in learning of multiple dense prediction problems.
We also conduct multiple analyses through ablation and qualitative studies.
arXiv Detail & Related papers (2022-04-06T11:40:01Z)
- Tasks Structure Regularization in Multi-Task Learning for Improving Facial Attribute Prediction [27.508755548317712]
We use a new Multi-Task Learning (MTL) paradigm in which a facial attribute predictor uses the knowledge of other related attributes to obtain a better generalization performance.
Our MTL methods are compared with competing methods for facial attribute prediction to show their effectiveness.
arXiv Detail & Related papers (2021-07-29T08:38:17Z)
- Exploiting Emotional Dependencies with Graph Convolutional Networks for Facial Expression Recognition [31.40575057347465]
This paper proposes a novel multi-task learning framework to recognize facial expressions in-the-wild.
A shared feature representation is learned for both discrete and continuous recognition in a MTL setting.
The results of our experiments show that our method outperforms the current state-of-the-art methods on discrete FER.
arXiv Detail & Related papers (2021-06-07T10:20:05Z)
- Semantic Change Detection with Asymmetric Siamese Networks [71.28665116793138]
Given two aerial images, semantic change detection aims to locate the land-cover variations and identify their change types with pixel-wise boundaries.
This problem is vital in many earth vision related tasks, such as precise urban planning and natural resource management.
We present an asymmetric siamese network (ASN) to locate and identify semantic changes through feature pairs obtained from modules of widely different structures.
arXiv Detail & Related papers (2020-10-12T13:26:30Z)
- Domain Private and Agnostic Feature for Modality Adaptive Face Recognition [10.497190559654245]
This paper proposes a Feature Aggregation Network (FAN), which includes disentangled representation module (DRM), feature fusion module (FFM) and metric penalty learning session.
First, in DRM, two networks, i.e., a domain-private network and a domain-agnostic network, are specially designed for learning modality features and identity features.
Second, in FFM, the identity features are fused with domain features to achieve cross-modal bi-directional identity feature transformation.
Third, considering that the distribution imbalance between easy and hard pairs exists in cross-modal datasets, the identity preserving guided metric learning with adaptive
arXiv Detail & Related papers (2020-08-10T00:59:42Z)
- Symbiotic Adversarial Learning for Attribute-based Person Search [86.7506832053208]
We present a symbiotic adversarial learning framework, called SAL. Two GANs sit at the base of the framework in a symbiotic learning scheme.
Specifically, two different types of generative adversarial networks learn collaboratively throughout the training process.
arXiv Detail & Related papers (2020-07-19T07:24:45Z)
- Cross-modality Person re-identification with Shared-Specific Feature Transfer [112.60513494602337]
Cross-modality person re-identification (cm-ReID) is a challenging but key technology for intelligent video analysis.
We propose a novel cross-modality shared-specific feature transfer algorithm (termed cm-SSFT) to explore the potential of both the modality-shared information and the modality-specific characteristics.
arXiv Detail & Related papers (2020-02-28T00:18:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.