ASD: Towards Attribute Spatial Decomposition for Prior-Free Facial
Attribute Recognition
- URL: http://arxiv.org/abs/2210.13716v1
- Date: Tue, 25 Oct 2022 02:25:05 GMT
- Title: ASD: Towards Attribute Spatial Decomposition for Prior-Free Facial
Attribute Recognition
- Authors: Chuanfei Hu, Hang Shao, Bo Dong, Zhe Wang and Yongxiong Wang
- Abstract summary: Representing the spatial properties of facial attributes is a vital challenge for facial attribute recognition (FAR).
Recent advances have achieved reliable performance for FAR, benefiting from the description of spatial properties via extra prior information.
We propose a prior-free method for attribute spatial decomposition (ASD), mitigating the spatial ambiguity of facial attributes without any extra prior information.
- Score: 11.757112726108822
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Representing the spatial properties of facial attributes is a vital challenge
for facial attribute recognition (FAR). Recent advances have achieved
reliable performance for FAR, benefiting from the description of spatial
properties via extra prior information. However, the extra prior information
might not always be available, which restricts the application scenarios of
prior-based methods. Meanwhile, the spatial ambiguity of facial attributes
caused by the inherent spatial diversities of facial parts is ignored.
To address these issues, we propose a prior-free method for attribute spatial
decomposition (ASD), mitigating the spatial ambiguity of facial attributes
without any extra prior information. Specifically, an assignment-embedding module
(AEM) is proposed to enable the procedure of ASD, which consists of two
operations: attribute-to-location assignment and location-to-attribute
embedding. The attribute-to-location assignment first decomposes the feature
map based on latent factors, assigning the magnitude of attribute components to
each spatial location. Then, the assigned attribute components from all
locations are aggregated to represent the global-level attribute embeddings. Furthermore,
correlation matrix minimization (CMM) is introduced to enlarge the
discriminability of attribute embeddings. Experimental results demonstrate the
superiority of ASD over state-of-the-art prior-based methods, and further
validate the reliable performance of ASD in the case of limited training data.
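To make the description above concrete, here is a minimal PyTorch sketch of the two ideas: an assignment-embedding module (AEM) and a correlation-matrix penalty in the spirit of CMM. This is not the authors' implementation; the module name AssignmentEmbedding, the 1x1-convolution layers, the softmax-based assignment, and the mean off-diagonal penalty are illustrative assumptions consistent with the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AssignmentEmbedding(nn.Module):
    """Hypothetical AEM sketch: attribute-to-location assignment followed by
    location-to-attribute embedding (layer choices are assumptions)."""

    def __init__(self, in_channels: int, num_attrs: int, embed_dim: int):
        super().__init__()
        # attribute-to-location assignment: per-location attribute magnitudes
        self.assign = nn.Conv2d(in_channels, num_attrs, kernel_size=1)
        # projection used when embedding locations back into attribute vectors
        self.value = nn.Conv2d(in_channels, embed_dim, kernel_size=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) backbone feature map
        assign = torch.softmax(self.assign(feat), dim=1)   # (B, A, H, W)
        value = self.value(feat)                           # (B, D, H, W)
        # location-to-attribute embedding: aggregate the assigned components
        # from all spatial locations into global-level attribute embeddings
        attr_emb = torch.einsum("bahw,bdhw->bad", assign, value)
        attr_emb = attr_emb / (assign.sum(dim=(2, 3)).unsqueeze(-1) + 1e-6)
        return attr_emb                                    # (B, A, D)


def correlation_matrix_loss(attr_emb: torch.Tensor) -> torch.Tensor:
    """Hypothetical CMM-style penalty: suppress off-diagonal correlations
    between attribute embeddings to enlarge their discriminability."""
    emb = F.normalize(attr_emb, dim=-1)                    # (B, A, D)
    corr = torch.einsum("bad,bed->bae", emb, emb)          # (B, A, A)
    diag = torch.diag_embed(torch.diagonal(corr, dim1=-2, dim2=-1))
    return (corr - diag).abs().mean()


if __name__ == "__main__":
    aem = AssignmentEmbedding(in_channels=256, num_attrs=40, embed_dim=128)
    feat = torch.randn(2, 256, 14, 14)    # dummy backbone feature map
    emb = aem(feat)                       # (2, 40, 128)
    print(emb.shape, correlation_matrix_loss(emb).item())
```

In practice, the per-attribute embeddings returned by such a module would feed attribute classifiers, and the correlation term would be added to the recognition loss as a regularizer.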
Related papers
- DiffFAE: Advancing High-fidelity One-shot Facial Appearance Editing with Space-sensitive Customization and Semantic Preservation [84.0586749616249]
This paper presents DiffFAE, a one-stage and highly efficient diffusion-based framework tailored for high-fidelity Facial Appearance Editing.
For high-fidelity query attribute transfer, we adopt Space-sensitive Physical Customization (SPC), which ensures fidelity and generalization ability.
In order to preserve source attributes, we introduce the Region-responsive Semantic Composition (RSC).
This module is guided to learn decoupled source-regarding features, thereby better preserving the identity and alleviating artifacts from non-facial attributes such as hair, clothes, and background.
arXiv Detail & Related papers (2024-03-26T12:53:10Z) - SSPNet: Scale and Spatial Priors Guided Generalizable and Interpretable
Pedestrian Attribute Recognition [23.55622798950833]
A novel Scale and Spatial Priors Guided Network (SSPNet) is proposed for Pedestrian Attribute Recognition (PAR).
SSPNet learns to provide reasonable scale prior information for different attribute groups, allowing the model to focus on different levels of feature maps.
A novel IoU-based attribute localization metric is proposed for Weakly-supervised Pedestrian Attribute Localization (WPAL), based on an improved Grad-CAM for the attribute response mask.
arXiv Detail & Related papers (2023-12-11T00:41:40Z) - Hierarchical Visual Primitive Experts for Compositional Zero-Shot
Learning [52.506434446439776]
Compositional zero-shot learning (CZSL) aims to recognize compositions with prior knowledge of known primitives (attribute and object).
We propose a simple and scalable framework called Composition Transformer (CoT) to address these issues.
Our method achieves SoTA performance on several benchmarks, including MIT-States, C-GQA, and VAW-CZSL.
arXiv Detail & Related papers (2023-08-08T03:24:21Z) - A Solution to Co-occurrence Bias: Attributes Disentanglement via Mutual
Information Minimization for Pedestrian Attribute Recognition [10.821982414387525]
We show that current methods can actually suffer when generalizing such fitted attribute interdependencies to scenes or identities outside the dataset distribution.
To render models robust in realistic scenes, we propose attributes-disentangled feature learning to ensure that the recognition of one attribute does not depend on the existence of others.
arXiv Detail & Related papers (2023-07-28T01:34:55Z) - Multi-Stage Spatio-Temporal Aggregation Transformer for Video Person
Re-identification [78.08536797239893]
We propose a novel Multi-Stage Spatial-Temporal Aggregation Transformer (MSTAT) with two newly designed proxy embedding modules.
MSTAT consists of three stages to encode the attribute-associated, the identity-associated, and the attribute-identity-associated information from the video clips.
We show that MSTAT can achieve state-of-the-art accuracies on various standard benchmarks.
arXiv Detail & Related papers (2023-01-02T05:17:31Z) - TransFA: Transformer-based Representation for Face Attribute Evaluation [87.09529826340304]
We propose a novel transformer-based representation for attribute evaluation method (TransFA).
The proposed TransFA achieves superior performances compared with state-of-the-art methods.
arXiv Detail & Related papers (2022-07-12T10:58:06Z) - Calibrated Feature Decomposition for Generalizable Person
Re-Identification [82.64133819313186]
The Calibrated Feature Decomposition (CFD) module focuses on improving the generalization capacity for person re-identification.
A calibrated-and-standardized batch normalization (CSBN) is designed to learn calibrated person representations.
arXiv Detail & Related papers (2021-11-27T17:12:43Z) - Spatial and Semantic Consistency Regularizations for Pedestrian
Attribute Recognition [50.932864767867365]
We propose a framework that consists of two complementary regularizations to achieve spatial and semantic consistency for each attribute.
Based on the precise attribute locations, we propose a semantic consistency regularization to extract intrinsic and discriminative semantic features.
Results show that the proposed method performs favorably against state-of-the-art methods without increasing parameters.
arXiv Detail & Related papers (2021-09-13T03:36:44Z) - Disentangled Face Attribute Editing via Instance-Aware Latent Space
Search [30.17338705964925]
A rich set of semantic directions exists in the latent space of Generative Adversarial Networks (GANs).
Existing methods may suffer poor attribute variation disentanglement, leading to unwanted change of other attributes when altering the desired one.
We propose a novel framework (IALS) that performs Instance-Aware Latent-Space Search to find semantic directions for disentangled attribute editing.
arXiv Detail & Related papers (2021-05-26T16:19:08Z) - Hierarchical Feature Embedding for Attribute Recognition [26.79901907956084]
We propose a hierarchical feature embedding framework, which learns a fine-grained feature embedding by combining attribute and ID information.
Experiments show that our method achieves the state-of-the-art results on two pedestrian attribute datasets and a facial attribute dataset.
arXiv Detail & Related papers (2020-05-23T17:52:41Z)