Optimal Transport-based Identity Matching for Identity-invariant Facial
Expression Recognition
- URL: http://arxiv.org/abs/2209.12172v1
- Date: Sun, 25 Sep 2022 07:30:44 GMT
- Title: Optimal Transport-based Identity Matching for Identity-invariant Facial
Expression Recognition
- Authors: Daeha Kim and Byung Cheol Song
- Abstract summary: Identity-invariant facial expression recognition (FER) has been one of the challenging computer vision tasks.
This paper proposes to quantify the inter-identity variation by utilizing pairs of similar expressions explored through a specific matching process.
The proposed matching method is not only easy to plug in to other models, but also requires only acceptable computational overhead.
- Score: 33.072870202596725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Identity-invariant facial expression recognition (FER) has been one of the
challenging computer vision tasks. Since conventional FER schemes do not
explicitly address the inter-identity variation of facial expressions, their
neural network models still operate depending on facial identity. This paper
proposes to quantify the inter-identity variation by utilizing pairs of similar
expressions explored through a specific matching process. We formulate the
identity matching process as an Optimal Transport (OT) problem. Specifically,
to find pairs of similar expressions from different identities, we define the
inter-feature similarity as a transportation cost. Then, optimal identity
matching to find the optimal flow with minimum transportation cost is performed
by Sinkhorn-Knopp iteration. The proposed matching method is not only easy to
plug in to other models, but also requires only acceptable computational
overhead. Extensive simulations prove that the proposed FER method improves the
PCC/CCC performance by up to 10% or more compared to the runner-up on wild
datasets. The source code and software demo are available at
https://github.com/kdhht2334/ELIM_FER.
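The matching step described in the abstract can be illustrated with a minimal NumPy sketch of Sinkhorn-Knopp iteration. This is not the authors' implementation (see the repository above for that). The cosine-distance cost, the uniform marginals, the regularization strength `eps`, and the names `sinkhorn_knopp`, `cosine_cost`, `feats_a`, and `feats_b` are all assumptions made for illustration.

```python
import numpy as np

def cosine_cost(x, y):
    """Pairwise cosine distance; an assumed stand-in for the paper's cost."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return 1.0 - x @ y.T

def sinkhorn_knopp(cost, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport with uniform marginals (assumed)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)      # mass on each sample of identity A
    b = np.full(m, 1.0 / m)      # mass on each sample of identity B
    K = np.exp(-cost / eps)      # Gibbs kernel of the cost matrix
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)          # rescale rows to match marginal a
        v = b / (K.T @ u)        # rescale columns to match marginal b
    return u[:, None] * K * v[None, :]   # transport plan (the "flow")

# Toy expression features for two hypothetical identities: identity B holds
# the same four expressions as identity A, shuffled and rescaled.
feats_a = np.eye(4, 8)
feats_b = 2.0 * feats_a[[2, 0, 3, 1]]

plan = sinkhorn_knopp(cosine_cost(feats_a, feats_b))
matches = plan.argmax(axis=1)    # index in B matched to each sample in A
print(matches)
```

Each row of `plan` spreads the mass of one sample from identity A over the samples of identity B, so taking the row-wise argmax gives a hard pairing of similar expressions across the two identities. A smaller `eps` makes the plan closer to a one-to-one assignment, at the cost of slower and less numerically stable convergence.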
Related papers
- The Balanced-Pairwise-Affinities Feature Transform [2.3020018305241337]
The BPA feature transform is designed to upgrade the features of a set of input items to facilitate downstream matching or grouping related tasks.
A particular min-cost-max-flow fractional matching problem leads to a transform which is efficient, differentiable, equivariant, parameterless and probabilistically interpretable.
Empirically, the transform is highly effective and flexible in its use and consistently improves networks it is inserted into, in a variety of tasks and training schemes.
arXiv Detail & Related papers (2024-06-25T14:28:05Z)
- SwinFace: A Multi-task Transformer for Face Recognition, Expression
Recognition, Age Estimation and Attribute Estimation [60.94239810407917]
This paper presents a multi-purpose algorithm for simultaneous face recognition, facial expression recognition, age estimation, and face attribute estimation based on a single Swin Transformer.
To address the conflicts among multiple tasks, a Multi-Level Channel Attention (MLCA) module is integrated into each task-specific analysis.
Experiments show that the proposed model has a better understanding of the face and achieves excellent performance for all tasks.
arXiv Detail & Related papers (2023-08-22T15:38:39Z)
- Multi-Domain Norm-referenced Encoding Enables Data Efficient Transfer
Learning of Facial Expression Recognition [62.997667081978825]
We propose a biologically-inspired mechanism for transfer learning in facial expression recognition.
Our proposed architecture provides an explanation for how the human brain might innately recognize facial expressions on varying head shapes.
Our model achieves a classification accuracy of 92.15% on the FERG dataset with extreme data efficiency.
arXiv Detail & Related papers (2023-04-05T09:06:30Z)
- SIM2E: Benchmarking the Group Equivariant Capability of Correspondence
Matching Algorithms [12.892976023503818]
This paper presents a specialized dataset dedicated to evaluating sim(2)-equivariant correspondence matching algorithms.
We compare the performance of 16 state-of-the-art (SoTA) correspondence matching approaches.
Since the subpixel accuracy achieved by CNN-based correspondence matching approaches is unsatisfactory, this specific area requires more attention in future works.
arXiv Detail & Related papers (2022-08-21T14:47:02Z)
- Disentangling Identity and Pose for Facial Expression Recognition [54.50747989860957]
We propose an identity and pose disentangled facial expression recognition (IPD-FER) model to learn more discriminative feature representation.
For the identity encoder, a well pre-trained face recognition model is utilized and fixed during training, which alleviates the restriction on specific expression training data.
By comparing the difference between synthesized neutral and expressional images of the same individual, the expression component is further disentangled from identity and pose.
arXiv Detail & Related papers (2022-08-17T06:48:13Z)
- Dynamic Prototype Mask for Occluded Person Re-Identification [88.7782299372656]
Existing methods mainly address this issue by employing body clues provided by an extra network to distinguish the visible part.
We propose a novel Dynamic Prototype Mask (DPM) based on two pieces of self-evident prior knowledge.
Under this condition, the occluded representation could be well aligned in a selected subspace spontaneously.
arXiv Detail & Related papers (2022-07-19T03:31:13Z)
- Learning Fair Face Representation With Progressive Cross Transformer [79.73754444296213]
We propose a progressive cross transformer (PCT) method for fair face recognition.
We show that PCT is capable of mitigating bias in face recognition while achieving state-of-the-art FR performance.
arXiv Detail & Related papers (2021-08-11T01:31:14Z)
- AOT: Appearance Optimal Transport Based Identity Swapping for Forgery
Detection [76.7063732501752]
We provide a new identity swapping algorithm with large differences in appearance for face forgery detection.
The appearance gaps mainly arise from the large discrepancies in illuminations and skin colors.
A discriminator is introduced to distinguish the fake parts from a mix of real and fake image patches.
arXiv Detail & Related papers (2020-11-05T06:17:04Z)
- Mutual Information Regularized Identity-aware Facial
Expression Recognition in Compressed Video [27.602648102881535]
We propose a novel collaborative min-min game for mutual information (MI) minimization in latent space.
We do not need the identity label or multiple expression samples from the same person for identity elimination.
Our solution can achieve comparable or better performance than the recent decoded image-based methods.
arXiv Detail & Related papers (2020-10-20T21:42:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.