Dive into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty
Estimation for Facial Expression Recognition
- URL: http://arxiv.org/abs/2104.00232v1
- Date: Thu, 1 Apr 2021 03:21:57 GMT
- Title: Dive into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty
Estimation for Facial Expression Recognition
- Authors: Jiahui She, Yibo Hu, Hailin Shi, Jun Wang, Qiu Shen, Tao Mei
- Abstract summary: We propose a solution, named DMUE, to address the problem of annotation ambiguity from two perspectives: latent Distribution Mining and pairwise Uncertainty Estimation.
For the former, an auxiliary multi-branch learning framework is introduced to better mine and describe the latent distribution in the label space.
For the latter, the pairwise relationships of semantic features between instances are fully exploited to estimate the extent of ambiguity in the instance space.
- Score: 59.52434325897716
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Due to subjective annotation and the inherent inter-class similarity of
facial expressions, one of the key challenges in Facial Expression Recognition
(FER) is annotation ambiguity. In this paper, we propose a solution, named
DMUE, to address the problem of annotation ambiguity from two perspectives: the
latent Distribution Mining and the pairwise Uncertainty Estimation. For the
former, an auxiliary multi-branch learning framework is introduced to better
mine and describe the latent distribution in the label space. For the latter,
the pairwise relationships of semantic features between instances are fully
exploited to estimate the extent of ambiguity in the instance space. The proposed
method is independent of the backbone architecture and adds no extra burden
at inference. Experiments are conducted on popular real-world
benchmarks and synthetic noisy datasets. In both settings, the proposed DMUE
consistently achieves leading performance.
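The abstract does not give formulas, so the following is only a toy sketch of the general idea of pairwise uncertainty estimation, not the DMUE method itself. It assumes that a sample is ambiguous when the batch neighbours most similar to it in feature space carry a different label, and that the training target can then lean on a latent distribution instead of the one-hot annotation. The function names (`cosine`, `ambiguity_scores`, `soft_target`), the cosine weighting, and the linear blending rule are all hypothetical choices made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def ambiguity_scores(features, labels):
    """Per-sample ambiguity in [0, 1] (assumed rule, not from the paper):
    the similarity-weighted share of batch neighbours with a different label."""
    scores = []
    for i, (f_i, y_i) in enumerate(zip(features, labels)):
        total = 0.0
        disagree = 0.0
        for j, (f_j, y_j) in enumerate(zip(features, labels)):
            if i == j:
                continue
            weight = max(cosine(f_i, f_j), 0.0)  # negative similarity counts as zero
            total += weight
            if y_j != y_i:
                disagree += weight
        scores.append(disagree / total if total else 0.0)
    return scores


def soft_target(label, latent, u, num_classes):
    """Blend the one-hot annotation with a latent distribution, weighted by ambiguity u."""
    one_hot = [1.0 if c == label else 0.0 for c in range(num_classes)]
    return [(1.0 - u) * o + u * p for o, p in zip(one_hot, latent)]


# Toy batch: the last sample sits next to class-0 features but is annotated as class 1.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.95, 0.05]]
labels = [0, 0, 1, 1]
scores = ambiguity_scores(features, labels)

# Hypothetical latent distribution for the last sample (e.g. from an auxiliary branch).
target = soft_target(labels[3], [0.7, 0.3], scores[3], num_classes=2)
```

In this toy batch the suspiciously labelled last sample receives the highest ambiguity score, so its target moves most of the way from the one-hot label toward the latent distribution. Because the blending happens only when building training targets, a scheme of this kind would leave the inference path unchanged, which is in the spirit of the abstract's claim of no extra inference burden.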
Related papers
- Tackling Ambiguity from Perspective of Uncertainty Inference and Affinity Diversification for Weakly Supervised Semantic Segmentation [12.308473939796945]
Weakly supervised semantic segmentation (WSSS) with image-level labels aims to achieve dense tasks without laborious annotations.
The performance of WSSS, especially the stages of generating Class Activation Maps (CAMs) and refining pseudo masks, widely suffers from ambiguity.
We propose UniA, a unified single-staged WSSS framework, to tackle this issue from the perspective of uncertainty inference and affinity diversification.
arXiv Detail & Related papers (2024-04-12T01:54:59Z)
- Domain Generalization with Small Data [27.040070085669086]
We learn a domain-invariant representation based on the probabilistic framework by mapping each data point into probabilistic embeddings.
Our proposed method can marry the measurement on the distribution over distributions (i.e., the global perspective alignment) and the distribution-based contrastive semantic alignment.
arXiv Detail & Related papers (2024-02-09T02:59:08Z)
- Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval [139.21955930418815]
Cross-modal Retrieval methods build similarity relations between vision and language modalities by jointly learning a common representation space.
However, the predictions are often unreliable due to aleatoric uncertainty, which is induced by low-quality data, e.g., corrupt images, fast-paced videos, and non-detailed texts.
We propose a novel Prototype-based Aleatoric Uncertainty Quantification (PAU) framework to provide trustworthy predictions by quantifying the uncertainty arising from the inherent data ambiguity.
arXiv Detail & Related papers (2023-09-29T09:41:19Z)
- Uncertain Facial Expression Recognition via Multi-task Assisted Correction [43.02119884581332]
We propose a novel method of multi-task assisted correction in addressing uncertain facial expression recognition called MTAC.
Specifically, a confidence estimation block and a weighted regularization module are applied to highlight solid samples and suppress uncertain samples in every batch.
Experiments on RAF-DB, AffectNet, and AffWild2 datasets demonstrate that the MTAC obtains substantial improvements over baselines when facing synthetic and real uncertainties.
arXiv Detail & Related papers (2022-12-14T10:28:08Z)
- On the Fundamental Trade-offs in Learning Invariant Representations [7.868449549351487]
We identify and determine two fundamental trade-offs between utility and semantic dependence induced by the statistical dependencies between the data and its corresponding target and semantic attributes.
We numerically quantify the trade-offs on representative problems and compare to the solutions achieved by baseline representation learning algorithms.
arXiv Detail & Related papers (2021-09-08T01:26:46Z)
- Exploring Robustness of Unsupervised Domain Adaptation in Semantic Segmentation [74.05906222376608]
We propose adversarial self-supervision UDA (or ASSUDA) that maximizes the agreement between clean images and their adversarial examples by a contrastive loss in the output space.
This paper is rooted in two observations: (i) the robustness of UDA methods in semantic segmentation remains unexplored, which poses a security concern in this field; and (ii) although commonly used self-supervision (e.g., rotation and jigsaw) benefits image tasks such as classification and recognition, it fails to provide the critical supervision signals needed to learn discriminative representations for segmentation tasks.
arXiv Detail & Related papers (2021-05-23T01:50:44Z)
- Inter-class Discrepancy Alignment for Face Recognition [55.578063356210144]
We propose a unified framework called Inter-class Discrepancy Alignment (IDA).
IDA-DAO is used to align the similarity scores considering the discrepancy between an image and its neighbors.
IDA-SSE can provide convincing inter-class neighbors by introducing virtual candidate images generated with GAN.
arXiv Detail & Related papers (2021-03-02T08:20:08Z)
- Evaluating Disentanglement of Structured Latent Representations [3.756550107432323]
We design the first multi-layer disentanglement metric operating at all hierarchy levels of a structured latent representation.
Our metric unifies the evaluation of both object separation between latent slots and internal slot disentanglement into a common mathematical framework.
arXiv Detail & Related papers (2021-01-11T17:24:01Z)
- Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels.
Our uncertainty-guided optimization brings significant improvement and achieves state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2020-12-16T04:09:04Z)
- Learning Disentangled Representations with Latent Variation Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.