Learning from Label Relationships in Human Affect
- URL: http://arxiv.org/abs/2207.05577v1
- Date: Tue, 12 Jul 2022 15:00:54 GMT
- Title: Learning from Label Relationships in Human Affect
- Authors: Niki Maria Foteinopoulou, Ioannis Patras
- Abstract summary: We introduce a novel relational loss for multilabel regression and ordinal problems that regularises learning and leads to better generalisation.
We evaluate the proposed methodology on both continuous affect and schizophrenia severity estimation problems.
- Score: 13.592112044121683
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated estimation of human affect and mental state faces a
number of difficulties: labels with poor or no temporal resolution, small
datasets with little data (often due to confidentiality constraints), and
(very) long, in-the-wild videos. For these
reasons, deep learning methodologies tend to overfit, that is, arrive at latent
representations with poor generalisation performance on the final regression
task. To overcome this, in this work, we introduce two complementary
contributions. First, we introduce a novel relational loss for multilabel
regression and ordinal problems that regularises learning and leads to better
generalisation. The proposed loss uses label vector inter-relational
information to learn better latent representations by aligning batch label
distances to the distances in the latent feature space. Second, we utilise a
two-stage attention architecture that estimates a target for each clip by using
features from the neighbouring clips as temporal context. We evaluate the
proposed methodology on both continuous affect and schizophrenia severity
estimation problems, as there are methodological and contextual parallels
between the two. Experimental results demonstrate that the proposed methodology
outperforms all baselines. In the domain of schizophrenia, the proposed
methodology outperforms the previous state-of-the-art by a large margin,
achieving a PCC of up to 78%, close to human-expert performance (85%) and much
higher than previous works (an uplift of up to 40%). In the case of affect
recognition, we outperform previous vision-based methods in terms of CCC on
both the OMG and the AMIGOS datasets. Specifically for AMIGOS, we outperform
previous SoTA CCC for both arousal and valence by 9% and 13% respectively, and
in the OMG dataset we outperform previous vision works by up to 5% for both
arousal and valence.
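The abstract's first contribution, the relational loss, aligns pairwise label distances in a batch with the corresponding distances in the latent feature space. A minimal NumPy sketch of that idea follows; the function name, the max-normalisation, and the squared-error penalty are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def relational_loss(features, labels):
    """Sketch of a relational regulariser: penalise mismatch between
    pairwise label-vector distances and pairwise latent-feature
    distances within a batch (illustrative, not the paper's code)."""
    # Pairwise Euclidean distances between label vectors (B x B).
    d_lab = np.linalg.norm(labels[:, None, :] - labels[None, :, :], axis=-1)
    # Pairwise Euclidean distances between latent features (B x B).
    d_feat = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    # Normalise each distance matrix so the two scales are comparable.
    d_lab = d_lab / (d_lab.max() + 1e-8)
    d_feat = d_feat / (d_feat.max() + 1e-8)
    # Mean squared difference over all pairs: zero when the latent
    # geometry mirrors the label geometry exactly.
    return float(np.mean((d_lab - d_feat) ** 2))
```

In practice a term like this would be added, with some weighting, to the main regression loss, so that batches whose labels are close are pushed close in the latent space as well.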
Related papers
- Human Cognitive Benchmarks Reveal Foundational Visual Gaps in MLLMs [65.93003087656754]
VisFactor is a benchmark that digitizes 20 vision-centric subtests from a well-established cognitive psychology assessment.
We evaluate 20 frontier Multimodal Large Language Models (MLLMs) from the GPT, Gemini, Claude, LLaMA, Qwen, and SEED families.
The best-performing model achieves a score of only 25.19 out of 100, with consistent failures on tasks such as mental rotation, spatial relation inference, and figure-ground discrimination.
arXiv Detail & Related papers (2025-02-23T04:21:32Z) - Memory Consistency Guided Divide-and-Conquer Learning for Generalized
Category Discovery [56.172872410834664]
Generalized category discovery (GCD) aims at addressing a more realistic and challenging setting of semi-supervised learning.
We propose a Memory Consistency guided Divide-and-conquer Learning framework (MCDL)
Our method outperforms state-of-the-art models by a large margin on both seen and unseen classes of the generic image recognition.
arXiv Detail & Related papers (2024-01-24T09:39:45Z) - Estimation of individual causal effects in network setup for multiple
treatments [4.53340898566495]
We study the problem of estimation of Individual Treatment Effects (ITE) in the context of multiple treatments and observational data.
We employ Graph Convolutional Networks (GCN) to learn a shared representation of the confounders.
Our approach utilizes separate neural networks to infer potential outcomes for each treatment.
arXiv Detail & Related papers (2023-12-18T06:07:45Z) - SSL-CPCD: Self-supervised learning with composite pretext-class
discrimination for improved generalisability in endoscopic image analysis [3.1542695050861544]
Deep learning-based supervised methods are widely popular in medical image analysis.
They require a large amount of training data and face issues in generalisability to unseen datasets.
We propose to explore patch-level instance-group discrimination and penalisation of inter-class variation using additive angular margin.
arXiv Detail & Related papers (2023-05-31T21:28:08Z) - Fairness meets Cross-Domain Learning: a new perspective on Models and
Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z) - PromptCAL: Contrastive Affinity Learning via Auxiliary Prompts for
Generalized Novel Category Discovery [39.03732147384566]
Generalized Novel Category Discovery (GNCD) setting aims to categorize unlabeled training data coming from known and novel classes.
We propose Contrastive Affinity Learning method with auxiliary visual Prompts, dubbed PromptCAL, to address this challenging problem.
Our approach discovers reliable pairwise sample affinities to learn better semantic clustering of both known and novel classes for the class token and visual prompts.
arXiv Detail & Related papers (2022-12-11T20:06:14Z) - On Feature Learning in the Presence of Spurious Correlations [45.86963293019703]
We show that the quality of learned feature representations is greatly affected by design decisions beyond the method itself.
We significantly improve upon the best results reported in the literature on the popular Waterbirds, Celeb hair color prediction and WILDS-FMOW problems.
arXiv Detail & Related papers (2022-10-20T16:10:28Z) - Cluster-level pseudo-labelling for source-free cross-domain facial
expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z) - Adversarial Dual-Student with Differentiable Spatial Warping for
Semi-Supervised Semantic Segmentation [70.2166826794421]
We propose a differentiable geometric warping to conduct unsupervised data augmentation.
We also propose a novel adversarial dual-student framework to improve the Mean-Teacher.
Our solution significantly improves the performance and state-of-the-art results are achieved on both datasets.
arXiv Detail & Related papers (2022-03-05T17:36:17Z) - ACP++: Action Co-occurrence Priors for Human-Object Interaction
Detection [102.9428507180728]
A common problem in the task of human-object interaction (HOI) detection is that numerous HOI classes have only a small number of labeled examples.
We observe that there exist natural correlations and anti-correlations among human-object interactions.
We present techniques to learn these priors and leverage them for more effective training, especially on rare classes.
arXiv Detail & Related papers (2021-09-09T06:02:50Z) - MET: Multimodal Perception of Engagement for Telehealth [52.54282887530756]
We present MET, a learning-based algorithm for perceiving a human's level of engagement from videos.
We release a new dataset, MEDICA, for mental health patient engagement detection.
arXiv Detail & Related papers (2020-11-17T15:18:38Z) - Robustly Pre-trained Neural Model for Direct Temporal Relation
Extraction [10.832917897850361]
We studied several variants of BERT (Bidirectional Encoder Representations from Transformers).
We evaluated these methods using a direct temporal relations dataset which is a semantically focused subset of the 2012 i2b2 temporal relations challenge dataset.
Results: RoBERTa, which employs better pre-training strategies including a 10x larger corpus, improved the overall F measure by 0.0864 absolute (on a 1.00 scale), reducing the error rate by 24% relative to the previous state-of-the-art achieved with an SVM (support vector machine) model.
arXiv Detail & Related papers (2020-04-13T22:01:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.