Multi-Level Adaptive Region of Interest and Graph Learning for Facial
Action Unit Recognition
- URL: http://arxiv.org/abs/2102.12154v1
- Date: Wed, 24 Feb 2021 09:22:45 GMT
- Title: Multi-Level Adaptive Region of Interest and Graph Learning for Facial
Action Unit Recognition
- Authors: Jingwei Yan, Boyuan Jiang, Jingjing Wang, Qiang Li, Chunmao Wang,
Shiliang Pu
- Abstract summary: We propose a novel multi-level adaptive ROI and graph learning (MARGL) framework to tackle this problem.
In order to incorporate the intra-level AU relation and inter-level AU regional relevance simultaneously, a multi-level AU relation graph is constructed.
Experiments on BP4D and DISFA demonstrate the proposed MARGL significantly outperforms the previous state-of-the-art methods.
- Score: 30.129452080084224
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In facial action unit (AU) recognition tasks, regional feature learning and
AU relation modeling are two effective aspects worth exploring.
However, the limited representation capacity of regional features makes it
difficult for relation models to embed AU relationship knowledge. In this
paper, we propose a novel multi-level adaptive ROI and graph learning (MARGL)
framework to tackle this problem. Specifically, an adaptive ROI learning module
is designed to automatically adjust the location and size of the predefined AU
regions. Meanwhile, besides the relationships between AUs, there exists strong
relevance between regional features across multiple levels of the backbone
network, as level-wise features focus on different aspects of representation. In
order to incorporate the intra-level AU relation and inter-level AU regional
relevance simultaneously, a multi-level AU relation graph is constructed and
graph convolution is performed to further enhance AU regional features of each
level. Experiments on BP4D and DISFA demonstrate the proposed MARGL
significantly outperforms the previous state-of-the-art methods.
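To make the core idea concrete, below is a minimal, hypothetical PyTorch-style sketch (not the authors' implementation) of how per-level AU regional features might be stacked into one multi-level graph and refined with a single graph-convolution step. The learnable dense adjacency used here is only a stand-in for the paper's intra-level AU relation and inter-level regional relevance construction, and all tensor shapes are assumptions.

```python
# Hypothetical sketch of a multi-level AU relation graph convolution.
# Not the MARGL code; the learnable adjacency and shapes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiLevelAUGraphConv(nn.Module):
    """Refines AU regional features gathered from several backbone levels.

    Nodes: num_levels * num_aus regional feature vectors.
    Edges: a learnable dense adjacency standing in for intra-level AU
    relations and inter-level regional relevance.
    """

    def __init__(self, num_levels: int, num_aus: int, dim: int):
        super().__init__()
        n = num_levels * num_aus
        self.adj_logits = nn.Parameter(torch.zeros(n, n))  # learnable graph
        self.proj = nn.Linear(dim, dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, num_levels, num_aus, dim)
        b, L, A, d = feats.shape
        x = feats.reshape(b, L * A, d)                 # flatten levels x AUs into nodes
        adj = torch.softmax(self.adj_logits, dim=-1)   # row-normalized adjacency
        x = adj @ self.proj(x)                         # one graph-convolution step
        x = F.relu(x) + feats.reshape(b, L * A, d)     # residual connection
        return x.reshape(b, L, A, d)


if __name__ == "__main__":
    gcn = MultiLevelAUGraphConv(num_levels=3, num_aus=12, dim=256)
    dummy = torch.randn(2, 3, 12, 256)                 # fake regional features
    print(gcn(dummy).shape)                            # torch.Size([2, 3, 12, 256])
```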
Related papers
- Towards End-to-End Explainable Facial Action Unit Recognition via Vision-Language Joint Learning [48.70249675019288]
We propose an end-to-end Vision-Language joint learning network for explainable facial action unit (AU) recognition.
The proposed approach achieves superior performance over the state-of-the-art methods on most metrics.
arXiv Detail & Related papers (2024-08-01T15:35:44Z)
- Multi-scale Dynamic and Hierarchical Relationship Modeling for Facial Action Units Recognition [38.62221940006509]
Human facial action units (AUs) are mutually related in a hierarchical manner.
AUs located in the same/close facial regions show stronger relationships than those of different facial regions.
This paper proposes a novel multi-scale AU model for occurrence recognition.
arXiv Detail & Related papers (2024-04-09T16:45:34Z)
- Local Region Perception and Relationship Learning Combined with Feature Fusion for Facial Action Unit Detection [12.677143408225167]
We introduce our submission to the CVPR 2023 Competition on Affective Behavior Analysis in-the-wild (ABAW)
We propose a single-stage trained AU detection framework. Specifically, in order to effectively extract facial local region features related to AU detection, we use a local region perception module.
We also use a graph neural network-based relational learning module to capture the relationship between AUs.
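As an illustration of what a local-region-perception step might look like, here is a minimal sketch (an assumption, not the submission's pipeline) that pools per-AU regions from a backbone feature map with torchvision's roi_align; the box coordinates, e.g. derived from facial landmarks, and the feature-map shapes are placeholders.

```python
# Hypothetical sketch of extracting per-AU local region features with RoIAlign.
# Box coordinates and shapes are illustrative assumptions.
import torch
from torchvision.ops import roi_align


def extract_au_region_features(feature_map: torch.Tensor,
                               au_boxes: torch.Tensor) -> torch.Tensor:
    """feature_map: (batch, C, H, W) backbone features.
    au_boxes: (batch, num_aus, 4) boxes as (x1, y1, x2, y2) in feature-map
    coordinates. Returns (batch, num_aus, C) pooled AU region descriptors.
    """
    b, c, _, _ = feature_map.shape
    num_aus = au_boxes.shape[1]
    # roi_align accepts a list of (num_boxes, 4) tensors, one per image.
    boxes = [au_boxes[i] for i in range(b)]
    pooled = roi_align(feature_map, boxes, output_size=(3, 3), spatial_scale=1.0)
    pooled = pooled.mean(dim=(2, 3))                  # (b * num_aus, C)
    return pooled.reshape(b, num_aus, c)


if __name__ == "__main__":
    fmap = torch.randn(2, 64, 28, 28)
    boxes = torch.tensor([[4.0, 4.0, 10.0, 10.0]] * 12).unsqueeze(0).repeat(2, 1, 1)
    print(extract_au_region_features(fmap, boxes).shape)  # torch.Size([2, 12, 64])
```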
arXiv Detail & Related papers (2023-03-15T11:59:24Z)
- MGRR-Net: Multi-level Graph Relational Reasoning Network for Facial Action Units Detection [16.261362598190807]
The Facial Action Coding System (FACS) encodes the action units (AUs) in facial images.
We argue that encoding AU features just from one perspective may not capture the rich contextual information between regional and global face features.
We propose a novel Multi-level Graph Relational Reasoning Network (termed MGRR-Net) for facial AU detection.
arXiv Detail & Related papers (2022-04-04T09:47:22Z)
- Weakly Supervised Regional and Temporal Learning for Facial Action Unit Recognition [36.350407471391065]
We propose two auxiliary AU-related tasks to bridge the gap between limited annotations and the model performance.
A single-image-based optical flow estimation task is proposed to leverage the dynamic change of facial muscles.
By incorporating semi-supervised learning, we propose an end-to-end trainable framework named weakly supervised regional and temporal learning.
arXiv Detail & Related papers (2022-04-01T12:02:01Z)
- PRA-Net: Point Relation-Aware Network for 3D Point Cloud Analysis [56.91758845045371]
We propose a novel framework named Point Relation-Aware Network (PRA-Net)
It is composed of an Intra-region Structure Learning (ISL) module and an Inter-region Relation Learning (IRL) module.
Experiments on several 3D benchmarks covering shape classification, keypoint estimation, and part segmentation have verified the effectiveness and the ability of PRA-Net.
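For intuition, the following is a minimal, assumed sketch of the intra-region / inter-region split: local point neighborhoods are pooled into region descriptors, then the regions exchange information through attention. The naive region sampling, neighborhood size, and layer widths are placeholders, not PRA-Net's actual ISL and IRL modules.

```python
# Hypothetical sketch of intra-region structure pooling followed by
# inter-region relation modeling on a point cloud. Not PRA-Net itself.
import torch
import torch.nn as nn


class IntraInterRegion(nn.Module):
    def __init__(self, dim: int = 64, num_regions: int = 32, k: int = 16):
        super().__init__()
        self.num_regions, self.k = num_regions, k
        self.intra = nn.Sequential(nn.Linear(3, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.inter = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (batch, num_points, 3)
        b, n, _ = points.shape
        centers = points[:, :self.num_regions]            # naive region "sampling"
        dists = torch.cdist(centers, points)              # (b, regions, n)
        idx = dists.topk(self.k, largest=False).indices   # k nearest points per region
        neigh = torch.gather(points.unsqueeze(1).expand(b, self.num_regions, n, 3),
                             2, idx.unsqueeze(-1).expand(-1, -1, -1, 3))
        region_feat = self.intra(neigh).max(dim=2).values  # intra-region structure pooling
        out, _ = self.inter(region_feat, region_feat, region_feat)  # inter-region relation
        return out                                          # (b, regions, dim)


if __name__ == "__main__":
    model = IntraInterRegion()
    print(model(torch.randn(2, 1024, 3)).shape)  # torch.Size([2, 32, 64])
```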
arXiv Detail & Related papers (2021-12-09T13:24:43Z)
- Self-Supervised Regional and Temporal Auxiliary Tasks for Facial Action Unit Recognition [29.664359264758495]
We propose two auxiliary AU-related tasks to bridge the gap between limited annotations and the model performance.
To enhance the discrimination of regional features with AU relation embedding, we design a task of RoI inpainting to recover the randomly cropped AU patches.
A single-image-based optical flow estimation task is proposed to leverage the dynamic change of facial muscles.
Based on these two self-supervised auxiliary tasks, local features, mutual relation and motion cues of AUs are better captured in the backbone network.
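The RoI-inpainting idea can be pictured with the following minimal sketch (an assumption about the setup, not the paper's code): a random patch standing in for a cropped AU region is zeroed out, and a reconstruction loss is computed only on the masked area. Patch size and the toy decoder are placeholders.

```python
# Hypothetical sketch of an RoI-inpainting auxiliary objective.
import torch
import torch.nn as nn


def mask_random_patch(images: torch.Tensor, patch: int = 32):
    """Zero out one random patch per image; return masked images and mask."""
    b, _, h, w = images.shape
    mask = torch.ones_like(images)
    for i in range(b):
        y = torch.randint(0, h - patch + 1, (1,)).item()
        x = torch.randint(0, w - patch + 1, (1,)).item()
        mask[i, :, y:y + patch, x:x + patch] = 0.0
    return images * mask, mask


def inpainting_loss(decoder: nn.Module, masked: torch.Tensor,
                    targets: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """L1 reconstruction loss computed only on the masked (cropped) region."""
    recon = decoder(masked)                          # same spatial size as targets
    return ((recon - targets).abs() * (1.0 - mask)).mean()


if __name__ == "__main__":
    imgs = torch.randn(2, 3, 128, 128)
    masked, m = mask_random_patch(imgs)
    toy_decoder = nn.Conv2d(3, 3, kernel_size=3, padding=1)  # placeholder decoder
    print(inpainting_loss(toy_decoder, masked, imgs, m).item())
```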
arXiv Detail & Related papers (2021-07-30T02:39:45Z)
- Multi-Level Graph Convolutional Network with Automatic Graph Learning for Hyperspectral Image Classification [63.56018768401328]
We propose a Multi-level Graph Convolutional Network (GCN) with Automatic Graph Learning method (MGCN-AGL) for HSI classification.
By employing an attention mechanism to characterize the importance among spatially neighboring regions, the most relevant information can be adaptively incorporated to make decisions.
Our MGCN-AGL encodes the long range dependencies among image regions based on the expressive representations that have been produced at local level.
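A minimal sketch of such attention-weighted aggregation over neighboring regions is given below; the pairwise scoring function, region descriptors, and neighbor count are illustrative assumptions rather than the MGCN-AGL formulation.

```python
# Hypothetical sketch of attention-weighted aggregation over neighboring
# image regions. Scoring function and shapes are assumptions.
import torch
import torch.nn as nn


class RegionAttentionAggregation(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)   # scores a (center, neighbor) pair

    def forward(self, center: torch.Tensor, neighbors: torch.Tensor) -> torch.Tensor:
        # center: (num_regions, dim); neighbors: (num_regions, k, dim)
        n, k, d = neighbors.shape
        pair = torch.cat([center.unsqueeze(1).expand(n, k, d), neighbors], dim=-1)
        attn = torch.softmax(self.score(pair).squeeze(-1), dim=-1)    # (n, k)
        aggregated = (attn.unsqueeze(-1) * neighbors).sum(dim=1)      # (n, dim)
        return center + aggregated                                    # residual update


if __name__ == "__main__":
    agg = RegionAttentionAggregation(dim=128)
    out = agg(torch.randn(200, 128), torch.randn(200, 8, 128))
    print(out.shape)  # torch.Size([200, 128])
```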
arXiv Detail & Related papers (2020-09-19T09:26:20Z)
- Adversarial Graph Representation Adaptation for Cross-Domain Facial Expression Recognition [86.25926461936412]
We propose a novel Adversarial Graph Representation Adaptation (AGRA) framework that unifies graph representation propagation with adversarial learning for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair experiments on several popular benchmarks and show that the proposed AGRA framework achieves superior performance over previous state-of-the-art methods.
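To illustrate the adversarial ingredient of such cross-domain co-adaptation, here is a generic gradient-reversal (DANN-style) sketch with a small domain classifier; it is an assumed, simplified stand-in, not the AGRA architecture itself.

```python
# Generic DANN-style sketch: gradient reversal + domain classifier.
# Not the AGRA framework; layer sizes and the lambda weight are assumptions.
import torch
import torch.nn as nn


class GradientReversal(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the gradient so the feature extractor learns domain-invariant features.
        return -ctx.lam * grad_output, None


class DomainDiscriminator(nn.Module):
    def __init__(self, dim: int, lam: float = 1.0):
        super().__init__()
        self.lam = lam
        self.net = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, 2))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        reversed_feats = GradientReversal.apply(features, self.lam)
        return self.net(reversed_feats)   # logits: source vs. target domain


if __name__ == "__main__":
    disc = DomainDiscriminator(dim=256)
    print(disc(torch.randn(8, 256)).shape)  # torch.Size([8, 2])
```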
arXiv Detail & Related papers (2020-08-03T13:27:24Z)
- J$\hat{\text{A}}$A-Net: Joint Facial Action Unit Detection and Face Alignment via Adaptive Attention [57.51255553918323]
We propose a novel end-to-end deep learning framework for joint AU detection and face alignment.
Our framework significantly outperforms the state-of-the-art AU detection methods on the challenging BP4D, DISFA, GFT and BP4D+ benchmarks.
arXiv Detail & Related papers (2020-03-18T12:50:19Z)
- Global Context-Aware Progressive Aggregation Network for Salient Object Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)