Learn-to-Decompose: Cascaded Decomposition Network for Cross-Domain
Few-Shot Facial Expression Recognition
- URL: http://arxiv.org/abs/2207.07973v1
- Date: Sat, 16 Jul 2022 16:10:28 GMT
- Title: Learn-to-Decompose: Cascaded Decomposition Network for Cross-Domain
Few-Shot Facial Expression Recognition
- Authors: Xinyi Zou, Yan Yan, Jing-Hao Xue, Si Chen, Hanzi Wang
- Abstract summary: We propose a novel cascaded decomposition network (CDNet) for compound facial expression recognition.
By training across similar tasks on basic expression datasets, CDNet learns the ability of learn-to-decompose that can be easily adapted to identify unseen compound expressions.
- Score: 60.51225419301642
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most existing compound facial expression recognition (FER) methods rely on
large-scale labeled compound expression data for training. However, collecting
such data is labor-intensive and time-consuming. In this paper, we address the
compound FER task in the cross-domain few-shot learning (FSL) setting, which
requires only a few samples of compound expressions in the target domain.
Specifically, we propose a novel cascaded decomposition network (CDNet), which
cascades several learn-to-decompose modules with shared parameters based on a
sequential decomposition mechanism, to obtain a transferable feature space. To
alleviate the overfitting problem caused by limited base classes in our task, a
partial regularization strategy is designed to effectively exploit the best of
both episodic training and batch training. By training across similar tasks on
multiple basic expression datasets, CDNet learns the ability of
learn-to-decompose that can be easily adapted to identify unseen compound
expressions. Extensive experiments on both in-the-lab and in-the-wild compound
expression datasets demonstrate the superiority of our proposed CDNet against
several state-of-the-art FSL methods. Code is available at:
https://github.com/zouxinyi0625/CDNet.
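As a rough illustration of the few-shot setting the abstract refers to, the sketch below samples a single N-way K-shot episode (a support set and a query set) from a labeled pool. It is not taken from the CDNet repository: the function name `sample_episode`, its parameters, and the toy label list are all hypothetical, and the paper's actual episode construction may differ.

```python
import random
from collections import defaultdict


def sample_episode(labels, n_way, k_shot, n_query, rng=None):
    """Sample one N-way K-shot episode (hypothetical helper, not CDNet code).

    Returns two lists of (sample_index, label) pairs: support and query.
    """
    rng = rng or random.Random()

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Only classes with enough samples can fill both support and query sets.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query samples")

    support, query = [], []
    for c in rng.sample(eligible, n_way):
        picked = rng.sample(by_class[c], k_shot + n_query)
        support += [(i, c) for i in picked[:k_shot]]
        query += [(i, c) for i in picked[k_shot:]]
    return support, query


# Toy pool: four expression classes with ten samples each (made-up data).
labels = [name for name in ("happy", "sad", "angry", "surprised") for _ in range(10)]
support, query = sample_episode(labels, n_way=3, k_shot=1, n_query=5,
                                rng=random.Random(0))
print(len(support), len(query))  # 3 15
```

In episodic training, a model would be fitted to the support set and scored on the query set of each episode. Batch training instead draws ordinary mini-batches over all base classes.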
Related papers
- PPN: Parallel Pointer-based Network for Key Information Extraction with
Complex Layouts [29.73609439825548]
Key Information Extraction is a challenging task that aims to extract structured value semantic entities from documents.
Existing methods follow a two-stage pipeline strategy, which may lead to the error propagation problem.
We introduce Parallel Pointer-based Network (PPN), an end-to-end model that can be applied in zero-shot and few-shot scenarios.
arXiv Detail & Related papers (2023-07-20T03:29:09Z)
- Single-Stage Visual Relationship Learning using Conditional Queries [60.90880759475021]
TraCQ is a new formulation for scene graph generation that avoids the multi-task learning problem and the entity pair distribution.
We employ a DETR-based encoder-decoder with conditional queries to significantly reduce the entity label space as well.
Experimental results show that TraCQ not only outperforms existing single-stage scene graph generation methods, it also beats many state-of-the-art two-stage methods on the Visual Genome dataset.
arXiv Detail & Related papers (2023-06-09T06:02:01Z)
- Compositional Semantic Parsing with Large Language Models [27.627684573915147]
We identify challenges in more realistic semantic parsing tasks with larger vocabulary.
Our best method is based on least-to-most prompting.
We expect similar efforts will lead to new results in other tasks and domains.
arXiv Detail & Related papers (2022-09-29T17:58:28Z)
- On the Soft-Subnetwork for Few-shot Class Incremental Learning [67.0373924836107]
We propose a few-shot class incremental learning (FSCIL) method referred to as Soft-SubNetworks (SoftNet).
Our objective is to learn a sequence of sessions incrementally, where each session only includes a few training instances per class while preserving the knowledge of the previously learned ones.
We provide comprehensive empirical validations demonstrating that our SoftNet effectively tackles the few-shot incremental learning problem by surpassing the performance of state-of-the-art baselines over benchmark datasets.
arXiv Detail & Related papers (2022-09-15T04:54:02Z)
- CTL-MTNet: A Novel CapsNet and Transfer Learning-Based Mixed Task Net
for the Single-Corpus and Cross-Corpus Speech Emotion Recognition [15.098532236157556]
Speech Emotion Recognition (SER) has become a growing focus of research in human-computer interaction.
To address this challenge, a Capsule Network (CapsNet) and Transfer Learning based Mixed Task Net (CTL-MTNet) is proposed to deal with both the single-corpus and cross-corpus SER tasks simultaneously.
The results indicate that in both tasks the CTL-MTNet showed better performance in all cases compared to a number of state-of-the-art methods.
arXiv Detail & Related papers (2022-07-18T09:09:23Z)
- HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain
Language Model Compression [53.90578309960526]
Large pre-trained language models (PLMs) have shown overwhelming performances compared with traditional neural network methods.
We propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
arXiv Detail & Related papers (2021-10-16T11:23:02Z)
- X2Parser: Cross-Lingual and Cross-Domain Framework for Task-Oriented
Compositional Semantic Parsing [51.81533991497547]
Task-oriented compositional semantic parsing (TCSP) handles complex nested user queries.
We present X2Parser, a transferable Cross-lingual and Cross-domain Parser for TCSP.
We propose to predict flattened intents and slots representations separately and cast both prediction tasks into sequence labeling problems.
arXiv Detail & Related papers (2021-06-07T16:40:05Z)
- Few-shot learning via tensor hallucination [17.381648488344222]
Few-shot classification addresses the challenge of classifying examples given only limited labeled data.
We show that using a simple loss function is more than enough for training a feature generator in the few-shot setting.
Our method sets a new state of the art, outperforming more sophisticated few-shot data augmentation methods.
arXiv Detail & Related papers (2021-04-19T17:30:33Z)
- Separable Batch Normalization for Robust Facial Landmark Localization
with Cross-protocol Network Training [41.82379935715916]
Big, diverse and balanced training data is the key to the success of deep neural network training.
A small dataset without diverse and balanced training samples cannot support the training of a deep network effectively.
This paper presents a novel Separable Batch Normalization (SepBN) module with a Cross-protocol Network Training (CNT) strategy for robust facial landmark localization.
arXiv Detail & Related papers (2021-01-17T13:04:06Z)
- Searching Central Difference Convolutional Networks for Face
Anti-Spoofing [68.77468465774267]
Face anti-spoofing (FAS) plays a vital role in face recognition systems.
Most state-of-the-art FAS methods rely on stacked convolutions and expert-designed networks.
Here we propose a novel frame-level FAS method based on Central Difference Convolution (CDC).
arXiv Detail & Related papers (2020-03-09T12:48:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.