Multi-Agent Semi-Siamese Training for Long-tail and Shallow Face
Learning
- URL: http://arxiv.org/abs/2105.04113v1
- Date: Mon, 10 May 2021 04:57:32 GMT
- Title: Multi-Agent Semi-Siamese Training for Long-tail and Shallow Face
Learning
- Authors: Hailin Shi, Dan Zeng, Yichun Tai, Hang Du, Yibo Hu, Tao Mei
- Abstract summary: In many real-world scenarios of face recognition, the training dataset is shallow, meaning only two face images are available for each ID.
As samples increase non-uniformly, this issue turns into a more general case, a.k.a. long-tail face learning.
Based on Semi-Siamese Training (SST), we introduce an advanced solution named Multi-Agent Semi-Siamese Training (MASST).
MASST includes a probe network and multiple gallery agents; the former encodes the probe features, and the latter constitutes a stack of networks that encode the prototypes (gallery features).
- Score: 54.13876727413492
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the recent development of deep convolutional neural networks
and large-scale datasets, deep face recognition has made remarkable progress
and is widely used in various applications. However, unlike existing public
face datasets, in many real-world face recognition scenarios the training
dataset is shallow in depth, meaning that only two face images are available
for each ID. As samples increase non-uniformly, this issue turns into a more
general case, a.k.a. long-tail face learning, which suffers simultaneously from
data imbalance and a dearth of intra-class diversity. These adverse conditions
damage training and degrade model performance.
Based on Semi-Siamese Training (SST), we introduce an advanced solution named
Multi-Agent Semi-Siamese Training (MASST) to address these problems. MASST
includes a probe network and multiple gallery agents; the former encodes the
probe features, and the latter constitutes a stack of networks that encode the
prototypes (gallery features). At each training iteration, a gallery network
sequentially rotated from the stack and the probe network form a pair of
semi-siamese networks. We provide theoretical and empirical analysis showing
that, given long-tail (or shallow) data and a training loss, MASST smooths the
loss landscape and satisfies Lipschitz continuity with the help of the multiple
agents and the updating gallery queue. The proposed method introduces no extra
dependency, so it can be easily integrated with existing loss functions and
network architectures. Notably, although multiple gallery agents are employed
during training, only the probe network is needed for inference, so the
inference cost does not increase. Extensive experiments and comparisons
demonstrate the advantages of MASST for long-tail and shallow face learning.
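Here, Lipschitz continuity of the loss means there exists a constant K such
that |L(w1) - L(w2)| <= K * ||w1 - w2|| for all parameter pairs, i.e., the loss
surface has bounded steepness. To make the training scheme concrete, below is a
minimal PyTorch-style sketch of one MASST iteration as we read the abstract: a
gallery agent is rotated from the stack, forms a semi-siamese pair with the
probe network, and the gallery queue is refreshed. The toy backbone, queue
size, temperature, momentum value, and the moving-average agent update
(carried over from SST) are our assumptions, not the authors' released
implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM, QUEUE_SIZE, NUM_AGENTS = 128, 1024, 4
MOMENTUM, TAU = 0.999, 0.07  # assumed moving-average rate and temperature

def make_backbone():
    # Stand-in embedding network; a real system would use a face-recognition CNN.
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, EMB_DIM))

probe_net = make_backbone()
gallery_agents = [make_backbone() for _ in range(NUM_AGENTS)]
for agent in gallery_agents:  # agents start as copies of the probe network
    agent.load_state_dict(probe_net.state_dict())

optimizer = torch.optim.SGD(probe_net.parameters(), lr=0.1)
queue = F.normalize(torch.randn(QUEUE_SIZE, EMB_DIM), dim=1)  # gallery feature queue

for step in range(100):
    # Toy stand-in for a shallow-data batch: one probe and one gallery image per ID.
    probe_imgs, gallery_imgs = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)

    # 1) Sequentially rotate a gallery agent from the stack.
    agent = gallery_agents[step % NUM_AGENTS]

    # 2) Semi-siamese forward pass: probe features and gallery prototypes.
    probe_feat = F.normalize(probe_net(probe_imgs), dim=1)
    with torch.no_grad():  # no gradient flows through the gallery side
        gallery_feat = F.normalize(agent(gallery_imgs), dim=1)

    # 3) Contrastive-style loss: positive prototype vs. negatives from the queue.
    pos = (probe_feat * gallery_feat).sum(dim=1, keepdim=True)
    neg = probe_feat @ queue.t()
    logits = torch.cat([pos, neg], dim=1) / TAU
    loss = F.cross_entropy(logits, torch.zeros(8, dtype=torch.long))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # 4) Moving-average update of the used agent and refresh of the gallery queue.
    with torch.no_grad():
        for p_a, p_p in zip(agent.parameters(), probe_net.parameters()):
            p_a.mul_(MOMENTUM).add_(p_p, alpha=1 - MOMENTUM)
        queue = torch.cat([gallery_feat, queue], dim=0)[:QUEUE_SIZE]
```

At inference time, only probe_net would be kept, so the deployed model costs
the same as a conventionally trained backbone.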
Related papers
- Scale Attention for Learning Deep Face Representation: A Study Against
Visual Scale Variation [69.45176408639483]
We reform the conv layer by resorting to the scale-space theory.
We build a novel network style named the SCale AttentioN Conv Neural Network (SCAN-CNN).
As a single-shot scheme, SCAN-CNN makes inference more efficient than multi-shot fusion.
arXiv Detail & Related papers (2022-09-19T06:35:04Z)
- On the Soft-Subnetwork for Few-shot Class Incremental Learning [67.0373924836107]
We propose a few-shot class incremental learning (FSCIL) method referred to as Soft-SubNetworks (SoftNet).
Our objective is to learn a sequence of sessions incrementally, where each session only includes a few training instances per class while preserving the knowledge of the previously learned ones.
We provide comprehensive empirical validations demonstrating that our SoftNet effectively tackles the few-shot incremental learning problem by surpassing the performance of state-of-the-art baselines over benchmark datasets.
arXiv Detail & Related papers (2022-09-15T04:54:02Z)
- Stochastic Primal-Dual Deep Unrolling Networks for Imaging Inverse
Problems [3.7819322027528113]
We present a new type of efficient deep-unrolling networks for solving imaging inverse problems.
In our unrolling network, we use only a subset of the forward and adjoint operators.
Our numerical results demonstrate the effectiveness of our approach on an X-ray CT imaging task.
arXiv Detail & Related papers (2021-10-19T16:46:03Z)
- Ultra-Data-Efficient GAN Training: Drawing A Lottery Ticket First, Then
Training It Toughly [114.81028176850404]
Training generative adversarial networks (GANs) with limited data generally results in deteriorated performance and collapsed models.
We decompose the data-hungry GAN training into two sequential sub-problems.
Such a coordinated framework enables us to focus on lower-complexity and more data-efficient sub-problems.
arXiv Detail & Related papers (2021-02-28T05:20:29Z)
- Unsupervised Monocular Depth Learning with Integrated Intrinsics and
Spatio-Temporal Constraints [61.46323213702369]
This work presents an unsupervised learning framework that is able to predict at-scale depth maps and egomotion.
Our results demonstrate strong performance when compared to the current state-of-the-art on multiple sequences of the KITTI driving dataset.
arXiv Detail & Related papers (2020-11-02T22:26:58Z)
- HALO: Learning to Prune Neural Networks with Shrinkage [5.283963846188862]
Deep neural networks achieve state-of-the-art performance in a variety of tasks by extracting a rich set of features from unstructured data.
Modern techniques for inducing sparsity and reducing model size are (1) network pruning, (2) training with a sparsity inducing penalty, and (3) training a binary mask jointly with the weights of the network.
We present a novel penalty called Hierarchical Adaptive Lasso (HALO), which learns to adaptively sparsify the weights of a given network via trainable parameters.
arXiv Detail & Related papers (2020-08-24T04:08:48Z)
- Semi-Siamese Training for Shallow Face Learning [78.7386209619276]
We introduce a novel training method named Semi-Siamese Training (SST).
A pair of semi-siamese networks constitute the forward propagation structure, and the training loss is computed with an updating gallery queue.
Our method is developed without extra dependency and thus can be flexibly integrated with existing loss functions and network architectures.
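For context, the sketch below isolates the two updates that define SST's forward structure as we read this summary and the MASST abstract: a gallery network tracked as a moving average of the probe network, plus a fixed-size prototype queue. The momentum value and the FIFO queue policy are assumptions, not the paper's exact specification.

```python
import torch

def sst_updates(probe_net, gallery_net, queue, new_prototypes, m=0.999):
    # Moving-average update: the gallery network slowly tracks the probe
    # network (m is an assumed momentum value); then the newest gallery
    # features are enqueued and the oldest dropped, keeping the queue size
    # fixed (assumed FIFO policy).
    with torch.no_grad():
        for g, p in zip(gallery_net.parameters(), probe_net.parameters()):
            g.mul_(m).add_(p, alpha=1.0 - m)
        queue = torch.cat([new_prototypes, queue], dim=0)[: queue.shape[0]]
    return queue
```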
arXiv Detail & Related papers (2020-07-16T15:20:04Z)
- Deep Multi-Facial Patches Aggregation Network For Facial Expression
Recognition [5.735035463793008]
We propose an approach for Facial Expression Recognition (FER) based on a deep multi-facial-patches aggregation network.
Deep features are learned from facial patches using deep sub-networks and aggregated within one deep architecture for expression classification.
arXiv Detail & Related papers (2020-02-20T17:57:06Z)