Partial FC: Training 10 Million Identities on a Single Machine
- URL: http://arxiv.org/abs/2010.05222v2
- Date: Sat, 23 Jan 2021 05:25:06 GMT
- Title: Partial FC: Training 10 Million Identities on a Single Machine
- Authors: Xiang An, Xuhan Zhu, Yang Xiao, Lan Wu, Ming Zhang, Yuan Gao, Bin Qin,
Debing Zhang, Ying Fu
- Abstract summary: We analyze the optimization goal of softmax-based loss functions and the difficulty of training massive identities.
Experiments demonstrate no loss of accuracy when training with only 10% randomly sampled classes for softmax-based loss functions.
We also implement a very efficient distributed sampling algorithm, taking into account model accuracy and training efficiency.
- Score: 23.7030637489807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Face recognition has long been an active and vital topic in the
computer vision community. Previous research has mainly focused on the loss
functions used for facial feature extraction networks, among which improvements
to softmax-based loss functions have greatly promoted the performance of face
recognition. However, the contradiction between the drastically increasing
number of face identities and the shortage of GPU memory is gradually becoming
irreconcilable. In this paper, we thoroughly analyze the optimization goal of
softmax-based loss functions and the difficulty of training massive identities.
We find that the importance of negative classes in the softmax function for
face representation learning is not as high as previously thought. Experiments
demonstrate no loss of accuracy when training with only 10% randomly sampled
classes for softmax-based loss functions, compared with training with all
classes, using state-of-the-art models on mainstream benchmarks. We also
implement a very efficient distributed sampling algorithm that takes into
account both model accuracy and training efficiency, and uses only eight NVIDIA
RTX 2080 Ti GPUs to complete classification tasks with tens of millions of
identities. The code of this paper has been made available at
https://github.com/deepinsight/insightface/tree/master/recognition/partial_fc.
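The sampling idea is easy to picture in code. Below is a minimal single-GPU PyTorch sketch of softmax training over the batch's positive classes plus randomly sampled negatives; the class name, the 64.0 scale, and the omitted margin are assumptions of this sketch, and the repository linked above implements a distributed, model-parallel version instead.

```python
import torch
import torch.nn.functional as F

class PartialFCSketch(torch.nn.Module):
    """Single-GPU sketch of Partial FC negative-class sampling.

    The softmax over all identities is approximated by a softmax over the
    positive classes present in the batch plus randomly sampled negatives,
    so only a fraction of the classifier weights touch memory each step.
    """

    def __init__(self, embedding_size: int, num_classes: int, sample_rate: float = 0.1):
        super().__init__()
        self.weight = torch.nn.Parameter(0.01 * torch.randn(num_classes, embedding_size))
        self.num_classes = num_classes
        self.num_sample = max(1, int(sample_rate * num_classes))

    def forward(self, embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        positives = labels.unique()
        # Negatives are drawn uniformly from classes absent from the batch.
        mask = torch.ones(self.num_classes, dtype=torch.bool, device=labels.device)
        mask[positives] = False
        negatives = mask.nonzero(as_tuple=True)[0]
        num_neg = max(0, self.num_sample - positives.numel())
        perm = torch.randperm(negatives.numel(), device=labels.device)[:num_neg]
        sampled = torch.cat([positives, negatives[perm]])
        # Remap original labels to positions inside the sampled subset.
        remap = torch.full((self.num_classes,), -1, dtype=torch.long, device=labels.device)
        remap[sampled] = torch.arange(sampled.numel(), device=labels.device)
        # Cosine logits over the subset only (margin omitted for brevity).
        logits = F.linear(F.normalize(embeddings), F.normalize(self.weight[sampled]))
        return F.cross_entropy(64.0 * logits, remap[labels])  # 64.0: assumed scale

# Toy usage: 10k identities, 10% of classes per step.
pfc = PartialFCSketch(embedding_size=512, num_classes=10_000, sample_rate=0.1)
loss = pfc(torch.randn(32, 512), torch.randint(0, 10_000, (32,)))
```

Because only `sample_rate` of the classifier rows are gathered per step, the memory and compute of the final fully connected layer shrink proportionally, which is what allows tens of millions of identities to fit on one machine.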
Related papers
- Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms [80.37846867546517]
We show how to train eight different neural networks with custom objectives.
We exploit their second-order information via their empirical Fisher and Hessian matrices.
We apply Newton Losses to achieve significant improvements for less differentiable algorithms.
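As a rough illustration of the second-order quantity involved, the empirical Fisher of a loss at its input is the averaged outer product of per-sample gradients. The naive per-sample loop below is only a sketch under that definition, not the paper's implementation; `loss_fn` is an assumed user-supplied scalar loss.

```python
import torch

def empirical_fisher(loss_fn, z: torch.Tensor) -> torch.Tensor:
    """Empirical Fisher at the loss input: F = (1/N) * sum_i g_i g_i^T,
    with g_i the gradient of the per-sample loss at z_i. Naive loop version."""
    grads = []
    for zi in z:
        zi = zi.detach().requires_grad_(True)
        (g,) = torch.autograd.grad(loss_fn(zi), zi)
        grads.append(g)
    G = torch.stack(grads)       # (N, D) per-sample gradients
    return G.T @ G / G.shape[0]  # (D, D)

# Example: Fisher of a cross-entropy-style loss for class 3 over random inputs.
F_hat = empirical_fisher(lambda zi: -torch.log_softmax(zi, dim=0)[3], torch.randn(16, 10))
```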
arXiv Detail & Related papers (2024-10-24T18:02:11Z)
- X2-Softmax: Margin Adaptive Loss Function for Face Recognition [6.497884034818003]
We propose a new angular margin loss named X2-Softmax.
X2-Softmax loss has adaptive angular margins, providing a margin that increases as the angle between different classes grows.
We have trained the neural network with X2-Softmax loss on the MS1Mv3 dataset.
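The general shape of such a loss can be sketched as an angular-margin softmax with a pluggable margin function. In the PyTorch sketch below, a constant `margin_fn` recovers an ArcFace-style baseline; X2-Softmax's actual adaptive margin curve, given in the paper, would replace the placeholder. All names here are illustrative.

```python
import torch
import torch.nn.functional as F

def angular_margin_loss(embeddings, weight, labels, margin_fn, scale=64.0):
    """Softmax over cosine logits where the target class's angle is enlarged
    by a margin. A constant margin_fn recovers an ArcFace-style loss; an
    angle-dependent margin_fn is the knob an adaptive-margin loss turns."""
    cos = F.linear(F.normalize(embeddings), F.normalize(weight)).clamp(-1 + 1e-7, 1 - 1e-7)
    idx = torch.arange(len(labels))
    theta = torch.acos(cos[idx, labels])  # angle to the ground-truth class
    cos = cos.clone()
    cos[idx, labels] = torch.cos(theta + margin_fn(theta))
    return F.cross_entropy(scale * cos, labels)

# Fixed 0.5 margin as a placeholder; X2-Softmax would supply its own curve.
loss = angular_margin_loss(
    torch.randn(8, 512), torch.randn(1000, 512), torch.randint(0, 1000, (8,)),
    margin_fn=lambda theta: torch.full_like(theta, 0.5),
)
```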
arXiv Detail & Related papers (2023-12-08T10:27:47Z)
- Toward High Quality Facial Representation Learning [58.873356953627614]
We propose a self-supervised pre-training framework called Mask Contrastive Face (MCF).
We use the feature map of a pre-trained visual backbone as a supervision item and use a partially pre-trained decoder for mask image modeling.
Our model achieves 0.932 NME_diag for AFLW-19 face alignment and a 93.96 F1 score for LaPa face parsing.
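A hedged sketch of the supervision scheme described above: masked-token features are regressed onto the feature map of a frozen pre-trained backbone instead of raw pixels. All module names and shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def mcf_style_mim_loss(encoder, decoder, target_backbone, images, mask):
    """Masked image modeling where the regression target is the feature map of
    a frozen pre-trained backbone rather than raw pixels. `mask` is a boolean
    (B, N) tensor over patch tokens; only masked tokens are supervised."""
    with torch.no_grad():
        target = target_backbone(images)   # (B, N, D) supervision features
    pred = decoder(encoder(images, mask))  # (B, N, D) reconstructed features
    return F.mse_loss(pred[mask], target[mask])

# Dummy wiring just to show the shapes (B=2 images, N=196 tokens, D=768).
B, N, D = 2, 196, 768
feats = torch.randn(B, N, D)
loss = mcf_style_mim_loss(
    encoder=lambda x, m: feats, decoder=lambda h: h,
    target_backbone=lambda x: torch.randn(B, N, D),
    images=torch.randn(B, 3, 224, 224), mask=torch.rand(B, N) > 0.4,
)
```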
arXiv Detail & Related papers (2023-09-07T09:11:49Z)
- SubFace: Learning with Softmax Approximation for Face Recognition [3.262192371833866]
SubFace is a softmax approximation method that employs the subspace feature to promote the performance of face recognition.
Comprehensive experiments conducted on benchmark datasets demonstrate that our method can significantly improve the performance of the vanilla CNN baseline.
arXiv Detail & Related papers (2022-08-24T12:31:08Z)
- Meta Balanced Network for Fair Face Recognition [51.813457201437195]
We systematically and scientifically study bias from both data and algorithm aspects.
We propose a novel meta-learning algorithm, called Meta Balanced Network (MBN), which learns adaptive margins in large margin loss.
Extensive experiments show that MBN successfully mitigates bias and learns more balanced performance for people with different skin tones in face recognition.
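A minimal first-order sketch of the adaptive-margin idea, assuming per-group margins trained on a balanced meta-batch; the actual MBN uses a proper meta-gradient through the model update, which is omitted here, and the group count and names are assumptions.

```python
import torch
import torch.nn.functional as F

# First-order sketch of adaptive per-group margins: the margins are themselves
# parameters, updated to minimize the loss on a balanced meta-batch.
num_groups = 4  # e.g., skin-tone groups (assumed)
margins = torch.full((num_groups,), 0.5, requires_grad=True)
opt_m = torch.optim.SGD([margins], lr=1e-3)

def meta_step(meta_logits, meta_labels, meta_groups):
    # Margin-penalised cross entropy where each sample uses its group's margin.
    idx = torch.arange(len(meta_labels))
    logits = meta_logits.clone()
    logits[idx, meta_labels] = logits[idx, meta_labels] - margins[meta_groups]
    loss = F.cross_entropy(logits, meta_labels)
    opt_m.zero_grad(); loss.backward(); opt_m.step()

meta_step(torch.randn(32, 1000), torch.randint(0, 1000, (32,)),
          torch.randint(0, num_groups, (32,)))
```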
arXiv Detail & Related papers (2022-05-13T10:25:44Z)
- Distinction Maximization Loss: Efficiently Improving Classification Accuracy, Uncertainty Estimation, and Out-of-Distribution Detection Simply Replacing the Loss and Calibrating [2.262407399039118]
We propose training deterministic deep neural networks using our DisMax loss.
DisMax usually outperforms all current approaches simultaneously in classification accuracy, uncertainty estimation, inference efficiency, and out-of-distribution detection.
arXiv Detail & Related papers (2022-05-12T04:37:35Z)
- ElasticFace: Elastic Margin Loss for Deep Face Recognition [6.865656740940772]
Learning discriminative face features plays a major role in building high-performing face recognition models.
Recent state-of-the-art face recognition solutions incorporate a fixed penalty margin into the classification loss function, the softmax loss.
We propose elastic margin loss (ElasticFace) that allows flexibility in the push for class separability.
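A sketch of the elastic idea, assuming the ElasticFace-Arc variant: the angular margin is redrawn per sample from a Gaussian around a base margin rather than held fixed. The hyperparameter values below are assumptions, not the paper's tuned settings.

```python
import torch
import torch.nn.functional as F

def elastic_arc_loss(embeddings, weight, labels, m=0.5, sigma=0.05, scale=64.0):
    """ElasticFace-Arc-style sketch: each sample's angular margin is drawn
    from a Gaussian N(m, sigma^2) instead of being fixed."""
    cos = F.linear(F.normalize(embeddings), F.normalize(weight)).clamp(-1 + 1e-7, 1 - 1e-7)
    idx = torch.arange(len(labels))
    theta = torch.acos(cos[idx, labels])
    margins = m + sigma * torch.randn_like(theta)  # per-sample elastic margin
    cos = cos.clone()
    cos[idx, labels] = torch.cos(theta + margins)
    return F.cross_entropy(scale * cos, labels)
```

Drawing the margin per sample relaxes the implicit assumption that every class is equally separable, which is the flexibility the entry refers to.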
arXiv Detail & Related papers (2021-09-20T10:31:50Z)
- Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection [89.43987367139724]
Face forgery detection is attracting ever-increasing interest in computer vision.
Recent works have achieved sound results, but significant problems remain.
A novel frequency-aware discriminative feature learning framework is proposed in this paper.
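A schematic reading of the single-center loss named in the title, with all specifics (label convention, margin handling, feature dimension) assumed rather than taken from the paper: natural faces are compressed toward one center while manipulated ones are repelled from it.

```python
import torch

def single_center_loss(feats, labels, center, margin=0.3):
    """Schematic single-center loss: natural faces (label 0) are pulled toward
    one learnable center, manipulated faces (label 1) are pushed at least
    `margin` away from it. Assumes both classes appear in the batch; the
    paper's exact formulation differs in its margin scaling."""
    d = (feats - center).norm(dim=1)
    pull = d[labels == 0].mean()                       # compress natural faces
    push = torch.relu(margin - d[labels == 1]).mean()  # repel manipulated faces
    return pull + push

center = torch.zeros(256, requires_grad=True)  # learnable center, assumed dim 256
loss = single_center_loss(torch.randn(16, 256), torch.randint(0, 2, (16,)), center)
```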
arXiv Detail & Related papers (2021-03-16T14:17:17Z)
- Loss Function Search for Face Recognition [75.79325080027908]
We develop a reward-guided search method to automatically obtain the best candidate.
Experimental results on a variety of face recognition benchmarks have demonstrated the effectiveness of our method.
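In spirit, reward-guided search can be as simple as the hedged sketch below: propose loss candidates, score each by a validation reward, keep the best. The paper's actual search space and controller are more structured; `train_and_eval` is an assumed user-supplied function.

```python
import random

def reward_guided_search(candidates, train_and_eval, n_trials=20, seed=0):
    """Generic reward-guided search sketch: sample a loss candidate, train
    briefly, score it by a validation reward, keep the best."""
    rng = random.Random(seed)
    best, best_reward = None, float("-inf")
    for _ in range(n_trials):
        cand = rng.choice(candidates)
        reward = train_and_eval(cand)  # e.g., validation accuracy
        if reward > best_reward:
            best, best_reward = cand, reward
    return best

# Toy usage: pick the margin value whose (mock) reward is highest.
best_margin = reward_guided_search([0.3, 0.4, 0.5], lambda m: -abs(m - 0.4))
```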
arXiv Detail & Related papers (2020-07-10T03:40:10Z)
- Taming GANs with Lookahead-Minmax [63.90038365274479]
Experimental results on MNIST, SVHN, CIFAR-10, and ImageNet demonstrate a clear advantage of combining Lookahead-minmax with Adam or extragradient.
Using 30-fold fewer parameters and 16-fold smaller minibatches, we outperform the reported performance of the class-dependent BigGAN on CIFAR-10, obtaining an FID of 12.19 without using class labels.
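The Lookahead mechanism itself is compact: run k fast optimizer steps, then interpolate back toward a slow snapshot of the weights. The sketch below applies it jointly to both GAN players, with `inner_step` an assumed callable performing one ordinary G/D update (e.g., with Adam or extragradient).

```python
import torch

def lookahead_minmax_step(G, D, inner_step, k=5, alpha=0.5):
    """Sketch of joint lookahead for GAN training: take k fast updates of both
    players with the base optimizer, then pull every weight back toward its
    slow snapshot: w <- w_slow + alpha * (w_fast - w_slow)."""
    params = list(G.parameters()) + list(D.parameters())
    slow = [p.detach().clone() for p in params]
    for _ in range(k):
        inner_step(G, D)  # one ordinary generator/discriminator update
    with torch.no_grad():
        for p, s in zip(params, slow):
            p.copy_(s + alpha * (p - s))
```

The periodic backtracking toward the slow weights is intended to damp the rotational dynamics of the minmax game.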
arXiv Detail & Related papers (2020-06-25T17:13:23Z)
- More Information Supervised Probabilistic Deep Face Embedding Learning [10.52667214402514]
We analyse margin-based softmax loss from a probability view.
An auto-encoder architecture called Linear-Auto-TS-Encoder (LATSE) is proposed to corroborate this finding.
arXiv Detail & Related papers (2020-06-08T12:33:32Z)