NPCFace: Negative-Positive Collaborative Training for Large-scale Face Recognition
- URL: http://arxiv.org/abs/2007.10172v3
- Date: Fri, 14 May 2021 11:37:18 GMT
- Title: NPCFace: Negative-Positive Collaborative Training for Large-scale Face Recognition
- Authors: Dan Zeng, Hailin Shi, Hang Du, Jun Wang, Zhen Lei, and Tao Mei
- Abstract summary: We study how to make better use of hard samples to improve training.
The correlation between hard positives and hard negatives is overlooked, as is the relation between the margins in the positive and negative logits.
We propose a novel Negative-Positive Collaboration loss, named NPCFace, which emphasizes training on both hard negative and hard positive cases.
- Score: 78.21084529159577
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The training scheme of deep face recognition has evolved greatly in the past
years, yet it encounters new challenges in the large-scale data setting, where massive
and diverse hard cases occur. Especially in the range of low false accept rate (FAR),
there are various hard cases among both positives (intra-class) and negatives
(inter-class). In this paper, we study how to make better use of these hard samples to
improve training. Prior work approaches this with margin-based formulations applied to
either the positive logit or the negative logits. However, the correlation between hard
positives and hard negatives is overlooked, as is the relation between the margins in
the positive and negative logits. We find that this correlation is significant,
especially on large-scale datasets, and that one can take advantage of it to boost
training by relating the positive and negative margins of each training sample. To this
end, we propose an explicit, sample-wise collaboration between positive and negative
margins. Given a batch of hard samples, a novel Negative-Positive Collaboration loss,
named NPCFace, is formulated, which emphasizes training on both hard negative and hard
positive cases via a collaborative-margin mechanism in the softmax logits, and also
provides a clearer interpretation of the negative-positive hardness correlation.
Besides, the emphasis is implemented with an improved formulation to achieve stable
convergence and flexible parameter setting. We validate the effectiveness of our
approach on various large-scale face recognition benchmarks and obtain advantageous
results, especially in the low FAR range.
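To make the collaborative-margin idea more concrete, below is a minimal PyTorch-style sketch of a cosine-margin softmax loss in which each sample's positive margin grows with the hardness of its negatives. The coupling rule, the base margin m0, the scale m1, the hardness threshold, and the function name are illustrative assumptions chosen for intuition; this is not the paper's exact NPCFace formulation.

```python
import torch
import torch.nn.functional as F

def collaborative_margin_loss(embeddings, weights, labels,
                              s=64.0, m0=0.40, m1=0.20, neg_thresh=0.0):
    """Sketch of a cosine-margin softmax whose positive margin is coupled to
    the hardness of each sample's negatives (illustrative, not exact NPCFace).

    embeddings: (B, D) feature vectors
    weights:    (C, D) class centers
    labels:     (B,)  ground-truth class indices
    """
    # Cosine similarity between every sample and every class center.
    cos = F.linear(F.normalize(embeddings), F.normalize(weights))      # (B, C)

    one_hot = F.one_hot(labels, num_classes=cos.size(1)).bool()        # (B, C)
    pos = cos[one_hot]                                                  # (B,)

    # "Hard" negatives: negative-class cosines above a threshold (assumption).
    hard = (~one_hot) & (cos > neg_thresh)
    # Average hardness of each sample's negatives, roughly in [0, 1].
    hardness = (cos.clamp(min=0) * hard).sum(1) / hard.sum(1).clamp(min=1)

    # Collaborative margin: a base margin plus a term driven by negative
    # hardness, so harder samples receive a larger positive margin.
    m = m0 + m1 * hardness                                              # (B,)

    # Subtract the margin from the positive logit and scale, CosFace-style.
    logits = torch.where(one_hot, (pos - m).unsqueeze(1), cos)
    return F.cross_entropy(s * logits, labels)
```

In this sketch, a sample whose negative cosines are high (i.e., it lies close to confusing classes) receives a larger positive margin, mirroring the qualitative behaviour described in the abstract: the positive and negative sides of the softmax collaborate on a per-sample basis.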
Related papers
- Curriculum Negative Mining For Temporal Networks [33.70909189731187]
Temporal networks are effective in capturing the evolving interactions of networks over time.
CurNM is a model-aware curriculum learning framework that adaptively adjusts the difficulty of negative samples.
Our method outperforms baseline methods by a significant margin.
arXiv Detail & Related papers (2024-07-24T07:55:49Z)
- Negating Negatives: Alignment with Human Negative Samples via Distributional Dispreference Optimization [37.8788435790632]
Large language models (LLMs) have revolutionized the role of AI, yet pose potential social risks.
Existing methods rely on high-quality positive-negative training pairs, suffering from noisy positive responses that are barely distinguishable from negative ones.
We propose Distributional Dispreference Optimization (D$^2$O), which maximizes the discrepancy between dispreferred responses and the generated non-negative ones.
arXiv Detail & Related papers (2024-03-06T03:02:38Z)
- Investigating the Role of Negatives in Contrastive Representation Learning [59.30700308648194]
Noise contrastive learning is a popular technique for unsupervised representation learning.
We focus on disambiguating the role of one key parameter: the number of negative examples.
We find that the results broadly agree with our theory, while our vision experiments are murkier, with performance sometimes even insensitive to the number of negatives.
arXiv Detail & Related papers (2021-06-18T06:44:16Z)
- Rethinking InfoNCE: How Many Negative Samples Do You Need? [54.146208195806636]
We study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework.
We estimate the optimal negative sampling ratio using the $K$ value that maximizes the training effectiveness function. (A generic InfoNCE sketch with the negative count $K$ made explicit appears after this list.)
arXiv Detail & Related papers (2021-05-27T08:38:29Z)
- Revisiting the Negative Data of Distantly Supervised Relation Extraction [17.00557139562208]
Distant supervision automatically generates plenty of training samples for relation extraction.
It also incurs two major problems: noisy labels and imbalanced training data.
We propose a pipeline approach, dubbed ReRe, that performs sentence-level relation detection and then subject/object extraction.
arXiv Detail & Related papers (2021-05-21T06:44:19Z)
- Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the one-class nature of the problem.
Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem.
We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z)
- Contrastive Attraction and Contrastive Repulsion for Representation Learning [131.72147978462348]
Contrastive learning (CL) methods learn data representations in a self-supervised manner, where the encoder contrasts each positive sample against multiple negative samples.
Recent CL methods have achieved promising results when pretrained on large-scale datasets, such as ImageNet.
We propose a doubly CL strategy that separately compares positive and negative samples within their own groups, and then proceeds with a contrast between positive and negative groups.
arXiv Detail & Related papers (2021-05-08T17:25:08Z)
- Doubly Contrastive Deep Clustering [135.7001508427597]
We present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views.
Specifically, for the sample view, we set the class distribution of the original sample and its augmented version as positive sample pairs.
For the class view, we build the positive and negative pairs from the sample distribution of the class.
In this way, the two contrastive losses constrain the clustering results of mini-batch samples at both the sample and class levels.
arXiv Detail & Related papers (2021-03-09T15:15:32Z)
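As a companion to the "Rethinking InfoNCE" entry above, here is a generic PyTorch sketch of the InfoNCE objective written so that the number of negatives K is an explicit argument; it is a standard textbook formulation for illustration, not that paper's implementation or its procedure for choosing K.

```python
import torch
import torch.nn.functional as F

def info_nce(query, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss with an explicit number of negatives K.

    query:     (B, D) anchor representations
    positive:  (B, D) matching positive views
    negatives: (B, K, D) K negative samples per anchor
    """
    q = F.normalize(query, dim=-1)
    p = F.normalize(positive, dim=-1)
    n = F.normalize(negatives, dim=-1)

    # Similarity to the positive (B, 1) and to the K negatives (B, K).
    pos_logit = (q * p).sum(dim=-1, keepdim=True)
    neg_logits = torch.einsum('bd,bkd->bk', q, n)

    # Cross-entropy over [positive, negatives]; the target is index 0.
    logits = torch.cat([pos_logit, neg_logits], dim=1) / temperature
    labels = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits, labels)
```

With this formulation, sweeping K only changes the width of the negative-logit block that the softmax normalizes over, which is the quantity whose optimal value the paper above studies.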