Multi-label Contrastive Predictive Coding
- URL: http://arxiv.org/abs/2007.09852v2
- Date: Wed, 2 Dec 2020 20:05:39 GMT
- Title: Multi-label Contrastive Predictive Coding
- Authors: Jiaming Song and Stefano Ermon
- Abstract summary: Variational mutual information (MI) estimators are widely used in unsupervised representation learning methods such as contrastive predictive coding (CPC).
We introduce a novel estimator based on a multi-label classification problem, where the critic needs to jointly identify multiple positive samples at the same time.
We show that, using the same number of negative samples, multi-label CPC is able to exceed the $\log m$ bound while still being a valid lower bound on mutual information.
- Score: 125.03510235962095
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variational mutual information (MI) estimators are widely used in
unsupervised representation learning methods such as contrastive predictive
coding (CPC). A lower bound on MI can be obtained from a multi-class
classification problem, where a critic attempts to distinguish a positive
sample drawn from the underlying joint distribution from $(m-1)$ negative
samples drawn from a suitable proposal distribution. Using this approach, MI
estimates are bounded above by $\log m$ and can thus severely underestimate the
true MI unless $m$ is very large. To overcome this limitation, we introduce a
novel estimator based on a multi-label classification problem, where the critic
needs to jointly identify multiple positive samples at the same time. We show
that, using the same number of negative samples, multi-label CPC is able to
exceed the $\log m$ bound while still being a valid lower bound on mutual
information. We demonstrate that the proposed approach leads to better mutual
information estimation, yields empirical improvements in unsupervised
representation learning, and beats a state-of-the-art knowledge distillation
method on 10 out of 13 tasks.
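To make the $\log m$ ceiling concrete, here is a minimal NumPy sketch of the standard single-positive CPC (InfoNCE) estimator; the function name and the convention that column 0 holds the positive score are illustrative assumptions, not the paper's code. The paper's multi-label variant breaks this ceiling by asking the critic to identify several positives jointly.

```python
import numpy as np

def infonce_bound(scores):
    """Single-positive CPC (InfoNCE) lower bound on mutual information.

    scores: (n, m) critic values; in each row, column 0 scores the
    positive pair and columns 1..m-1 score the negatives. The bound
    equals log(m) minus a softmax cross-entropy term, so it can never
    exceed log(m), however good the critic is.
    """
    n, m = scores.shape
    shifted = scores - scores.max(axis=1, keepdims=True)  # stable log-softmax
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return np.log(m) + log_softmax[:, 0].mean()

rng = np.random.default_rng(0)
print(infonce_bound(rng.normal(size=(512, 16))))  # uninformative critic: near 0
perfect = np.zeros((512, 16))
perfect[:, 0] = 50.0
print(infonce_bound(perfect))  # ~log(16) = 2.77, the ceiling
```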
Related papers
- Regularized Contrastive Partial Multi-view Outlier Detection [76.77036536484114]
We propose a novel method named Regularized Contrastive Partial Multi-view Outlier Detection (RCPMOD).
In this framework, we utilize contrastive learning to learn view-consistent information and distinguish outliers by their degree of cross-view consistency.
Experimental results on four benchmark datasets demonstrate that the proposed approach outperforms state-of-the-art competitors.
arXiv Detail & Related papers (2024-08-02T14:34:27Z)
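As a generic illustration of "distinguishing outliers by degree of consistency" (not RCPMOD's actual objective), a sketch that scores each sample by the cosine agreement of its two view embeddings:

```python
import numpy as np

def consistency_outlier_scores(z1, z2, eps=1e-8):
    # z1, z2: (n, d) embeddings of the same n samples under two views.
    # Inliers should embed consistently across views; low cosine
    # similarity between views therefore flags a likely outlier.
    z1 = z1 / (np.linalg.norm(z1, axis=1, keepdims=True) + eps)
    z2 = z2 / (np.linalg.norm(z2, axis=1, keepdims=True) + eps)
    return 1.0 - (z1 * z2).sum(axis=1)  # higher score = less consistent
```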
- Boosting Few-Shot Text Classification via Distribution Estimation [38.99459686893034]
We propose two simple yet effective strategies to estimate the distributions of novel classes by utilizing unlabeled query samples.
Specifically, we first assume that a class or sample follows a Gaussian distribution, and use the original support set together with the nearest few query samples to estimate its parameters.
Then, we augment the labeled samples by sampling from the estimated distribution, which provides sufficient supervision for training the classification model.
arXiv Detail & Related papers (2023-03-26T05:58:39Z)
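A loose sketch of the described strategy, under stated assumptions (embedding inputs, Euclidean nearest-query selection, a small ridge on the covariance; all names are illustrative, not the paper's code):

```python
import numpy as np

def estimate_and_augment(support, queries, k=5, n_samples=50, seed=0):
    # Fit a class-level Gaussian from the few labeled support embeddings
    # plus the k unlabeled query embeddings closest to the support mean,
    # then draw extra pseudo-labeled samples from it for training.
    mu0 = support.mean(axis=0)
    nearest = np.argsort(np.linalg.norm(queries - mu0, axis=1))[:k]
    pooled = np.vstack([support, queries[nearest]])
    mu = pooled.mean(axis=0)
    cov = np.cov(pooled, rowvar=False) + 1e-3 * np.eye(pooled.shape[1])
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mu, cov, size=n_samples)
```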
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach built on pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for its limitations is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
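A rough sketch of clustering-based pseudo-labeling in the spirit of CACTUs: cluster unlabeled embeddings, treat cluster assignments as labels, and sample N-way K-shot tasks from them. The task-sampling details below are assumptions, not the paper's procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def make_pseudo_task(embeddings, n_clusters=50, n_way=5, k_shot=1, seed=0):
    # Cluster unlabeled embeddings and use cluster ids as pseudo-labels.
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(embeddings)
    rng = np.random.default_rng(seed)
    # Sample an N-way task from clusters that have enough members.
    ok = [c for c in range(n_clusters) if (labels == c).sum() >= k_shot]
    ways = rng.choice(ok, size=n_way, replace=False)
    return {c: rng.choice(np.where(labels == c)[0], size=k_shot,
                          replace=False) for c in ways}
```

A supervised meta-learner can then train on such tasks as if they were labeled.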
- Rethinking Collaborative Metric Learning: Toward an Efficient Alternative without Negative Sampling [156.7248383178991]
The Collaborative Metric Learning (CML) paradigm has attracted wide interest in the area of recommendation systems (RS).
We find that negative sampling leads to a biased estimate of the generalization error.
Motivated by this, we propose an efficient alternative to CML without negative sampling, named Sampling-Free Collaborative Metric Learning (SFCML).
arXiv Detail & Related papers (2022-06-23T08:50:22Z)
- False membership rate control in mixture models [1.387448620257867]
A clustering task consists in partitioning elements of a sample into homogeneous groups.
In the supervised setting, abstaining on ambiguous points is well known and referred to as classification with an abstention option.
In this paper, the approach is revisited in an unsupervised mixture model framework, with the goal of developing a method that guarantees the false membership rate does not exceed a predefined nominal level.
arXiv Detail & Related papers (2022-03-04T22:37:59Z)
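As a simplified illustration (not the paper's calibrated procedure), cluster membership with an abstention option can be expressed as thresholding a mixture model's posterior responsibilities; choosing the threshold so the false membership rate stays below the nominal level is the paper's contribution and is omitted here.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def cluster_with_abstention(X, n_components=3, tau=0.9, seed=0):
    # Assign a point to a cluster only when the posterior probability of
    # its best cluster exceeds tau; otherwise abstain (label -1). A fixed
    # tau is a crude stand-in for the paper's calibrated rule.
    gmm = GaussianMixture(n_components=n_components, random_state=seed).fit(X)
    post = gmm.predict_proba(X)
    best = post.argmax(axis=1)
    return np.where(post.max(axis=1) >= tau, best, -1)
```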
- Mixture Proportion Estimation and PU Learning: A Modern Approach [47.34499672878859]
Given only positive examples and unlabeled examples, we might hope to estimate an accurate positive-versus-negative classifier.
Classical methods for both problems break down in high-dimensional settings.
We propose two simple techniques: Best Bin Estimation (BBE) and Conditional Value Ignoring Risk (CVIR).
arXiv Detail & Related papers (2021-11-01T14:42:23Z)
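The summary only names the techniques; as a hedged sketch of how best-bin-style mixture proportion estimation typically works (the threshold grid and confidence slack below are illustrative simplifications, not the paper's exact BBE procedure):

```python
import numpy as np

def best_bin_proportion(pos_scores, unl_scores, n_grid=100, delta=0.1):
    # Scores come from a positive-vs-unlabeled classifier on held-out data.
    # For a threshold t, the tail-mass ratio P_unl(s >= t) / P_pos(s >= t)
    # upper-bounds the fraction of positives hidden in the unlabeled set;
    # pick the threshold whose estimate has the tightest confidence bound.
    n_p, n_u = len(pos_scores), len(unl_scores)
    slack_p = np.sqrt(np.log(2 / delta) / (2 * n_p))
    slack_u = np.sqrt(np.log(2 / delta) / (2 * n_u))
    best_ub, best_est = np.inf, None
    for t in np.quantile(pos_scores, np.linspace(0.0, 0.95, n_grid)):
        qp = (pos_scores >= t).mean()
        qu = (unl_scores >= t).mean()
        if qp <= slack_p:  # tail too thin for a reliable ratio
            continue
        ub = (qu + slack_u) / (qp - slack_p)
        if ub < best_ub:
            best_ub, best_est = ub, qu / qp
    return best_est
```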
- Binary classification with ambiguous training data [69.50862982117127]
In supervised learning, we often face ambiguous (A) samples that are difficult to label even for domain experts.
This problem is substantially different from semi-supervised learning, since unlabeled samples are not necessarily difficult ones.
arXiv Detail & Related papers (2020-11-05T00:53:58Z)
- Optimal Off-Policy Evaluation from Multiple Logging Policies [77.62012545592233]
We study off-policy evaluation (OPE) from multiple logging policies, each generating a dataset of fixed size, i.e., stratified sampling.
We derive the OPE estimator for multiple loggers that achieves minimum variance for any instance, i.e., the efficient estimator.
arXiv Detail & Related papers (2020-10-21T13:43:48Z)
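As context rather than the paper's efficient estimator: a natural baseline combines per-logger importance-weighted estimates by inverse-variance weighting, which is minimum-variance among convex combinations of unbiased stratum estimates. A sketch under that assumption:

```python
import numpy as np

def combined_ips(strata):
    # strata: list of (rewards, behavior_probs, target_probs) arrays, one
    # triple per logging policy. Each stratum yields an unbiased IPS
    # estimate; weighting strata by inverse sample variance minimizes the
    # variance of the combined estimate (a baseline, not the paper's
    # efficient estimator).
    ests, variances = [], []
    for r, p_b, p_t in strata:
        w = p_t / p_b  # importance weights
        ests.append(np.mean(w * r))
        variances.append(np.var(w * r, ddof=1) / len(r))
    inv = 1.0 / np.array(variances)
    return float(np.sum(inv * np.array(ests)) / np.sum(inv))
```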