KDMCSE: Knowledge Distillation Multimodal Sentence Embeddings with Adaptive Angular margin Contrastive Learning
- URL: http://arxiv.org/abs/2403.17486v1
- Date: Tue, 26 Mar 2024 08:32:39 GMT
- Title: KDMCSE: Knowledge Distillation Multimodal Sentence Embeddings with Adaptive Angular margin Contrastive Learning
- Authors: Cong-Duy Nguyen, Thong Nguyen, Xiaobao Wu, Anh Tuan Luu
- Abstract summary: We propose KDMCSE, a novel approach that enhances the discrimination and generalizability of multimodal representation.
We also introduce a new contrastive objective, AdapACSE, that enhances the discriminative representation by strengthening the margin within the angular space.
- Score: 31.139620652818838
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous work on multimodal sentence embeddings has proposed multimodal contrastive learning and achieved promising results. However, by taking the rest of the batch as negative samples without any screening when forming contrastive pairs, those studies admit many suspicious and noisy negatives, which significantly hurts overall performance. In this work, we propose KDMCSE (Knowledge Distillation Multimodal contrastive learning of Sentence Embeddings), a novel approach that enhances the discrimination and generalizability of multimodal representations: it inherits knowledge from a teacher model to learn the difference between positive and negative instances and can thereby detect noisy and wrong negative samples before they enter the contrastive objective. Furthermore, to better model the variation among negative pairs, we introduce a new contrastive objective, AdapACSE (Adaptive Angular Margin Supervised Contrastive Learning for Multimodal sentence embeddings), which strengthens the discriminative representation by enforcing a margin in angular space while capturing the varying semantics among negatives. Experimental results on widely used Semantic Textual Similarity (STS) benchmarks demonstrate the effectiveness of our approach.
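As a rough, unofficial illustration of the angular-margin idea (not the authors' released code), the sketch below adds an additive margin to the positive pair's angle inside an InfoNCE-style objective. The function name, tensor shapes, fixed margin, and temperature are all assumptions; the paper's adaptive per-pair margin and teacher-based filtering of negatives are only indicated in comments.

```python
import torch
import torch.nn.functional as F

def angular_margin_contrastive(anchor, positive, negatives, margin=0.1, tau=0.05):
    """Sketch of an InfoNCE-style loss with an additive margin in angular space.

    Assumed shapes: anchor and positive are (B, d); negatives is (B, K, d).
    A fixed margin is used here; AdapACSE adapts the margin, and KDMCSE would
    additionally screen out negatives that a teacher model scores as too close
    to the anchor. Neither refinement is reproduced exactly.
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    # Positive-pair similarity, pushed apart by a margin added in angle space.
    cos_pos = (anchor * positive).sum(-1).clamp(-1 + 1e-7, 1 - 1e-7)  # (B,)
    cos_pos_m = torch.cos(torch.acos(cos_pos) + margin)

    # Similarities to the candidate negatives (already filtered, in KDMCSE's case).
    cos_neg = torch.einsum("bd,bkd->bk", anchor, negatives)           # (B, K)

    logits = torch.cat([cos_pos_m.unsqueeze(1), cos_neg], dim=1) / tau
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```

Per the abstract, the negatives fed to such an objective would first be screened with the teacher model's similarity scores, so that suspicious in-batch examples never reach the denominator.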
Related papers
- Improving Contrastive Learning of Sentence Embeddings with Focal-InfoNCE [13.494159547236425]
This study introduces an unsupervised contrastive learning framework that combines SimCSE with hard negative mining.
The proposed Focal-InfoNCE function introduces self-paced modulation terms into the contrastive objective, down-weighting the loss associated with easy negatives and encouraging the model to focus on hard negatives (a rough sketch follows this entry).
arXiv Detail & Related papers (2023-10-10T18:15:24Z)
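For intuition only, a minimal down-weighting of easy negatives in an InfoNCE denominator might look like the sketch below; the focusing function, the `gamma` parameter, and the tensor shapes are assumptions and need not match the cited paper's exact modulation terms.

```python
import torch
import torch.nn.functional as F

def focal_weighted_infonce(anchor, positive, negatives, tau=0.05, gamma=2.0):
    """Illustrative focal-style InfoNCE: negatives the model already separates
    well (low cosine similarity) are down-weighted, so hard negatives dominate.

    Assumed shapes: anchor and positive are (B, d); negatives is (B, K, d).
    """
    anchor, positive, negatives = (F.normalize(t, dim=-1) for t in (anchor, positive, negatives))
    cos_pos = (anchor * positive).sum(-1, keepdim=True)              # (B, 1)
    cos_neg = torch.einsum("bd,bkd->bk", anchor, negatives)          # (B, K)

    # Map cosine in [-1, 1] to [0, 1]: easy negatives get weights near 0,
    # hard (high-similarity) negatives get weights near 1.
    weights = ((cos_neg + 1.0) / 2.0).pow(gamma).detach()

    denom = (cos_pos / tau).exp() + (weights * (cos_neg / tau).exp()).sum(-1, keepdim=True)
    return (-(cos_pos / tau) + denom.log()).mean()
```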
- Contrastive Bayesian Analysis for Deep Metric Learning [30.21464199249958]
We develop a contrastive Bayesian analysis to characterize and model the posterior probabilities of image labels conditioned on their feature similarity.
This contrastive Bayesian analysis leads to a new loss function for deep metric learning.
Our experimental results and ablation studies demonstrate that the proposed contrastive Bayesian metric learning method significantly improves the performance of deep metric learning.
arXiv Detail & Related papers (2022-10-10T02:24:21Z)
- Adversarial Contrastive Learning via Asymmetric InfoNCE [64.42740292752069]
We propose to treat adversarial samples unequally when contrasted with an asymmetric InfoNCE objective.
In this asymmetric fashion, the adverse impacts of conflicting objectives between CL and adversarial learning can be effectively mitigated.
Experiments show that our approach consistently outperforms existing Adversarial CL methods.
arXiv Detail & Related papers (2022-07-18T04:14:36Z)
- Unsupervised Voice-Face Representation Learning by Cross-Modal Prototype Contrast [34.58856143210749]
We present an approach to learn voice-face representations from talking face videos, without any identity labels.
Previous works employ cross-modal instance discrimination tasks to establish the correlation of voice and face.
We propose cross-modal prototype contrastive learning (CMPC), which takes advantage of contrastive methods and resists the adverse effects of false negatives and deviated positives.
arXiv Detail & Related papers (2022-04-28T07:28:56Z)
- Modulated Contrast for Versatile Image Synthesis [60.304183493234376]
MoNCE is a versatile metric that introduces image contrast to learn a calibrated metric for the perception of multifaceted inter-image distances.
We introduce optimal transport in MoNCE to modulate the pushing force of negative samples collaboratively across multiple contrastive objectives.
arXiv Detail & Related papers (2022-03-17T14:03:46Z)
- Investigating the Role of Negatives in Contrastive Representation Learning [59.30700308648194]
Noise contrastive learning is a popular technique for unsupervised representation learning.
We focus on disambiguating the role of one of these parameters: the number of negative examples.
We find that the results broadly agree with our theory, while our vision experiments are murkier, with performance sometimes even insensitive to the number of negatives.
arXiv Detail & Related papers (2021-06-18T06:44:16Z)
- Incremental False Negative Detection for Contrastive Learning [95.68120675114878]
We introduce a novel incremental false negative detection method for self-supervised contrastive learning.
During contrastive learning, we discuss two strategies to explicitly remove the detected false negatives.
Our proposed method outperforms other self-supervised contrastive learning frameworks on multiple benchmarks within a limited compute.
arXiv Detail & Related papers (2021-06-07T15:29:14Z)
- Solving Inefficiency of Self-supervised Representation Learning [87.30876679780532]
Existing contrastive learning methods suffer from very low learning efficiency.
Under-clustering and over-clustering problems are major obstacles to learning efficiency.
We propose a novel self-supervised learning framework using a median triplet loss.
arXiv Detail & Related papers (2021-04-18T07:47:10Z)
- Contrastive Learning with Hard Negative Samples [80.12117639845678]
We develop a new family of unsupervised sampling methods for selecting hard negative samples.
A limiting case of this sampling results in a representation that tightly clusters each class, and pushes different classes as far apart as possible.
The proposed method improves downstream performance across multiple modalities, requires only a few additional lines of code to implement, and introduces no computational overhead (a rough sketch follows this entry).
arXiv Detail & Related papers (2020-10-09T14:18:53Z)
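As a hedged sketch of the hard-negative idea from the entry above (not the authors' exact estimator; in particular, their correction for false negatives is omitted), in-batch negatives can be reweighted in proportion to their similarity to the anchor:

```python
import torch
import torch.nn.functional as F

def hard_negative_infonce(anchor, positive, negatives, tau=0.5, beta=1.0):
    """Reweight in-batch negatives towards harder ones via importance weights.

    Assumed shapes: anchor and positive are (B, d); negatives is (B, K, d).
    `beta` controls how sharply high-similarity (hard) negatives are emphasised;
    beta = 0 recovers plain InfoNCE weighting.
    """
    anchor, positive, negatives = (F.normalize(t, dim=-1) for t in (anchor, positive, negatives))
    sim_pos = (anchor * positive).sum(-1) / tau                      # (B,)
    sim_neg = torch.einsum("bd,bkd->bk", anchor, negatives) / tau    # (B, K)

    # Importance weights concentrate mass on the negatives closest to the anchor.
    weights = torch.softmax(beta * sim_neg, dim=-1).detach()         # (B, K), rows sum to 1
    neg_term = negatives.size(1) * (weights * sim_neg.exp()).sum(-1) # rescale to K effective negatives

    return (-sim_pos + (sim_pos.exp() + neg_term).log()).mean()
```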
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.