HiCL: Hierarchical Contrastive Learning of Unsupervised Sentence Embeddings
- URL: http://arxiv.org/abs/2310.09720v1
- Date: Sun, 15 Oct 2023 03:14:33 GMT
- Title: HiCL: Hierarchical Contrastive Learning of Unsupervised Sentence Embeddings
- Authors: Zhuofeng Wu, Chaowei Xiao, VG Vinod Vydiswaran
- Abstract summary: HiCL considers local segment-level and global sequence-level relationships to improve training efficiency and effectiveness.
In experiments, HiCL enhances the prior top-performing SNCSE model across seven extensively evaluated STS tasks.
- Score: 31.50124610417377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a hierarchical contrastive learning framework,
HiCL, which considers local segment-level and global sequence-level
relationships to improve training efficiency and effectiveness. Traditional
methods typically encode a sequence in its entirety for contrast with others,
often neglecting local representation learning, leading to challenges in
generalizing to shorter texts. Conversely, HiCL improves its effectiveness by
dividing the sequence into several segments and employing both local and global
contrastive learning to model segment-level and sequence-level relationships.
Further, considering the quadratic time complexity of transformers over input
tokens, HiCL boosts training efficiency by first encoding short segments and
then aggregating them to obtain the sequence representation. Extensive
experiments show that HiCL enhances the prior top-performing SNCSE model across
seven extensively evaluated STS tasks, with an average increase of +0.2%
observed on BERT-large and +0.44% on RoBERTa-large.
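The abstract's two-level training scheme can be made concrete with a short sketch: encode short segments, mean-pool them into a sequence representation, and apply an InfoNCE loss at both the segment (local) and sequence (global) levels. The segment length, the dropout-based construction of positive pairs, the mean-pooling aggregator, and the weighting `alpha` below are illustrative assumptions rather than the paper's exact configuration.

```python
# Minimal sketch of hierarchical contrastive learning: segment-level ("local")
# and sequence-level ("global") InfoNCE terms over segment embeddings.
import torch
import torch.nn.functional as F


def info_nce(a: torch.Tensor, b: torch.Tensor, tau: float = 0.05) -> torch.Tensor:
    """a[i] and b[i] are positives; all other pairs in the batch are negatives."""
    logits = F.normalize(a, dim=-1) @ F.normalize(b, dim=-1).T / tau
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)


def hierarchical_contrastive_loss(encoder, token_ids, seg_len=32, alpha=0.5):
    """encoder: hypothetical stand-in mapping (num_segments, seg_len) token ids
    to (num_segments, dim) embeddings with dropout active."""
    b, n = token_ids.shape
    n_seg = n // seg_len
    segments = token_ids[:, : n_seg * seg_len].reshape(b * n_seg, seg_len)

    # Two stochastic encoder passes give two "views" of every segment
    # (dropout-as-augmentation, a SimCSE-style assumption).
    z1 = encoder(segments)                      # (b * n_seg, dim)
    z2 = encoder(segments)

    local = info_nce(z1, z2)                    # segment-level contrast

    # Aggregate segment embeddings (mean pooling here) into sequence embeddings.
    s1 = z1.reshape(b, n_seg, -1).mean(dim=1)
    s2 = z2.reshape(b, n_seg, -1).mean(dim=1)
    global_ = info_nce(s1, s2)                  # sequence-level contrast

    return alpha * local + (1 - alpha) * global_
```

Because the encoder only ever sees `seg_len`-token inputs, the quadratic attention cost is paid over short segments rather than the full sequence, which is the efficiency argument the abstract makes.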
Related papers
- Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation [19.749490092520006]
Self-Calibrated CLIP (SC-CLIP) is a training-free method that calibrates CLIP to produce finer-grained representations.
SC-CLIP boosts the performance of vanilla CLIP ViT-L/14 by 6.8 times.
arXiv Detail & Related papers (2024-11-24T15:14:05Z)
- L^2CL: Embarrassingly Simple Layer-to-Layer Contrastive Learning for Graph Collaborative Filtering [33.165094795515785]
Graph neural networks (GNNs) have recently emerged as an effective approach to model neighborhood signals in collaborative filtering.
We propose L2CL, a principled Layer-to-Layer Contrastive Learning framework that contrasts representations from different layers.
We find that L2CL, using only one-hop contrastive learning paradigm, is able to capture intrinsic semantic structures and improve the quality of node representation.
arXiv Detail & Related papers (2024-07-19T12:45:21Z)
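To illustrate the layer-to-layer idea summarized above, here is a hedged sketch in which each node's layer-0 representation is contrasted against its own one-hop (layer-1) representation. The LightGCN-style propagation step and the temperature are assumptions made for illustration, not L2CL's exact design.

```python
# Layer-to-layer contrastive sketch: a node's embedding (layer 0) and its
# one-hop aggregated representation (layer 1) form a positive pair.
import torch
import torch.nn.functional as F


def propagate(adj_norm: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """One step of neighborhood aggregation (normalized adjacency times features)."""
    return adj_norm @ x


def layer_to_layer_loss(adj_norm, node_emb, tau=0.2):
    h0 = node_emb                        # layer-0 representations (the embeddings themselves)
    h1 = propagate(adj_norm, node_emb)   # layer-1 (one-hop) representations

    z0, z1 = F.normalize(h0, dim=-1), F.normalize(h1, dim=-1)
    logits = z0 @ z1.T / tau             # (num_nodes, num_nodes)
    labels = torch.arange(node_emb.size(0), device=node_emb.device)
    # Each node's layer-0 view is pulled toward its own layer-1 view
    # and pushed away from other nodes' layer-1 views.
    return F.cross_entropy(logits, labels)
```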
- Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens [57.37893387775829]
We introduce a fast and balanced clustering method, named Semantic Equitable Clustering (SEC).
SEC clusters tokens based on their global semantic relevance in an efficient, straightforward manner.
We propose a versatile vision backbone, SECViT, to serve as a vision language connector.
arXiv Detail & Related papers (2024-05-22T04:49:00Z)
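One plausible reading of "fast and balanced clustering by global semantic relevance" is sketched below: score each token by similarity to a global token (mean-pooled here), sort the tokens, and chunk the ranking into equal-size clusters. The choice of global token and the chunking rule are assumptions, not SEC's published procedure.

```python
# Balanced token clustering sketch: rank tokens by relevance to a global token,
# then split the ranking into equal-size groups.
import torch
import torch.nn.functional as F


def equitable_cluster(tokens: torch.Tensor, num_clusters: int) -> torch.Tensor:
    """tokens: (num_tokens, dim). Returns cluster ids of shape (num_tokens,)."""
    global_token = tokens.mean(dim=0, keepdim=True)                  # assumed global summary, (1, dim)
    relevance = F.cosine_similarity(tokens, global_token, dim=-1)    # (num_tokens,)
    order = relevance.argsort(descending=True)                       # tokens ranked by relevance

    cluster_ids = torch.empty(tokens.size(0), dtype=torch.long, device=tokens.device)
    cluster_size = tokens.size(0) // num_clusters
    for c in range(num_clusters):
        start = c * cluster_size
        end = tokens.size(0) if c == num_clusters - 1 else start + cluster_size
        cluster_ids[order[start:end]] = c                            # equal-size ("equitable") groups
    return cluster_ids
```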
- RankCLIP: Ranking-Consistent Language-Image Pretraining [7.92247304974314]
RANKCLIP is a novel pretraining method that extends beyond the rigid one-to-one matching framework of CLIP.
By extending the traditional pair-wise loss to list-wise, RANKCLIP improves the alignment process, enabling it to capture the nuanced many-to-many relationships between and within each modality.
arXiv Detail & Related papers (2024-04-15T00:12:27Z)
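The list-wise extension can be illustrated, under assumptions, by adding a ranking-consistency term to the usual CLIP loss: each image's similarity distribution over the text batch is encouraged to agree with its in-modal similarity distribution, and symmetrically for text. The KL formulation and the weight `lam` are illustrative, not RANKCLIP's exact objective.

```python
# Sketch of a list-wise ranking-consistency term on top of the CLIP loss.
import torch
import torch.nn.functional as F


def clip_loss(img, txt, tau=0.07):
    logits = F.normalize(img, dim=-1) @ F.normalize(txt, dim=-1).T / tau
    labels = torch.arange(img.size(0), device=img.device)
    return 0.5 * (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels))


def ranking_consistency(src, tgt, tau=0.07):
    src_n, tgt_n = F.normalize(src, dim=-1), F.normalize(tgt, dim=-1)
    cross = F.log_softmax(src_n @ tgt_n.T / tau, dim=-1)   # cross-modal similarity distribution
    in_modal = F.softmax(src_n @ src_n.T / tau, dim=-1)    # in-modal "target" ranking
    return F.kl_div(cross, in_modal, reduction="batchmean")


def rank_consistent_loss(img, txt, lam=0.5):
    return clip_loss(img, txt) + lam * (ranking_consistency(img, txt)
                                        + ranking_consistency(txt, img))
```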
- Hyperparameters in Continual Learning: A Reality Check [53.30082523545212]
Continual learning (CL) aims to train a model on a sequence of tasks while balancing the trade-off between plasticity (learning new tasks) and stability (retaining prior knowledge).
The dominant evaluation protocol for CL algorithms selects the best hyperparameters in a given scenario and then evaluates the algorithms in that same scenario.
This protocol has significant shortcomings: it overestimates the CL capacity of algorithms and relies on unrealistic hyperparameter tuning.
We argue that the evaluation of CL algorithms should focus on assessing the generalizability of their CL capacity to unseen scenarios.
arXiv Detail & Related papers (2024-03-14T03:13:01Z)
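The protocol distinction the paper draws can be expressed in a few lines: tune hyperparameters on one scenario, then report results with those fixed hyperparameters on scenarios never used for tuning. `run_cl`, the scenario objects, and the grid-search loop below are hypothetical stand-ins for a real continual-learning pipeline.

```python
# Sketch of cross-scenario hyperparameter evaluation for continual learning.
from itertools import product


def evaluate_generalization(run_cl, tuning_scenario, unseen_scenarios, grid):
    """run_cl(scenario, hparams) -> average accuracy after the full task sequence.
    grid: dict mapping hyperparameter names to candidate value lists."""
    # Conventional protocol: pick the best hyperparameters on the tuning scenario ...
    candidates = [dict(zip(grid, values)) for values in product(*grid.values())]
    best = max(candidates, key=lambda hp: run_cl(tuning_scenario, hp))

    # ... the check argued for here: reuse them, unchanged, on unseen scenarios.
    return {scenario.name: run_cl(scenario, best) for scenario in unseen_scenarios}
```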
- Generalized Few-Shot Continual Learning with Contrastive Mixture of Adapters [59.82088750033897]
We set up a Generalized FSCL (GFSCL) protocol involving both class- and domain-incremental situations.
We find that common continual learning methods have poor generalization ability on unseen domains.
To address this, we propose a rehearsal-free framework based on Vision Transformer (ViT), named Contrastive Mixture of Adapters (CMoA).
arXiv Detail & Related papers (2023-02-12T15:18:14Z)
- Non-Contrastive Learning Meets Language-Image Pre-Training [145.6671909437841]
We study the validity of non-contrastive language-image pre-training (nCLIP).
We introduce xCLIP, a multi-tasking framework combining CLIP and nCLIP, and show that nCLIP aids CLIP in enhancing feature semantics.
arXiv Detail & Related papers (2022-10-17T17:57:46Z)
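A hedged sketch of such a multi-task objective keeps the standard contrastive loss and adds a non-contrastive term in which each modality predicts its paired sample's projected distribution, with no negatives involved. The projector size and the omission of any entropy or sharpening regularizers are simplifying assumptions, not xCLIP's exact formulation.

```python
# Sketch of combining a contrastive (CLIP-style) loss with a non-contrastive,
# pair-only cross-entropy term over projected distributions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiTaskImageTextLoss(nn.Module):
    def __init__(self, dim=512, num_proto=1024, tau=0.07, lam=1.0):
        super().__init__()
        self.proj_img = nn.Linear(dim, num_proto)   # maps embeddings to "prototype" logits
        self.proj_txt = nn.Linear(dim, num_proto)
        self.tau, self.lam = tau, lam

    def forward(self, img, txt):
        # Contrastive part: paired image/text are positives, the rest are negatives.
        logits = F.normalize(img, dim=-1) @ F.normalize(txt, dim=-1).T / self.tau
        labels = torch.arange(img.size(0), device=img.device)
        contrastive = 0.5 * (F.cross_entropy(logits, labels)
                             + F.cross_entropy(logits.T, labels))

        # Non-contrastive part: cross-entropy between each pair's own distributions.
        p_img = F.log_softmax(self.proj_img(img), dim=-1)
        p_txt = F.softmax(self.proj_txt(txt), dim=-1)
        non_contrastive = -(p_txt * p_img).sum(dim=-1).mean()

        return contrastive + self.lam * non_contrastive
```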
- Fine-grained Temporal Contrastive Learning for Weakly-supervised Temporal Action Localization [87.47977407022492]
This paper argues that learning by contextually comparing sequence-to-sequence distinctions offers an essential inductive bias in weakly-supervised action localization.
Under a differentiable dynamic programming formulation, two complementary contrastive objectives are designed, including Fine-grained Sequence Distance (FSD) contrasting and Longest Common Subsequence (LCS) contrasting.
Our method achieves state-of-the-art performance on two popular benchmarks.
arXiv Detail & Related papers (2022-03-31T05:13:50Z)
- HiCLRE: A Hierarchical Contrastive Learning Framework for Distantly Supervised Relation Extraction [24.853265244512954]
We propose a hierarchical contrastive learning framework for distantly supervised relation extraction (HiCLRE) to reduce noisy sentences.
Specifically, we propose a three-level hierarchical learning framework to interact with cross levels, generating the de-noising context-aware representations.
Experiments demonstrate that HiCLRE significantly outperforms strong baselines in various mainstream DSRE datasets.
arXiv Detail & Related papers (2022-02-27T12:48:26Z)
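As a schematic only, hierarchical contrastive training of this kind can be written as a weighted sum of per-level InfoNCE terms, e.g. over entity-, sentence-, and bag-level representations. The positives below are simply a second dropout pass per level; HiCLRE's cross-level interaction and its specific de-noising mechanism are not modeled here.

```python
# Schematic three-level contrastive objective: one InfoNCE term per hierarchy level.
import torch
import torch.nn.functional as F


def info_nce(a, b, tau=0.1):
    logits = F.normalize(a, dim=-1) @ F.normalize(b, dim=-1).T / tau
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)


def three_level_loss(encode_entity, encode_sentence, encode_bag, batch,
                     weights=(1.0, 1.0, 1.0)):
    """Each encode_* is a hypothetical stochastic callable: batch -> (batch_size, dim)."""
    total = 0.0
    for w, enc in zip(weights, (encode_entity, encode_sentence, encode_bag)):
        total = total + w * info_nce(enc(batch), enc(batch))  # two dropout passes -> two views
    return total
```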
- You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data assigned to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
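A hedged sketch of the cluster-level scheme: soft assignments are drawn with Gumbel-softmax (one plausible reparametrization of the categorical assignment variables), instances sharing a cluster are pooled into a cluster representation, and cluster representations from two augmented views are contrasted. This illustrates the idea rather than reproducing TCC's exact formulation.

```python
# Cluster-level contrastive sketch with reparametrized (Gumbel-softmax) assignments.
import torch
import torch.nn.functional as F


def cluster_reps(features, assign_logits, tau=0.5):
    """features: (n, d); assign_logits: (n, k). Returns (k, d) cluster representations."""
    assign = F.gumbel_softmax(assign_logits, tau=tau, dim=-1)        # differentiable, (n, k)
    weights = assign / assign.sum(dim=0, keepdim=True).clamp_min(1e-8)
    return weights.T @ features                                      # confidence-weighted pooling


def cluster_level_loss(feat_v1, logits_v1, feat_v2, logits_v2, temp=0.2):
    c1 = F.normalize(cluster_reps(feat_v1, logits_v1), dim=-1)
    c2 = F.normalize(cluster_reps(feat_v2, logits_v2), dim=-1)
    sim = c1 @ c2.T / temp                                           # (k, k)
    labels = torch.arange(c1.size(0), device=c1.device)              # cluster i matches cluster i
    return 0.5 * (F.cross_entropy(sim, labels) + F.cross_entropy(sim.T, labels))
```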
- Self-supervised Document Clustering Based on BERT with Data Augment [1.0152838128195467]
We propose self-supervised contrastive learning (SCL) as well as few-shot contrastive learning (FCL) with unsupervised data augmentation (UDA) for text clustering.
SCL outperforms state-of-the-art unsupervised clustering approaches for short texts and those for long texts in terms of several clustering evaluation measures.
FCL achieves performance close to supervised learning, and FCL with UDA further improves the performance for short texts.
arXiv Detail & Related papers (2020-11-17T09:18:47Z)
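The SCL recipe can be sketched, under assumptions, as instance-level contrastive learning over two augmented views of each document followed by clustering of the learned embeddings. `encode` and `augment` below are hypothetical stand-ins for a BERT encoder and an unsupervised data-augmentation routine (e.g. back-translation or word replacement).

```python
# Sketch of self-supervised contrastive learning for text clustering.
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans


def contrastive_step(encode, augment, docs, tau=0.1):
    """One training step: contrast two augmented views of every document."""
    z1 = F.normalize(encode([augment(d) for d in docs]), dim=-1)   # (n, dim)
    z2 = F.normalize(encode([augment(d) for d in docs]), dim=-1)
    logits = z1 @ z2.T / tau
    labels = torch.arange(len(docs), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels))


def cluster(encode, docs, num_clusters):
    """After training, cluster the learned document embeddings."""
    with torch.no_grad():
        emb = F.normalize(encode(docs), dim=-1).cpu().numpy()
    return KMeans(n_clusters=num_clusters, n_init=10).fit_predict(emb)
```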
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.